
ConvertJSONtoSQL in Apache NiFi for Sending to PutHiveQL

Super Guru

5897-diagram.png

5896-convertjsontosql.png

Is there anything special to get this to work?

Hive Table

create table twitter (
  id int,
  handle string,
  hashtags string,
  msg string,
  time string,
  user_name string,
  tweet_id string,
  unixtime string,
  uuid string
) stored as orc
tblproperties ("orc.compress"="ZLIB");

The data is a pared-down tweet:

{ "user_name" : "Tweet Person", "time" : "Wed Jul 20 15:09:42 +0000 2016", "unixtime" : "1469027382664", "handle" : "SomeTweeter", "tweet_id" : "755781737674932224", "hashtags" : "", "msg" : "RT some stuff" }
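For reference, here is a rough sketch of the INSERT that this sample tweet corresponds to against the table above. It is only an illustration: ConvertJSONToSQL normally emits a parameterized statement (with `?` placeholders and the values carried in `sql.args.N.type` / `sql.args.N.value` attributes) rather than literals. The column list assumes Hive 1.2 or later, and `time` is backtick-quoted because newer Hive releases treat it as a reserved word.

```sql
-- Illustrative only: a literal INSERT matching the sample tweet.
-- id and uuid do not appear in the JSON, so they are left out of the
-- column list and end up NULL.
INSERT INTO TABLE twitter
  (handle, hashtags, msg, `time`, user_name, tweet_id, unixtime)
VALUES
  ('SomeTweeter', '', 'RT some stuff', 'Wed Jul 20 15:09:42 +0000 2016',
   'Tweet Person', '755781737674932224', '1469027382664');
```

Running a statement like this by hand in Beeline can be a quick way to check whether a problem lies with the table itself or with the SQL that the flow generates.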

17 REPLIES

Contributor

I used this method, but it is very slow. How about yours?

Super Guru

It wasn't slow. I will try it in NiFi 1.0.

Contributor

It took me one day to insert 7,000 rows into Hive, but I have more than 800 million rows.

Super Guru

If you have that many rows, you need to go parallel and run on multiple nodes. You should probably trigger a Sqoop job or a Spark SQL job from NiFi and have a few nodes running at once.

Super Guru

Store the data to HDFS as ORC and then create a Hive table on top of it.

I loaded 600,000 rows that way on a 4 GB machine in a few minutes.
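A minimal sketch of the "table on top of ORC files" approach, assuming the ORC files have already been written to an HDFS directory. The table name `twitter_orc` and the path `/tmp/twitter_orc` are hypothetical placeholders, and the columns simply mirror the table from the question; the ORC files need to have a matching schema.

```sql
-- Hypothetical external table over ORC files that already sit in HDFS.
-- Hive only registers the metadata; no rows are copied or rewritten.
CREATE EXTERNAL TABLE twitter_orc (
  id int,
  handle string,
  hashtags string,
  msg string,
  `time` string,
  user_name string,
  tweet_id string,
  unixtime string,
  uuid string
)
STORED AS ORC
LOCATION '/tmp/twitter_orc';  -- placeholder path, replace with your own
```

Because the table is external, dropping it later removes only the Hive metadata and leaves the files in HDFS untouched.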

Contributor

Thanks for your reply. Do you have a detailed example?

Super Guru

I confirmed this to be a bug in ConvertJSONToSQL and have written up NIFI-4071. Please see the Jira for details.