Support Questions

Find answers, ask questions, and share your expertise

ConvertJSONtoSQL in Apache NiFi for Sending to PutHiveQL

avatar
Master Guru

5897-diagram.png

5896-convertjsontosql.png

Is there anything special to get this to work?

Hive Table

create table
twitter(
  id int,
  handle string,
  hashtags string,
  msg string,
  time string,
  user_name string,
  tweet_id string,
  unixtime string,
  uuid string
) stored as orc
tblproperties ("orc.compress"="ZLIB");

Data is paired down tweet:

{ "user_name" : "Tweet Person", "time" : "Wed Jul 20 15:09:42 +0000 2016", "unixtime" : "1469027382664", "handle" : "SomeTweeter", "tweet_id" : "755781737674932224", "hashtags" : "", "msg" : "RT some stuff" }

1 ACCEPTED SOLUTION

avatar
Master Guru
@dpinkston

Not optimal, but this is a nice workaround:

Use ReplaceText processor

insert into twitter values (${tweet_id}, '${handle:urlEncode()}','${hashtag:urlEncode()}', '${msg:urlEncode()}','${time}', '${user_name:urlEncode()}','${tweet_id}', '${unixtime}','${uuid}')

So that's attributes in there.

I do url encode because of quotes and such. Would like a prepared statement or custom processor or call a groovy script. But this works.

View solution in original post

17 REPLIES 17

avatar
Master Guru

Is Translate Field Names set to true? That should enable the matching of the column (which appears capitalized) against the field (which is lowercase)

avatar
Master Guru

Also if you don't care about that column you can set the Unmatched Column Behavior to warn/ignore

avatar
Master Guru

i set unmatched columns to ignore

i tried true and false on field names

5911-hivematch.png

avatar
Master Guru

I had catalog and schema name and then left them off. I tried a few options. twitter is a table in default hive database

SelectHiveQL is working fine

avatar
Master Guru

That did not work.

avatar
Master Guru

I don't have a column called IS_AUTOINCREMENT. that's the something should be standard in JDBC. wonder if HIVE driver missing something

avatar
Rising Star

@Timothy Spann did you find a solution to this? I'm hitting the same thing with a sample 3 column hive database

avatar
Master Guru
@dpinkston

Not optimal, but this is a nice workaround:

Use ReplaceText processor

insert into twitter values (${tweet_id}, '${handle:urlEncode()}','${hashtag:urlEncode()}', '${msg:urlEncode()}','${time}', '${user_name:urlEncode()}','${tweet_id}', '${unixtime}','${uuid}')

So that's attributes in there.

I do url encode because of quotes and such. Would like a prepared statement or custom processor or call a groovy script. But this works.

avatar
Rising Star

I ended up with the same workaround to get it flowing, agreed not optimal but its working!

6203-puthiveql-replacetext.jpg