ConvertJSONtoSQL in Apache NiFi for Sending to PutHiveQL
Labels: Apache Hive, Apache NiFi
Created on ‎07-20-2016 03:11 PM - edited ‎08-19-2019 01:30 AM
Is there anything special to get this to work?
Hive table:
create table twitter( id int, handle string, hashtags string, msg string, time string, user_name string, tweet_id string, unixtime string, uuid string ) stored as orc tblproperties ("orc.compress"="ZLIB");
The data is a pared-down tweet:
{ "user_name" : "Tweet Person", "time" : "Wed Jul 20 15:09:42 +0000 2016", "unixtime" : "1469027382664", "handle" : "SomeTweeter", "tweet_id" : "755781737674932224", "hashtags" : "", "msg" : "RT some stuff" }
Created ‎07-20-2016 03:25 PM
Is Translate Field Names set to true? That should let the column name (which appears capitalized) match the JSON field name (which is lowercase).
Created ‎07-20-2016 03:26 PM
Also, if you don't care about that column, you can set Unmatched Column Behavior to warn or ignore.
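To make these two settings concrete, here is a rough Python sketch of matching the table's columns against the fields of the sample tweet from the question. It assumes that Translate Field Names compares names after upper-casing them and dropping underscores; that rule is an assumption for illustration, not something confirmed in this thread. `normalize` and `match_columns` are hypothetical helper names, not NiFi APIs.

```python
import json

# Columns from the CREATE TABLE statement in the question.
TABLE_COLUMNS = ["id", "handle", "hashtags", "msg", "time",
                 "user_name", "tweet_id", "unixtime", "uuid"]

# The sample tweet from the question.
TWEET_JSON = """{ "user_name" : "Tweet Person",
  "time" : "Wed Jul 20 15:09:42 +0000 2016",
  "unixtime" : "1469027382664", "handle" : "SomeTweeter",
  "tweet_id" : "755781737674932224", "hashtags" : "",
  "msg" : "RT some stuff" }"""


def normalize(name, translate):
    # Assumed translation rule: ignore case and underscores.
    return name.upper().replace("_", "") if translate else name


def match_columns(columns, fields, translate=True, unmatched="ignore"):
    """Return (matched, missing), where missing lists columns with no field.

    unmatched is one of "ignore", "warn", or "fail" (hypothetical values).
    """
    by_key = {normalize(f, translate): f for f in fields}
    matched, missing = {}, []
    for col in columns:
        key = normalize(col, translate)
        if key in by_key:
            matched[col] = by_key[key]
        else:
            missing.append(col)
    if missing and unmatched == "fail":
        raise ValueError(f"no JSON field for columns: {missing}")
    if missing and unmatched == "warn":
        print(f"warning: no JSON field for columns: {missing}")
    return matched, missing


fields = list(json.loads(TWEET_JSON).keys())
matched, missing = match_columns(TABLE_COLUMNS, fields)
print(missing)
```

For the sample tweet, the columns left without a matching field are `id` and `uuid`, which is the situation the Unmatched Column Behavior setting is meant to cover.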
Created on ‎07-20-2016 03:29 PM - edited ‎08-19-2019 01:29 AM
I set Unmatched Column Behavior to ignore.
I tried both true and false for Translate Field Names.
Created ‎07-20-2016 03:44 PM
I had set the catalog and schema names, then removed them. I tried a few options. twitter is a table in the default Hive database.
SelectHiveQL is working fine.
Created ‎07-30-2016 08:12 PM
That did not work.
Created ‎08-01-2016 03:45 AM
I don't have a column called IS_AUTOINCREMENT. That is something that should be standard in JDBC. I wonder whether the Hive driver is missing something.
Created ‎07-30-2016 12:28 AM
@Timothy Spann, did you find a solution to this? I'm hitting the same thing with a sample 3-column Hive database.
Created ‎07-30-2016 08:12 PM
Not optimal, but this is a nice workaround:
Use a ReplaceText processor to replace the flow file content with an INSERT statement:
insert into twitter values (${tweet_id}, '${handle:urlEncode()}','${hashtag:urlEncode()}', '${msg:urlEncode()}','${time}', '${user_name:urlEncode()}','${tweet_id}', '${unixtime}','${uuid}')
The ${...} placeholders are flow file attributes.
I URL-encode the string values because of quotes and similar characters. I would prefer a prepared statement, a custom processor, or a call to a Groovy script, but this works.
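To see what the template above produces, here is a minimal Python sketch that builds the same statement from the sample tweet. It uses `urllib.parse.quote_plus` as a rough stand-in for NiFi's `urlEncode()`; the two may differ on a few characters, so treat this as an approximation. `build_insert` is a hypothetical helper, and the `uuid` value is a made-up placeholder (in NiFi it would come from the flow file).

```python
from urllib.parse import quote_plus

# Attribute values taken from the sample tweet in the question.
# The uuid is a made-up placeholder for illustration only.
attributes = {
    "tweet_id": "755781737674932224",
    "handle": "SomeTweeter",
    "hashtag": "",
    "msg": "RT some stuff",
    "time": "Wed Jul 20 15:09:42 +0000 2016",
    "user_name": "Tweet Person",
    "unixtime": "1469027382664",
    "uuid": "00000000-0000-0000-0000-000000000000",
}


def build_insert(attrs):
    """Mirror the ReplaceText template: URL-encode only the free-text values."""

    def enc(key):
        return quote_plus(attrs[key])

    return (
        f"insert into twitter values ({attrs['tweet_id']}, "
        f"'{enc('handle')}','{enc('hashtag')}', "
        f"'{enc('msg')}','{attrs['time']}', "
        f"'{enc('user_name')}','{attrs['tweet_id']}', "
        f"'{attrs['unixtime']}','{attrs['uuid']}')"
    )


statement = build_insert(attributes)
print(statement)
```

With a message such as `it's`, the apostrophe becomes `%27`, so it cannot terminate the SQL string literal early. The trade-off is that the values are stored in encoded form (for example, spaces as `+`) and have to be decoded again when they are read back.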
Created on ‎07-31-2016 02:15 PM - edited ‎08-19-2019 01:29 AM
I ended up with the same workaround to get it flowing. Agreed, it's not optimal, but it's working!
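For comparison, the prepared-statement approach wished for in the workaround post avoids the encoding step, because values are bound as parameters instead of being spliced into the SQL text. The sketch below uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for Hive; it only illustrates parameter binding and says nothing about how PutHiveQL or the Hive JDBC driver behave. The message text was changed to include quotes, and the `uuid` value is a made-up placeholder.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the Hive table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table twitter (id integer, handle text, hashtags text, msg text, "
    "time text, user_name text, tweet_id text, unixtime text, uuid text)"
)

# Sample tweet values; msg is altered to contain quotes for illustration.
row = (
    755781737674932224,
    "SomeTweeter",
    "",
    "RT it's \"quoted\" stuff",
    "Wed Jul 20 15:09:42 +0000 2016",
    "Tweet Person",
    "755781737674932224",
    "1469027382664",
    "00000000-0000-0000-0000-000000000000",  # placeholder uuid
)

# Values are bound to the ? placeholders, so no escaping or encoding is needed.
conn.execute("insert into twitter values (?, ?, ?, ?, ?, ?, ?, ?, ?)", row)

stored_msg = conn.execute("select msg from twitter").fetchone()[0]
print(stored_msg)
```

The message comes back exactly as it went in, quotes included, which is the property that makes a prepared statement preferable to URL-encoding when the target system supports one.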
