Created 11-02-2016 09:14 PM
I want to load a JSON record like the following into Hive:
{"id":"790657073453424645","user_friends_count":{"int":121},"user_location":{"string":"Europa"},"user_description":{"string":"MITMACHEN \r\n\r\nIm Kampf gegen die EntDemokratisierung durch Freihandelsabkommen! \r\n\r\nStop TTIP - Stop TAFTA\r\n\r\nThe Fight against USA TTIP !"},"user_statuses_count":{"int":7561},"user_followers_count":{"int":1380},"user_name":{"string":"Freihandelsabkommen"},"user_screen_name":{"string":"Stop_TTIP"},"created_at":{"string":"2016-10-24T16:51:50Z"},"text":{"string":"RT @alikonkret: Da wurde die Meinungsmache im Kommentar versteckt um die scheinbare Neutralität zu wahren. #CETA #Wallonia https://t.co/ViA…"},"retweet_count":{"long":0},"retweeted":{"boolean":true},"in_reply_to_user_id":{"long":-1},"source":{"string":"<a href=\"http://www.tweetcaster.com\" rel=\"nofollow\">TweetCaster for Android</a>"},"in_reply_to_status_id":{"long":-1},"media_url_https":null,"expanded_url":null}
So far I only have this skeleton command:
CREATE EXTERNAL TABLE tweetdata3( ) ROW FORMAT DELIMITED Fields terminated by ',' STORED as textfile location '/user/flume/tweets';
Created 11-02-2016 11:33 PM
You need a JSON SerDe for Hive in order to map your JSON data to Hive tables. One is available here: https://github.com/rcongiu/Hive-JSON-Serde
Created 11-03-2016 01:58 PM
I followed the steps, but every column except id comes back NULL.
I compiled the SerDe and copied the json-serde-1.3.8-SNAPSHOT.jar file to the $FLUME_HOME/lib folder.
hive> CREATE EXTERNAL TABLE tweetdata3 (
    > id string,
    > person struct<email:string, first_name:string, last_name:string, location:struct<address:string, city:string, state:string, zipcode:string>, text:string, url:string>)
    > ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    > LOCATION '/user/flume/tweets';
OK
Time taken: 0.197 seconds
hive> desc tweetdata3;
OK
id      string  from deserializer
person  struct<email:string,first_name:string,last_name:string,location:struct<address:string,city:string,state:string,zipcode:string>,text:string,url:string>  from deserializer
Time taken: 0.266 seconds, Fetched: 2 row(s)
hive>
    > SELECT id, person.first_name, person.last_name, person.email,
    > person.location.address, person.location.city, person.location.state,
    > person.location.zipcode, person.text, person.url
    > FROM tweetdata3 LIMIT 5;
OK
790657073453424645 NULL NULL NULL NULL NULL NULL NULL NULL NULL
790657073453424645 NULL NULL NULL NULL NULL NULL NULL NULL NULL
Time taken: 0.282 seconds, Fetched: 2 row(s)
hive> SELECT id, person.first_name, person.last_name, person.email,
    > person.location.address, person.location.city, person.location.state,
    > person.location.zipcode, person.text, person.url
    > FROM tweetdata3 LIMIT 5;
OK
790657073453424645 NULL NULL NULL NULL NULL NULL NULL NULL NULL
790657073453424645 NULL NULL NULL NULL NULL NULL NULL NULL NULL
Time taken: 0.063 seconds, Fetched: 2 row(s)
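One likely reason for the NULLs, sketched here in Python rather than Hive: JSON SerDes generally fill a column by looking up a top-level JSON key with the same name, and the record posted at the top of the thread has no `person` key, only `id`, `user_friends_count`, `user_location`, and so on. The snippet below is only an illustration of that name-based lookup, not the SerDe's actual code; `match_columns` is a hypothetical helper, and the record is abbreviated to a few fields.

```python
import json

# Abbreviated copy of the record from the question (only a few fields kept).
record = json.loads(
    '{"id":"790657073453424645",'
    '"user_friends_count":{"int":121},'
    '"user_location":{"string":"Europa"},'
    '"media_url_https":null}'
)

# Top-level column names declared in the tweetdata3 DDL above.
declared_columns = ["id", "person"]


def match_columns(rec, columns):
    """Hypothetical helper: look each column up by name, None if absent."""
    return {col: rec.get(col) for col in columns}


matched = match_columns(record, declared_columns)
missing = [col for col, value in matched.items() if value is None]

print(matched)   # id is found, person is not
print(missing)   # columns with no matching key in the record
```

If this is what is happening, the columns in the table definition would need to follow the keys that actually appear in the data (`user_name`, `text`, `created_at`, ...) rather than a `person` struct.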
Created 11-03-2016 12:03 AM
Created 11-03-2016 02:12 PM
I am confused. Shouldn't the Twitter data be the same for everyone? I am looking at the links you mentioned here, and the Twitter data is different in each of them.
Looking at my data above, please advise whether it is the right Twitter record and, if not, why I am getting this format.
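On the format itself: the `{"int":121}` and `{"string":"Europa"}` wrappers resemble the JSON encoding that Avro uses for union (nullable) fields, where each non-null value is wrapped in an object keyed by its type name. That would suggest the Flume pipeline is writing an Avro-style record rather than the raw Twitter API JSON, but this is an assumption based on the record's shape, not something confirmed in this thread. A minimal Python sketch of stripping those wrappers so each field becomes a plain value; `TYPE_TAGS`, `unwrap`, and `flatten_record` are hypothetical names, and the sample is abbreviated.

```python
import json

# Type tags seen in the sample record; assumed to be the only wrappers used.
TYPE_TAGS = {"int", "long", "string", "boolean"}


def unwrap(value):
    """Turn {"int": 121} into 121; leave plain values and nulls unchanged."""
    if isinstance(value, dict) and len(value) == 1:
        (tag, inner), = value.items()
        if tag in TYPE_TAGS:
            return inner
    return value


def flatten_record(line):
    """Parse one JSON line and unwrap every top-level field."""
    return {key: unwrap(val) for key, val in json.loads(line).items()}


# Abbreviated copy of the record from the question.
sample = (
    '{"id":"790657073453424645",'
    '"user_friends_count":{"int":121},'
    '"user_location":{"string":"Europa"},'
    '"retweeted":{"boolean":true},'
    '"in_reply_to_user_id":{"long":-1},'
    '"media_url_https":null}'
)

flat = flatten_record(sample)
print(flat)
```

A flattened record like this maps onto simple Hive column types (string, int, bigint, boolean) without a struct per field.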
Created 01-14-2019 11:00 AM
Check out this article:
https://medium.com/datadriveninvestor/analyzing-twitter-feeds-using-hive-7e074025f295
Created 01-14-2019 03:30 PM
Check out this article on mapping JSON to Hive columns:
https://medium.com/datadriveninvestor/analyzing-twitter-feeds-using-hive-7e074025f295