Created 07-12-2017 06:59 PM
I am trying to analyse the JSON twitter data using get_json_object() but the result is coming as null
This is the json code:
{
"text": "RT @PostGradProblem: In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during ...",
"truncated": true,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for iPhone</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54691802283900928",
"entities": {
"user_mentions": [
{
"indices": [
3,
19
],
"screen_name": "PostGradProblem",
"id_str": "271572434",
"name": "PostGradProblems",
"id": 271572434
}
],
"urls": [ ],
"hashtags": [ ]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 23:48:36 +0000 2011",
"retweeted_status": {
"text": "In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during company time. #PGP",
"truncated": false,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://www.hootsuite.com\" rel=\"nofollow\">HootSuite</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54640519019642881",
"entities": {
"user_mentions": [ ],
"urls": [ ],
"hashtags": [
{
"text": "PGP",
"indices": [
130,
134
]
}
]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 20:24:49 +0000 2011",
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 31,
"profile_background_color": "C0DEED",
"followers_count": 3066,
"profile_image_url": "http://a2.twimg.com/profile_images/1285770264/PGP_normal.jpg",
"listed_count": 6,
"profile_background_image_url": "http://a3.twimg.com/a/1301071706/images/themes/theme1/bg.png",
"description": "",
"screen_name": "PostGradProblem",
"default_profile": true,
"verified": false,
"time_zone": null,
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "",
"id_str": "271572434",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 21,
"protected": false,
"favourites_count": 0,
"created_at": "Thu Mar 24 19:45:44 +0000 2011",
"profile_link_color": "0084B4",
"name": "PostGradProblems",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 271572434,
"contributors_enabled": false,
"following": null,
"utc_offset": null
},
"id": 54640519019642880,
"coordinates": null,
"geo": null
},
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 351,
"profile_background_color": "C0DEED",
"followers_count": 48,
"profile_image_url": "http://a1.twimg.com/profile_images/455128973/gCsVUnofNqqyd6tdOGevROvko1_500_normal.jpg",
"listed_count": 0,
"profile_background_image_url": "http://a3.twimg.com/a/1300479984/images/themes/theme1/bg.png",
"description": "watcha doin in my waters?",
"screen_name": "OldGREG85",
"default_profile": true,
"verified": false,
"time_zone": "Hawaii",
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "Texas",
"id_str": "80177619",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 81,
"protected": false,
"favourites_count": 0,
"created_at": "Tue Oct 06 01:13:17 +0000 2009",
"profile_link_color": "0084B4",
"name": "GG",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 80177619,
"contributors_enabled": false,
"following": null,
"utc_offset": -36000
},
This is my code:
Create table json_table(json string);
load data local inpath 'Desktop/new' into table json_table;
select * from json_table; (it is giving the stored data result)
select get_json_object(json, '$.text')as text from json_table;
NULL
NULL
NULL
NULL....
Created 07-12-2017 08:07 PM
You are using single column but the data has more columns it seems. please make sure that HIVE Table columns are matching with the .json file
Refer the below link, it has some examples. It may help you
Created 07-13-2017 05:35 AM
Thank you, sir, I am new to Cloudera Hadoop, So might be my questions are bit silly, I am trying to store the serde which you have suggested me in hive/lib. but I don't have access to save in the hive, so how can I get the access to store serde in hive/lib.
Created 07-13-2017 01:34 PM
Created 07-13-2017 05:06 PM
I have downloaded JSON serde which you suggested me, I stored the serde on Cloudera desktop, from there I tried to copy the serde into hive/lib older, but I could not save the serde file in hive/lib. below I am attaching a screen shot, there you can find the paste option is disabled. So could you please let me how to get permissions to store the serde file in hive/lib folder.
Paste option disabled
Created 07-14-2017 02:52 AM
Your screenshot is not visible mate.
attach using Photos tool in the toolbar
Created 07-14-2017 04:35 AM
Thanks for your reply sir, I have uploaded the screen shot using photo tool bar if you click on the photo, then it will give a clear view. Waiting for a solution for my problem Thanks in advance.
Created 07-14-2017 06:01 AM
I belive you are runining Linux os
Perform this under hive/lib
ls -lrt /etc/hive/lib
check the permission .
or Login as Root user perform CP command to the /etc/hive/lib
and change the permission accordingly chown / chmod operation .
Let me know if you need more information
Created 07-14-2017 07:38 AM
I didn't get a chance to explore more, but try to create table with "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'" as follows. It is allowing me to create a table (without the jar installation, i think the jar might be inbuild with some version).
If this works, you can ignore the installation part and focus on loading json file. you can try to load a small file as follows
CREATE TABLE test_tbl
(
col1 STRING,
col2 STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;
file1.json
{"col1":100, "col2":"a1c"} {"col1":101, "col2":"xyz"} {"col1":102, "col2":"abc"}
Created 07-14-2017 10:18 AM
I have used the same code but it is giving an error, below I am attaching the screenshot kindly help me out from this problem.