I am trying to process a sample JSON file. It's a very basic file of Twitter data. I found that I should use a JsonSerDe. I did not know what this was, so I looked it up. I know what a SerDe is now (I'm a newbie, if you can't tell). How can I tell if a SerDe is installed on the HDP 2.4 sandbox?
The SerDe I want to use is org.apache.hive.hcatalog.data.JsonSerDe, but when I execute the statement below, I get an error saying "cannot recognize input near ':' 'string' ',' in column type".
Here's what I'm trying to execute:
CREATE EXTERNAL TABLE twitter_sample ( tweetmessage:string, createdate:date, 'user' struct< screenname:string, userlocation:string>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/root/twitter_txt_results';
Like I said, I'm a newbie, not only to Hive but also to Linux. I tried searching for anything with JsonSerDe in its name, but the search came back blank (it could be that I didn't do the search correctly). I have a feeling, however, that the SerDe isn't installed.
Can anyone help?
That worked, Grant. Thank you. I'm reading through the link you sent as well, but I'm very curious: why would the word user not work when c3 does? I tried this:
CREATE EXTERNAL TABLE twitter_sample ( tweetmessage string, createdate date, user struct< screenname:string, userlocation:string >) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
and it failed. Then I changed user --> c3 and it executed properly.
'user' is a reserved keyword in Hive, so with the default settings it can't be used as a column or table name. There are ways to override this so that keywords are accepted, but doing so is not recommended, since you can run into other issues.
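If you would rather keep the column called user instead of renaming it, quoting the name with backticks should work. Here is a minimal sketch, assuming the sandbox's Hive still has quoted identifiers enabled (I believe hive.support.quoted.identifiers defaults to column) and reusing the location from your original statement. Note that column names are followed by a space and then the type; the colon form is only used for the fields inside struct<...>.

```sql
-- Sketch only: assumes backtick-quoted identifiers are enabled (the Hive default, as far as I know).
-- Top-level columns use "name type"; the "name:type" form is only for fields inside struct<...>.
CREATE EXTERNAL TABLE twitter_sample (
  tweetmessage string,
  createdate   date,
  `user`       struct<screenname:string, userlocation:string>  -- backticks quote the reserved word
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/root/twitter_txt_results';

-- The quoted name has to be quoted in queries as well.
SELECT tweetmessage, `user`.screenname
FROM twitter_sample
LIMIT 10;
```

One reason to prefer this over renaming: as far as I understand it, the JsonSerDe matches column names against the keys in the JSON, so a column named c3 would probably come back NULL unless the file actually has a key called c3. The override mentioned above is, I believe, the hive.support.sql11.reserved.keywords setting in Hive 1.2, but quoting the single column is the less invasive option.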