How do I store and query a flat file that contains a JSON string as part of each line in a Hive table?

Explorer

When a line is pure JSON, I load it into a string field in a Hive table and query it using get_json_object. That works for me.

But I have another set of data in HDFS that looks like this:

1023,UK,{"cities":{"city1":"London","city2":"Birmingham","city3":"Liverpool"},"universities":{"universities1":"Cambridge","universities2":"Oxford"}},07-30-2016

I want to store it in a Hive table with a schema like:

create table data (SerNo int, country string, detail string, date string)

What should the table definition be so that {"cities": ... } ends up in one column and the remaining fields in the others? What should the column separator be, given that the JSON itself contains commas?

If I put everything into one string field in the Hive table, how do I query the SerNo, country, and date columns? Is that possible with get_json_object?

1 ACCEPTED SOLUTION

Super Collaborator

Have you explored the JSON SerDe at https://github.com/rcongiu/Hive-JSON-Serde?

I would write a utility script that converts each line of your dataset to JSON (including SerNo, country, cities, and date) and then load the result into Hive using the JSON SerDe.

For more details on Hive SerDes, refer to https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-HiveSerDe
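
A minimal sketch of such a utility script in Python, assuming every line has exactly the layout from the question (SerNo, country, one JSON object, date) and that the first two fields and the date never contain commas. The function name and the JSON key names (serno, country, detail, date) are my own choices for illustration; they would need to match whatever column names your table declares.

```python
import json

# Sample line copied from the question.
SAMPLE = ('1023,UK,{"cities":{"city1":"London","city2":"Birmingham",'
          '"city3":"Liverpool"},"universities":{"universities1":"Cambridge",'
          '"universities2":"Oxford"}},07-30-2016')


def line_to_json(line):
    """Convert 'SerNo,country,{json},date' into a one-line JSON record."""
    # The embedded JSON contains commas, so peel two fields off the left
    # and one field off the right instead of splitting on every comma.
    ser_no, country, rest = line.rstrip("\n").split(",", 2)
    detail, date = rest.rsplit(",", 1)
    record = {
        "serno": int(ser_no),          # hypothetical key names, see lead-in
        "country": country,
        "detail": json.loads(detail),  # fails loudly if the JSON is malformed
        "date": date,
    }
    return json.dumps(record)


def convert(lines):
    """Yield one JSON record per non-empty input line."""
    for line in lines:
        if line.strip():
            yield line_to_json(line)


print(line_to_json(SAMPLE))
```

To process a whole file, pass an open file handle (or sys.stdin) to convert() and write each result on its own line, since the SerDe expects one JSON document per line. Depending on your Hive version, date may be a reserved word, so the column may need backticks or a different name.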
