07-09-2024 12:43 PM
Hi all - I am trying to create a Hive table from nested JSON data stored as Parquet. The problem is that one object is dynamic, and I want to store it as a string since its structure changes.

Example JSON:

    {
      "level1": {
        "level2": {
          "key1": "someString",
          "level3": {
            "level4": {
              "key2": 1234,
              "level5": [
                {
                  "this": 1,
                  "changes": false,
                  "each": 12345,
                  "item": 0
                },
                {
                  "something": 12345,
                  "new": true,
                  "here": [ 123 ]
                }
              ]
            }
          }
        }
      }
    }

Here is what I have tried:

    CREATE EXTERNAL TABLE IF NOT EXISTS my_table (
      level1 STRUCT<
        level2: STRUCT<
          key1: STRING,
          level3: STRUCT<
            level4: STRUCT<
              key2: BIGINT,
              level5: ARRAY<STRING>
            >
          >
        >
      >
    )
    STORED AS PARQUET
    LOCATION '/my/json/parquet/'
    TBLPROPERTIES ("parquet.compression"="SNAPPY");

This creates the table successfully, and I can query every level except "level5", where the query falls apart. Is there a way to cast the array in level5 to a string, since its contents always change?

FYI, I have tried ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe', but it does not work with the Parquet format. Please help!
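To make the goal concrete, here is a minimal Python sketch of the result I am after: the dynamic level5 array collapsed into a single JSON string, with everything else left as is. It assumes the records could be rewritten upstream before they are written to Parquet, which may not be possible in every pipeline. The function name stringify_level5 is hypothetical, and the sketch only covers the transformation, not the Parquet write itself.

```python
import copy
import json


def stringify_level5(record):
    """Return a copy of record with the dynamic level5 array stored as a JSON string.

    Hypothetical helper: assumes the level1 -> level2 -> level3 -> level4 path
    from the example above is always present.
    """
    out = copy.deepcopy(record)  # leave the caller's record untouched
    level4 = out["level1"]["level2"]["level3"]["level4"]
    level4["level5"] = json.dumps(level4["level5"], sort_keys=True)
    return out


record = {
    "level1": {
        "level2": {
            "key1": "someString",
            "level3": {
                "level4": {
                    "key2": 1234,
                    "level5": [
                        {"this": 1, "changes": False, "each": 12345, "item": 0},
                        {"something": 12345, "new": True, "here": [123]},
                    ],
                }
            },
        }
    }
}

converted = stringify_level5(record)
level5_text = converted["level1"]["level2"]["level3"]["level4"]["level5"]
print(level5_text)
```

If the data were stored this way, level5 could presumably be declared as a plain STRING in the table definition and individual values pulled out at query time with Hive's get_json_object function. I have not verified this against my data.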
Labels: Apache Hive