- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
I have a HIVE table which contains JSON as a column. I need to extract a struct in this JSON .But Lateral explode is not helping.
- Labels:
-
Apache Hive
Created 02-12-2025 01:38 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The JSON is example is :
{
"app": {
"Id": "123",
"apply": [
{
"flag": "1",
"Product": [
{
"Code": "DZ9e",
"Line": "10000",
"status": "C",
}
{
"Code": "EZ9e",
"Line": "19000",
"status": "C",
}
]
}
}
I need output like
ID | Flag | Code | Line | Status |
123 | 1 | DZ9e | 10000 | C |
123 | 1 | EZ9e | 19000 | C |
I have used lateral explode/lalteral inline explode . But it is not giving value in separate rows.
Anybody has any solution
Created 02-17-2025 06:54 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ahh
@Rich_Learner please use the following query. I just tested. It should work.
WITH json_extract AS (
SELECT
get_json_object(xml_data, '$.app.Id') AS ID,
get_json_object(xml_data, '$.app.apply[0].flag') AS Flag,
regexp_replace(regexp_replace(get_json_object(xml_data, '$.app.apply[0].Product'), '\\[|\\]', ''), '\\}\\,\\{', '\\}\\;\\{') AS products
FROM check
)
SELECT
ID,
Flag,
get_json_object(product_data, '$.Code') AS Code,
get_json_object(product_data, '$.Line') AS Line,
get_json_object(product_data, '$.status') AS Status
FROM json_extract
LATERAL VIEW explode(split(products, ';')) p AS product_data;
Created 02-12-2025 11:41 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Rich_Learner, Welcome to our community! To help you get the best possible answer, I have tagged our Hive experts @smruti who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur,Community Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:
Created 02-13-2025 01:04 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Rich_Learner could you please share your table definition?
show create table <table_name>
I wanted to see if you are using textfile or json serde. Also I could try replicating the issue with that info.
Created 02-13-2025 01:48 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi I am attaching the result for a similar example:
column xml_data contains json
Created 02-13-2025 02:57 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try something like this :
WITH json_extract AS (
SELECT
get_json_object(xml_data, '$.app.Id') AS ID,
get_json_object(xml_data, '$.app.apply[0].flag') AS Flag,
get_json_object(xml_data, '$.app.apply[0].Product') AS products_json
FROM check
)
SELECT
ID,
Flag,
get_json_object(product_item, '$.Code') AS Code,
get_json_object(product_item, '$.Line') AS Line,
get_json_object(product_item, '$.status') AS Status
FROM json_extract
LATERAL VIEW OUTER EXPLODE(split(regexp_replace(regexp_replace(products_json, '\\[|\\]', ''), '\\}\\s*,\\s*\\{', '}|{'), '|')) p AS product_item;
Created 02-13-2025 04:11 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the solution.
Right the code is creating separate columns . But I am getting around 143 rows whereas technically there should be 3 rows and value is coming as NULL.
here consider first column as id
Created 02-13-2025 05:18 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Here Phonetype and Phone availabilityflag are my array which I want in rows
Created 02-13-2025 08:54 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try :
WITH json_extract AS (
SELECT
get_json_object(xml_data, '$.app.Id') AS ID,
get_json_object(xml_data, '$.app.apply[0].flag') AS Flag,
get_json_object(xml_data, '$.app.apply[0].Product') AS products
FROM check
)
SELECT
ID,
Flag,
get_json_object(product_data, '$.Code') AS Code,
get_json_object(product_data, '$.Line') AS Line,
get_json_object(product_data, '$.status') AS Status
FROM json_extract
LATERAL VIEW OUTER EXPLODE(SPLIT(products, '},')) p AS product_data;
Created 02-14-2025 06:53 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks @smruti .
I am getting 3 rows now but the first two values as populated as NULL only the last values is correctly populated.
Created 02-17-2025 06:54 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ahh
@Rich_Learner please use the following query. I just tested. It should work.
WITH json_extract AS (
SELECT
get_json_object(xml_data, '$.app.Id') AS ID,
get_json_object(xml_data, '$.app.apply[0].flag') AS Flag,
regexp_replace(regexp_replace(get_json_object(xml_data, '$.app.apply[0].Product'), '\\[|\\]', ''), '\\}\\,\\{', '\\}\\;\\{') AS products
FROM check
)
SELECT
ID,
Flag,
get_json_object(product_data, '$.Code') AS Code,
get_json_object(product_data, '$.Line') AS Line,
get_json_object(product_data, '$.status') AS Status
FROM json_extract
LATERAL VIEW explode(split(products, ';')) p AS product_data;
