Support Questions

Find answers, ask questions, and share your expertise

JSON parent class extraction from Hadoop table

avatar

I need to extract a  parent JSON column present in Hadoop:

This is an example of my JSOn.Here I need to extract only values products class but not of offer. I cant use index of product as the the number of occurrences might change in future.

Have used the following regex 

regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON

But is is only giving all values in a row , not in multiple rows.

Have also tried something like below, but it doesnt give correct result:

regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON
from prd_table
)
select

get_json_object(prd_item,'$.productCode') as productCode,
get_json_object(prd_item,'$.type') as Ttype
from prd

LATERAL VIEW OUTER EXPLODE(split(prd_JSON, ';')) p AS prd_item) a

 

{
"Customer": {
"CustomerId": "19900101",
"appl": [
{
***Some values
}
]

"products": [
{
"ProductCOde": ABC,
"Type": C1,
"Offer": 16575.0,

"Offer": [
{
"ProductCOde": A1,
"Offer": 2000.0
"Type": CBA,
}

{
"ProductCOde": A2,
"Offer": 2000.0
"Type": DBA,
}
{
"ProductCOde": A3,
"Offer": 3000.0
"Type": BBA,
}
]
}
{
"ProductCOde": XYZ,
"Type": C2,
"Offer": 2001,

"Offer": [
{
"ProductCOde": B1,
"Offer": 2000.0
"Type": BBA,
}

{
"ProductCOde": B2,
"Offer": 2000.0
"Type": BBA,
}
]
}
]
}
}

 

Can anyone point out what I am doing wrong here

 

0 REPLIES 0