Created 02-19-2025 07:41 AM
I have a table with one column as JSON file. I am using lateral explode to parse values.The code is working fine for classes except for one where subclasses have similar variable name :
Json:
{
"Customer": {
"CustomerId": "19900101",
"appl": [
{
***Some values
}
]
"products": [
{
"ProductCOde": ABC,
"Type": C1,
"Offer": 16575.0,
"Offer": [
{
"ProductCOde": A1,
"Offer": 2000.0
"Type": CBA,
}
{
"ProductCOde": A2,
"Offer": 2000.0
"Type": DBA,
}
{
"ProductCOde": A3,
"Offer": 3000.0
"Type": BBA,
}
]
}
{
"ProductCOde": XYZ,
"Type": C2,
"Offer": 2001,
"Offer": [
{
"ProductCOde": B1,
"Offer": 2000.0
"Type": BBA,
}
{
"ProductCOde": B2,
"Offer": 2000.0
"Type": BBA,
}
]
}
]
}
}
Code:
with product as (
select
xml_data,
get_json_object(xml_data,'$.customer.customerid')as applicationid,
regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'), '\\[|\\]', ''), '\\}\\,\\{', '\\}\\;\\{') AS product_JSOn
from Custome_info
)
select
applicationid,
get_json_object(prod_item,'$.ProductCOde') as ProductCOde,
get_json_object(prod_item,'$.type') as ttype
from product
LATERAL VIEW OUTER EXPLODE(split(product_JSOn, ';')) p AS prod_item;
Output
ApplicationID ProductCode ttype
19900101 ABC NULL
19900101 A1 NULL
19900101 A2 NULL
19900101 B1 NULL
I am not able to figure why values of subclasses are populating here
Created 03-05-2025 10:30 AM
To resolve the issue where the ttype column returns NULL and subclasses' values aren't being parsed correctly, follow these steps:
The JSON keys Type and ProductCOde use uppercase letters. Correct the paths in get_json_object to match the exact case:
Use '$.Type' instead of '$.type'.
Ensure '$.ProductCOde' matches the case in the JSON (if it's a typo, adjust accordingly).
Avoid flattening both the outer products array and inner Offer array into the same lateral view. Use separate lateral views for each array level.
WITH product AS (
SELECT
xml_data,
GET_JSON_OBJECT(xml_data, '$.Customer.CustomerId') AS applicationid,
-- Extract products array as a JSON string
GET_JSON_OBJECT(xml_data, '$.Customer.products') AS products_json
FROM Custome_info
),
-- Explode the products array
exploded_products AS (
SELECT
applicationid,
product_item
FROM product
LATERAL VIEW OUTER EXPLODE(
SPLIT(
REGEXP_REPLACE(
REGEXP_REPLACE(products_json, '^\\[|\\]$', ''), -- Remove outer brackets
'\\}\\,\\{', '\\}\\;\\{' -- Split products
),
';'
)
) p AS product_item
),
-- Explode the Offer array within each product
exploded_offers AS (
SELECT
applicationid,
GET_JSON_OBJECT(product_item, '$.ProductCOde') AS ProductCode,
GET_JSON_OBJECT(product_item, '$.Type') AS ttype,
offer_item
FROM exploded_products
LATERAL VIEW OUTER EXPLODE(
SPLIT(
REGEXP_REPLACE(
REGEXP_REPLACE(
GET_JSON_OBJECT(product_item, '$.Offer'),
'^\\[|\\]$', '' -- Remove inner Offer brackets
),
'\\}\\,\\{', '\\}\\;\\{' -- Split Offer items
),
';'
)
) o AS offer_item
)
-- Combine results (products and their offers)
SELECT
applicationid,
COALESCE(
GET_JSON_OBJECT(offer_item, '$.ProductCOde'), -- Offer-level ProductCode
ProductCode -- Product-level ProductCode
) AS ProductCode,
COALESCE(
GET_JSON_OBJECT(offer_item, '$.Type'), -- Offer-level Type
ttype -- Product-level Type
) AS ttype
FROM exploded_offers;
Case Sensitivity Fix: Using '$.Type' ensures the correct extraction of the Type field from the JSON.
Nested Array Handling:
First, explode the products array to get individual product objects.
Then, explode the Offer array within each product to access sub-offer details.
Combined Results: Use COALESCE to merge product-level and offer-level values into a single output, ensuring both levels are represented.