<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question JSON parent class extraction from Hadoop table in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/JSON-parent-class-extraction-from-Hadoop-table/m-p/405490#M252478</link>
    <description>&lt;P&gt;I need to extract a&amp;nbsp; parent JSON column present in Hadoop:&lt;/P&gt;&lt;P&gt;This is an example of my JSOn.Here I need to extract only values products class but not of offer. I cant use index of product as the the number of occurrences might change in future.&lt;/P&gt;&lt;P&gt;Have used the following regex&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;But is is only giving all values in a row , not in multiple rows.&lt;/P&gt;&lt;P&gt;Have also tried something like below, but it doesnt give correct result:&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;from prd_table&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;select &lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;get_json_object(prd_item,'$.productCode') as productCode,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;get_json_object(prd_item,'$.type') as Ttype&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;from prd&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;LATERAL VIEW OUTER EXPLODE(split(prd_JSON, ';')) p AS prd_item) a&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"Customer": {&lt;BR /&gt;"CustomerId": "19900101",&lt;BR /&gt;"appl": [&lt;BR /&gt;{&lt;BR /&gt;***Some values&lt;BR /&gt;}&lt;BR /&gt;]&lt;/P&gt;&lt;P&gt;"products": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": ABC,&lt;BR /&gt;"Type": C1,&lt;BR /&gt;"Offer": 16575.0,&lt;/P&gt;&lt;P&gt;"Offer": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": A1,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": CBA,&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"ProductCOde": A2,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": DBA,&lt;BR /&gt;}&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": A3,&lt;BR /&gt;"Offer": 3000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": XYZ,&lt;BR /&gt;"Type": C2,&lt;BR /&gt;"Offer": 2001,&lt;/P&gt;&lt;P&gt;"Offer": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": B1,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"ProductCOde": B2,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone point out what I am doing wrong here&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 21 Apr 2026 06:20:05 GMT</pubDate>
    <dc:creator>Rich_Learner</dc:creator>
    <dc:date>2026-04-21T06:20:05Z</dc:date>
    <item>
      <title>JSON parent class extraction from Hadoop table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/JSON-parent-class-extraction-from-Hadoop-table/m-p/405490#M252478</link>
      <description>&lt;P&gt;I need to extract a&amp;nbsp; parent JSON column present in Hadoop:&lt;/P&gt;&lt;P&gt;This is an example of my JSOn.Here I need to extract only values products class but not of offer. I cant use index of product as the the number of occurrences might change in future.&lt;/P&gt;&lt;P&gt;Have used the following regex&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;But is is only giving all values in a row , not in multiple rows.&lt;/P&gt;&lt;P&gt;Have also tried something like below, but it doesnt give correct result:&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;regexp_replace(regexp_replace(get_json_object(xml_data, '$.customer.products'),'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') AS prd_JSON&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;from prd_table&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;select &lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;get_json_object(prd_item,'$.productCode') as productCode,&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;get_json_object(prd_item,'$.type') as Ttype&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;from prd&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif" color="#0000FF"&gt;LATERAL VIEW OUTER EXPLODE(split(prd_JSON, ';')) p AS prd_item) a&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"Customer": {&lt;BR /&gt;"CustomerId": "19900101",&lt;BR /&gt;"appl": [&lt;BR /&gt;{&lt;BR /&gt;***Some values&lt;BR /&gt;}&lt;BR /&gt;]&lt;/P&gt;&lt;P&gt;"products": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": ABC,&lt;BR /&gt;"Type": C1,&lt;BR /&gt;"Offer": 16575.0,&lt;/P&gt;&lt;P&gt;"Offer": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": A1,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": CBA,&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"ProductCOde": A2,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": DBA,&lt;BR /&gt;}&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": A3,&lt;BR /&gt;"Offer": 3000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": XYZ,&lt;BR /&gt;"Type": C2,&lt;BR /&gt;"Offer": 2001,&lt;/P&gt;&lt;P&gt;"Offer": [&lt;BR /&gt;{&lt;BR /&gt;"ProductCOde": B1,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"ProductCOde": B2,&lt;BR /&gt;"Offer": 2000.0&lt;BR /&gt;"Type": BBA,&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone point out what I am doing wrong here&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 06:20:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/JSON-parent-class-extraction-from-Hadoop-table/m-p/405490#M252478</guid>
      <dc:creator>Rich_Learner</dc:creator>
      <dc:date>2026-04-21T06:20:05Z</dc:date>
    </item>
    <item>
      <title>Re: JSON parent class extraction from Hadoop table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/JSON-parent-class-extraction-from-Hadoop-table/m-p/407992#M252645</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/123988"&gt;@Rich_Learner&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;can you try this:&lt;/P&gt;&lt;P&gt;"SELECT&lt;BR /&gt;get_json_object(product_json, '$.ProductCOde') AS product_code,&lt;BR /&gt;get_json_object(product_json, '$.Type') AS product_type&lt;BR /&gt;FROM customer_table&lt;BR /&gt;LATERAL VIEW&lt;BR /&gt;json_tuple(json_column, 'Customer') c AS customer_json&lt;BR /&gt;LATERAL VIEW&lt;BR /&gt;json_tuple(customer_json, 'products') p AS products_json&lt;BR /&gt;LATERAL VIEW&lt;BR /&gt;explode(from_json(products_json, 'array&amp;lt;map&amp;lt;string,string&amp;gt;&amp;gt;')) product_table AS product_json&lt;BR /&gt;;""&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;OR&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;WITH cleaned_json AS (&lt;BR /&gt;SELECT&lt;BR /&gt;regexp_replace(&lt;BR /&gt;regexp_replace(&lt;BR /&gt;get_json_object(json_column, '$.Customer.products'),&lt;BR /&gt;'\\}\\s*,\\s*\\{',&lt;BR /&gt;'}~{'&lt;BR /&gt;),&lt;BR /&gt;'\\[|\\]',&lt;BR /&gt;''&lt;BR /&gt;) AS flat_products&lt;BR /&gt;FROM customer_table&lt;BR /&gt;),&lt;BR /&gt;split_json AS (&lt;BR /&gt;SELECT split(flat_products, '~') AS product_array&lt;BR /&gt;FROM cleaned_json&lt;BR /&gt;)&lt;BR /&gt;SELECT&lt;BR /&gt;get_json_object(item, '$.ProductCOde') AS product_code,&lt;BR /&gt;get_json_object(item, '$.Type') AS product_type&lt;BR /&gt;FROM split_json&lt;BR /&gt;LATERAL VIEW explode(product_array) exploded_table AS item;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ensure your JSON keys match case-sensitively and&amp;nbsp;Use consistent JSON structure. If offer&amp;nbsp;is both a number and an array in different objects, consider preprocessing or cleaning up such inconsistencies.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Chethan YM&lt;/P&gt;</description>
      <pubDate>Mon, 05 May 2025 13:35:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/JSON-parent-class-extraction-from-Hadoop-table/m-p/407992#M252645</guid>
      <dc:creator>ChethanYM</dc:creator>
      <dc:date>2025-05-05T13:35:14Z</dc:date>
    </item>
  </channel>
</rss>

