<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: ConvertRecord fails for some files in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400506#M250846</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/119395"&gt;@tono425&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You cannot write an empty struct in Parquet.&lt;/P&gt;&lt;P&gt;This is due to the way the Parquet format works: a Parquet file stores only leaf field data. The intermediate structure is not stored; it is inferred from the schema together with the repetition levels and definition levels of the written leaf fields. An empty struct (which is written as a group) has no leaf fields, which is why Parquet fails to write it. I would suggest changing the output format or filtering out the empty value before converting.&lt;/P&gt;</description>
    <pubDate>Fri, 17 Jan 2025 06:28:46 GMT</pubDate>
    <dc:creator>cloude</dc:creator>
    <dc:date>2025-01-17T06:28:46Z</dc:date>
    <item>
      <title>ConvertRecord fails for some files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400458#M250831</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;I'm trying to convert many records from JSON to Parquet with the ConvertRecord processor.&lt;BR /&gt;Most succeed, but the conversion fails for some files with this error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;org.apache.parquet.schema.InvalidSchemaException: Cannot write a schema with an empty group: optional group pop_pools&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I assume this is because some JSON files contain the following field:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;"pop_pools": {}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Because the source records are log data, we can't modify them.&lt;BR /&gt;Is there any way to avoid this error and convert the records to Parquet format?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jan 2025 08:06:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400458#M250831</guid>
      <dc:creator>tono425</dc:creator>
      <dc:date>2025-01-16T08:06:49Z</dc:date>
    </item>
    <item>
      <title>Re: ConvertRecord fails for some files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400506#M250846</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/119395"&gt;@tono425&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You cannot write an empty struct in Parquet.&lt;/P&gt;&lt;P&gt;This is due to the way the Parquet format works: a Parquet file stores only leaf field data. The intermediate structure is not stored; it is inferred from the schema together with the repetition levels and definition levels of the written leaf fields. An empty struct (which is written as a group) has no leaf fields, which is why Parquet fails to write it. I would suggest changing the output format or filtering out the empty value before converting.&lt;/P&gt;</description>
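      <content:encoded><![CDATA[
The "filter the value before converting" suggestion above could look roughly like the sketch below. It is only an illustration in plain Python, not a NiFi API: the function name drop_empty_structs and the sample record (apart from the pop_pools field quoted in the question) are hypothetical, and it assumes each record is available as parsed JSON in some preprocessing step before the Parquet conversion.

```python
import json


def drop_empty_structs(value):
    """Return a copy of a parsed JSON value without empty-object fields.

    A dict field whose value is {} (or becomes {} after cleaning) is
    dropped from its parent, because Parquet cannot write a group that
    has no leaf fields. List elements are cleaned but never removed.
    """
    if isinstance(value, dict):
        cleaned = {}
        for key, child in value.items():
            child = drop_empty_structs(child)
            # Skip structs that have no fields left.
            if isinstance(child, dict) and not child:
                continue
            cleaned[key] = child
        return cleaned
    if isinstance(value, list):
        return [drop_empty_structs(item) for item in value]
    return value


# Hypothetical log record containing the problematic field.
record = json.loads(
    '{"host": "a1", "pop_pools": {}, "stats": {"ok": 1, "extra": {}}}'
)
print(json.dumps(drop_empty_structs(record)))
# prints {"host": "a1", "stats": {"ok": 1}}
```

Dropping the field changes the shape of the record, so whether this is acceptable depends on whether downstream consumers rely on pop_pools being present; the original log files themselves stay untouched.
]]></content:encoded>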
      <pubDate>Fri, 17 Jan 2025 06:28:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400506#M250846</guid>
      <dc:creator>cloude</dc:creator>
      <dc:date>2025-01-17T06:28:46Z</dc:date>
    </item>
    <item>
      <title>Re: ConvertRecord fails for some files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400700#M250935</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/31734"&gt;@cloude&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Thank you for your answer.&lt;BR /&gt;Now I understand that this is expected behavior.&lt;BR /&gt;I'll consider a solution.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jan 2025 08:20:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/ConvertRecord-fails-for-some-files/m-p/400700#M250935</guid>
      <dc:creator>tono425</dc:creator>
      <dc:date>2025-01-20T08:20:20Z</dc:date>
    </item>
  </channel>
</rss>

