<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Parquet Decimal Datatype Returning an Error in Impala in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346399#M234866</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We are having an issue reading tables in Impala that contain a column of datatype DECIMAL(1,1). The error is:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;column 'DECIMAL_1_1' has an invalid type length. Expecting: 1 len in file 4&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;We have two Cloudera clusters (CDH 6.3.3). This error occurs only in the 2nd cluster.&lt;/P&gt;&lt;P&gt;We write data into Parquet files through IBM DataStage.&lt;/P&gt;&lt;P&gt;I ran the same job in both clusters and examined the schema of the resulting Parquet files using parquet-tools schema. This is what I found:&lt;/P&gt;&lt;P&gt;In the 1st cluster, the Parquet file schema of the column causing the issue is:&lt;/P&gt;&lt;P&gt;optional &lt;STRONG&gt;int32&lt;/STRONG&gt; DECIMAL_1_1 (DECIMAL(1,1));&lt;/P&gt;&lt;P&gt;However, in the 2nd cluster:&lt;/P&gt;&lt;P&gt;optional &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; DECIMAL_1_1 (DECIMAL(1,1));&lt;/P&gt;&lt;P&gt;We need to understand why the type is &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; in this Parquet file while it is &lt;STRONG&gt;int32&lt;/STRONG&gt; in the other. Apparently the &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; type is causing the issue with Impala.&lt;/P&gt;</description>
    <pubDate>Mon, 27 Jun 2022 20:00:24 GMT</pubDate>
    <dc:creator>Meshal</dc:creator>
    <dc:date>2022-06-27T20:00:24Z</dc:date>
    <item>
      <title>Parquet Decimal Datatype Returning an Error in Impala</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346399#M234866</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We are having an issue reading tables in Impala that contain a column of datatype DECIMAL(1,1). The error is:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;column 'DECIMAL_1_1' has an invalid type length. Expecting: 1 len in file 4&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;We have two Cloudera clusters (CDH 6.3.3). This error occurs only in the 2nd cluster.&lt;/P&gt;&lt;P&gt;We write data into Parquet files through IBM DataStage.&lt;/P&gt;&lt;P&gt;I ran the same job in both clusters and examined the schema of the resulting Parquet files using parquet-tools schema. This is what I found:&lt;/P&gt;&lt;P&gt;In the 1st cluster, the Parquet file schema of the column causing the issue is:&lt;/P&gt;&lt;P&gt;optional &lt;STRONG&gt;int32&lt;/STRONG&gt; DECIMAL_1_1 (DECIMAL(1,1));&lt;/P&gt;&lt;P&gt;However, in the 2nd cluster:&lt;/P&gt;&lt;P&gt;optional &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; DECIMAL_1_1 (DECIMAL(1,1));&lt;/P&gt;&lt;P&gt;We need to understand why the type is &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; in this Parquet file while it is &lt;STRONG&gt;int32&lt;/STRONG&gt; in the other. Apparently the &lt;STRONG&gt;fixed_len_byte_array(4)&lt;/STRONG&gt; type is causing the issue with Impala.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Jun 2022 20:00:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346399#M234866</guid>
      <dc:creator>Meshal</dc:creator>
      <dc:date>2022-06-27T20:00:24Z</dc:date>
    </item>
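    <!--
    Editorial sketch for the question above. It assumes the rule from the Parquet
    format specification that a DECIMAL stored as fixed_len_byte_array holds the
    unscaled value as a big-endian two's-complement integer. It also assumes, based
    only on the "Expecting: 1 len in file 4" message, that the reader compares the
    declared length with the smallest length that fits the column precision. The
    function names are hypothetical and are not part of Impala, DataStage, or
    parquet-tools.

```python
from decimal import Decimal


def min_decimal_byte_length(precision: int) -> int:
    """Smallest n such that an n-byte signed integer can hold every
    unscaled value of a decimal with the given precision."""
    if precision < 1:
        raise ValueError("precision must be at least 1")
    max_unscaled = 10 ** precision - 1  # e.g. 9 for precision 1
    n = 1
    # An n-byte two's-complement integer tops out at 2^(8n-1) - 1.
    while 2 ** (8 * n - 1) - 1 < max_unscaled:
        n += 1
    return n


def encode_unscaled(value: Decimal, scale: int, length: int) -> bytes:
    """Encode a decimal as a big-endian two's-complement unscaled integer
    padded to `length` bytes."""
    unscaled = int(value.scaleb(scale))  # 0.5 with scale 1 becomes 5
    return unscaled.to_bytes(length, byteorder="big", signed=True)


if __name__ == "__main__":
    # DECIMAL(1,1): one byte is enough, which matches "Expecting: 1".
    print(min_decimal_byte_length(1))
    # The same value in the minimal layout and in the 4-byte layout from the file.
    print(encode_unscaled(Decimal("0.5"), 1, 1).hex())
    print(encode_unscaled(Decimal("0.5"), 1, 4).hex())
```

    Both byte strings decode to the same value; only the declared length differs,
    which is consistent with the file being valid Parquet while still being
    rejected by a reader that insists on the minimal length.
    -->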
    <item>
      <title>Re: Parquet Decimal Datatype Returning an Error in Impala</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346418#M234868</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/98837"&gt;@Meshal&lt;/a&gt; &lt;SPAN&gt;You are hitting the bug below:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/IMPALA-7087" target="_blank"&gt;https://issues.apache.org/jira/browse/IMPALA-7087&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Let me know if you have any concerns.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jun 2022 01:06:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346418#M234868</guid>
      <dc:creator>balajip</dc:creator>
      <dc:date>2022-06-28T01:06:13Z</dc:date>
    </item>
    <item>
      <title>Re: Parquet Decimal Datatype Returning an Error in Impala</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346435#M234870</link>
      <description>&lt;P&gt;Thanks &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38961"&gt;@balajip&lt;/a&gt;.&lt;/P&gt;&lt;P&gt;I had already checked that page while looking for a solution. What I would like to understand is why the two Parquet files have different types for the &lt;SPAN&gt;DECIMAL(1,1) column, even though both were created by the same job, just in different clusters.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jun 2022 09:24:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346435#M234870</guid>
      <dc:creator>Meshal</dc:creator>
      <dc:date>2022-06-28T09:24:56Z</dc:date>
    </item>
    <item>
      <title>Re: Parquet Decimal Datatype Returning an Error in Impala</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346681#M234963</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/98837"&gt;@Meshal&lt;/a&gt; Thanks for your response.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Please check how the table was created in both clusters: run SHOW CREATE TABLE on each and compare the output.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Solution:&lt;BR /&gt;&lt;SPAN&gt;You can get a patch for &lt;A href="https://issues.apache.org/jira/browse/IMPALA-7087" target="_blank" rel="nofollow noopener noreferrer"&gt;IMPALA-7087&lt;/A&gt; on CDH 6.3.3.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jul 2022 18:22:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/346681#M234963</guid>
      <dc:creator>balajip</dc:creator>
      <dc:date>2022-07-01T18:22:05Z</dc:date>
    </item>
    <item>
      <title>Re: Parquet Decimal Datatype Returning an Error in Impala</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/347084#M235065</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/98837"&gt;@Meshal&lt;/a&gt;&amp;nbsp;Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 19:53:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parquet-Decimal-Datatype-Returning-an-Error-in-Impala/m-p/347084#M235065</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2022-07-07T19:53:17Z</dc:date>
    </item>
  </channel>
</rss>

