Created 06-27-2022 01:00 PM
Hi,
We are having an issue reading tables in Impala that contain a column of type DECIMAL(1,1). The error is:
column 'DECIMAL_1_1' has an invalid type length. Expecting: 1 len in file 4
We have two Cloudera clusters (CDH 6.3.3). This error only occurs in the 2nd cluster.
We write data into Parquet files through IBM Datastage.
I ran the same job in both clusters and examined the schema of the resulting Parquet files using parquet-tools schema. This is what I found:
In the 1st cluster, the Parquet schema of the column causing the issue is:
optional int32 DECIMAL_1_1 (DECIMAL(1,1));
However, in the 2nd cluster:
optional fixed_len_byte_array(4) DECIMAL_1_1 (DECIMAL(1,1));
We need to understand why the type is fixed_len_byte_array(4) in this Parquet file while it's int32 in the other. Apparently the fixed_len_byte_array(4) type is what causes the issue with Impala.
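For context: the Parquet format allows a DECIMAL to be stored physically as int32 (precision up to 9), int64 (precision up to 18), fixed_len_byte_array, or binary, so both schemas above are valid Parquet. The "Expecting: 1 len in file 4" part of the error suggests the reader wants the smallest byte width that fits the declared precision. That is an interpretation of the message, not something confirmed from the Impala source. A minimal sketch of that calculation follows; the helper name min_decimal_bytes is hypothetical:

```python
def min_decimal_bytes(precision):
    """Smallest signed two's-complement byte width that can hold every
    unscaled value of a DECIMAL with the given precision.
    Hypothetical helper for illustration; not an Impala or Parquet API."""
    if precision < 1:
        raise ValueError("precision must be >= 1")
    max_unscaled = 10 ** precision - 1  # e.g. 9 for precision 1
    n = 1
    # An n-byte signed integer can hold values up to 2^(8n-1) - 1.
    while (1 << (8 * n - 1)) - 1 < max_unscaled:
        n += 1
    return n


for p in (1, 9, 18, 38):
    print(p, min_decimal_bytes(p))
```

Under this assumption a DECIMAL(1,1) needs only 1 byte, which would explain why a 4-byte fixed_len_byte_array is rejected even though it can represent the values.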
Created 06-27-2022 06:06 PM
@Meshal You are hitting the bug below.
https://issues.apache.org/jira/browse/IMPALA-7087
Let me know if you have any concerns.
Created 06-28-2022 02:24 AM
Thanks @balajip
I had already checked that page while looking for a solution. What I would like to understand is why the two Parquet files have different physical types for DECIMAL(1,1), even though both were created by the same job, just in different clusters.
Created 07-01-2022 11:22 AM
@Meshal Thanks for your response.
Please check how the table was created in both clusters. Run SHOW CREATE TABLE on each and compare the output.
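One way to compare the two outputs is a plain line diff. The sketch below assumes you have copied each cluster's SHOW CREATE TABLE output into a string; the DDL shown is a made-up placeholder, not output from the clusters in this thread, and diff_ddl is a hypothetical helper name:

```python
import difflib


def diff_ddl(ddl_a, ddl_b):
    """Return only the changed lines between two DDL strings,
    ignoring leading and trailing whitespace on each line."""
    a = [line.strip() for line in ddl_a.strip().splitlines()]
    b = [line.strip() for line in ddl_b.strip().splitlines()]
    diff = difflib.unified_diff(a, b, "cluster1", "cluster2", lineterm="", n=0)
    # Keep the +/- content lines; drop the file headers and @@ hunk markers.
    return [
        line for line in diff
        if line[:1] in "+-" and not line.startswith(("+++", "---"))
    ]


# Placeholder DDL for illustration only.
cluster1_ddl = """
CREATE TABLE t (
  decimal_1_1 DECIMAL(1,1)
)
STORED AS PARQUET
"""
cluster2_ddl = """
CREATE TABLE t (
  decimal_1_1 DECIMAL(9,1)
)
STORED AS PARQUET
"""

for line in diff_ddl(cluster1_ddl, cluster2_ddl):
    print(line)
```

Any line that differs in the column's DECIMAL precision or scale is worth a closer look, since the table metadata and the file schema need to agree.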
Solution:
You can get a patch for IMPALA-7087 for CDH 6.3.3.
Created 07-07-2022 12:53 PM
@Meshal Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
Regards,
Diana Torres,