Support Questions


Error on Creating External Table from CSV in Impala after latest CDP Update


Hi,

 

I am getting an error when I query my table in the Impala editor.

The simple query:

SELECT * FROM GeoIP2_ISP_Blocks_IPv4;

The error:

(screenshot of the Impala error message attached)

The table creation query is given below:

CREATE EXTERNAL TABLE GeoIP2_ISP_Blocks_IPv4 (
Client string,
ID int,
Number int,
Name string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
WITH SERDEPROPERTIES (
"separatorChar" = "\,",
"quoteChar" = "\""
)
LOCATION 'hdfs:///user/hive/warehouse/tips/'
TBLPROPERTIES ("skip.header.line.count"="1");

 

The same error occurs when I create external tables from tab-delimited files. All of this started after the latest CDP update.

 

Could someone please help? It would be really appreciated.

1 ACCEPTED SOLUTION


Hi @Neel_Sharma, the message suggests that the query tried to read the table's data files as if the table were Parquet-based (Parquet may be the default table format in Hive on CDP, but only when no format is specified at creation time).
However, the table creation script you've shared indicates the table should be text (CSV) based.

Can you please verify the table's format by running:
DESCRIBE FORMATTED GeoIP2_ISP_Blocks_IPv4;
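In the "Storage Information" section of the DESCRIBE FORMATTED output, a text-based table should show roughly the following standard Hive classes (exact output varies by version):

SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:    org.apache.hadoop.mapred.TextInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

A Parquet-based table would instead show org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, which would explain the error you are seeing.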

Are you in the right database?

For the second issue - how do you create the external tables from tab delimited files? How are the files uploaded to hdfs?

Thanks

 Miklos


2 REPLIES 2



The issue is resolved by adding STORED AS TEXTFILE to the table creation query.

CREATE EXTERNAL TABLE GeoIP2_ISP_Blocks_IPv4 (
Client string,
ID int,
Number int,
Name string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE    <---------------------- this new line needs to be added
WITH SERDEPROPERTIES (
"separatorChar" = "\,",
"quoteChar" = "\""
)
LOCATION 'hdfs:///user/hive/warehouse/tips/'
TBLPROPERTIES ("skip.header.line.count"="1");

 

Cloudera has changed the default file format from text to Parquet, so this line needs to be specified explicitly for text files.
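For anyone hitting the same problem with tab-delimited files mentioned in the question, the same fix should apply. A minimal sketch (table name, columns, and HDFS path below are hypothetical placeholders):

CREATE EXTERNAL TABLE my_tsv_table (
Client string,
ID int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 'hdfs:///user/hive/warehouse/my_tsv_data/'
TBLPROPERTIES ("skip.header.line.count"="1");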

 

Thanks @mszurap  for the help.