
Loading CSV to Impala fills table with Null values

Explorer

Hello all,

 

I'm trying to do a bulk load from a CSV file into a table in Impala. The table has the same fields as my CSV file, and I'm using the following command to load it:

 

LOAD DATA INPATH '/user/myuser/data/file.csv' INTO TABLE my_database.my_table;

The path is an HDFS path, and my file uses \t (tab) as the field separator.

 

When I execute the statement, everything seems to be okay. Afterwards, count(*) returns exactly the same number of rows as there were lines in my file, but when I do a SELECT, every field in every row is NULL.


I read in the Cloudera documentation that:

"If a text file has fewer fields than the columns in the corresponding Impala table, all the corresponding columns are set to NULL when the data in that file is read by an Impala query."

 

But since my file has the same number of columns as the table, I don't know what the problem is here. Does anybody have an idea or a possible solution?

 

Thank you so much in advance.

Accepted Solutions

Re: Loading CSV to Impala fills table with Null values

Guru
It looks like the table in Impala is not defined with tab as its field delimiter.

I suggest you re-create the table with the following statement:

CREATE TABLE my_table (a int, b int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

Basically, the table definition needs "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'" so that Impala/Hive knows what the delimiter is; otherwise the default Ctrl-A (hex 01) character is used.
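
To see why a wrong delimiter gives you the right row count but NULL everywhere, here is a rough Python model of the behavior. It is only a sketch, not Impala's actual parser: parse_line is a made-up helper, and it assumes every column is INT, as in the example table above. With a STRING first column you would more likely see the whole line land in that column instead.

```python
CTRL_A = "\x01"  # default field delimiter when the table declares none


def parse_line(line, delimiter, num_columns):
    """Simplified model: split one text line and read each field as INT.

    Missing fields and values that do not parse as integers become
    None, standing in for SQL NULL.
    """
    fields = line.rstrip("\n").split(delimiter)
    row = []
    for i in range(num_columns):
        if i >= len(fields):
            row.append(None)  # fewer fields than columns -> NULL
            continue
        try:
            row.append(int(fields[i]))
        except ValueError:
            row.append(None)  # not a valid integer -> NULL
    return row


line = "1\t2\n"  # one tab-separated line of the data file

# Table created without a delimiter clause: the line is never split.
print(parse_line(line, CTRL_A, 2))  # [None, None]

# Table created with FIELDS TERMINATED BY '\t'.
print(parse_line(line, "\t", 2))    # [1, 2]
```

Each line still counts as one row, so count(*) matches the file. But without a Ctrl-A in the data the whole line is treated as a single field, which is not a valid integer, and the remaining columns have no field at all.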

Re: Loading CSV to Impala fills table with Null values

Explorer

I thought I had tried this before, but it seems I didn't do it the right way. I tried once again and it worked. Thank you so much, and sorry for the stupid question.

Re: Loading CSV to Impala fills table with Null values

Guru
Nah, no question is stupid; these are questions that lots of people will run into one day.

Glad I could help :).