separatorChar setting ignored ?

New Contributor

Hi,

 

I am using the following statement to create a CSV file from the result of a SELECT statement:

 

CREATE TABLE Tablename
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = "\t","quoteChar"="\"","escapeChar"="\\")    
STORED AS TEXTFILE
LOCATION '[path]/filename'
AS
SELECT ... ;

 

I also tried "separatorChar" = "|", but that does not work either.

 

The result always uses a comma as the separator. Does anybody see a typo or error in my statement, or know of problems with the separatorChar setting of the CSV SerDe?

 

Example line from the resulting CSV file:

"1","Steven","12345678","2014-08-06","2014-08-06","1.0","0.0","PC","","","","","","","","","",""

 

Accepted Solutions
Re: separatorChar setting ignored ?

Guru
This is a reported bug. You can work around it by splitting the "CREATE TABLE ... AS SELECT ..." statement into two steps.

1) Create the table first, with explicit column definitions:

CREATE TABLE separator_test (
  id int,
  name string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = "\t","quoteChar"="\"","escapeChar"="\\")
STORED AS TEXTFILE;

2) Then insert the data into the newly created table:

INSERT OVERWRITE TABLE separator_test SELECT * FROM table_name;

This forces Hive to bypass the bug and insert the data correctly.

Re: separatorChar setting ignored ?

New Contributor

A small remark: Eric later also advised me to add the CSV SerDe jar explicitly, because the statement still did not run successfully.

This is vital. Because this SerDe is located in Hive's standard lib directory, the "ADD JAR" should not be necessary, but at this point in time it seems to be.

 

Here is that possibly important additional information from Eric, who worked on the support case I opened after posting my question here in the community:

 

"ADD JAR /opt/cloudera/parcels/CDH/lib/hive/lib/opencsv-2.3.jar;

 

Do "ls /opt/cloudera/parcels/CDH/lib/hive/lib/opencsv*" to see whether you have the same version as mine, then try again.

The problem seems to be fixed in 5.4." (Note: my CDH version was 5.3.3.)

 

 

After adding the SerDe jar and creating the table before running the INSERT OVERWRITE TABLE statement, the separatorChar was honored and I got the table format I needed.
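
For reference, the pieces described in this thread can be put together roughly as follows. This is only a sketch of the sequence, not a verified script: the jar path and version are the ones Eric quoted and may differ on your cluster, and separator_test, its two columns, and table_name are the example names from the accepted answer, not my real table. A LOCATION clause like the one in my original statement could be added to the CREATE TABLE if needed.

```sql
-- Step 0: register the OpenCSV jar for this session.
-- Path and version as quoted above; check what is installed on your cluster first.
ADD JAR /opt/cloudera/parcels/CDH/lib/hive/lib/opencsv-2.3.jar;

-- Step 1: create the table on its own, so the SERDEPROPERTIES are stored
-- with the table definition instead of going through CREATE TABLE ... AS SELECT.
CREATE TABLE separator_test (
  id int,
  name string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = "\t","quoteChar"="\"","escapeChar"="\\")
STORED AS TEXTFILE;

-- Step 2: fill the table in a separate statement.
INSERT OVERWRITE TABLE separator_test
SELECT * FROM table_name;

-- Optional: inspect the stored table definition, including the SerDe properties.
DESCRIBE FORMATTED separator_test;
```

On CDH 5.4 or later the ADD JAR line may no longer be needed, going by Eric's comment above.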

Re: separatorChar setting ignored ?

Community Manager

Thank you so much for posting the additional solution information MarcusB. It is greatly appreciated. 

 


Cy Jervis, Community Manager
