In hive, One of my column data contains Pipe as the part of the data('|'), however, while exporting data from this table, we need to export the data as the pipe('|') as the delimiter between each fields, How to handle if the delimiters as part of the data while creating the flat file from the hive table.
Can you enclose your data fields in double quotes, "", when exporting to HDFS?
Notice how I have pipe (|) as the delimiter and it is also part of the column data, but my columns are enclosed in double quotes.
Then use the OpenCSV Serde when creating the DDL for your Hive table.
For example, if I wanted to create a Hive table for the sample record above, and the file is already in a directory named 'test' in HDFS, then my DDL statement would be:
create external table my_table (
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES (
"separatorChar" = "|",
"quoteChar" = "\""
Now you can query the table and Hive will correctly display the data.
Thanks and Yah .. Open Csv serde will do it.However, i am looking if there any other alternatives.
Could you let me know if the data is not in Quto's, How to Handle it and below is the example
column 1| column 2
In the above example the first|second are actually one column.
Could you let me know how to handle if the data is not in quoto's and if the delimiter is part of the data.
Any suggestion or help is appreciated.
Hi Reddy.. Choose a delimiter which will not used easily in a data. Choose unicode as delimiter it will solve your issue. 90% of the data will not contain unicode. (row format delimited Fields terminated by '/u0001') . In your case export the the data with '/u0001' as delimiter and then insert into a hive table which has delimiter as '|'