I have text data already in HDFS. It's comma-separated, but some of the cells themselves contain commas, so creating the table in the straightforward way with the clause
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
also splits on the commas inside a cell, which I would like to prevent. There is a custom SerDe for Hive that lets you choose separators to avoid this problem (see http://dev.bizo.com/2010/11/csv-and-hive.html).
Is there something similar for Impala?
If not, is there a workaround?
I'm using CDH4.4.
Thanks for the help and for being a great service.
If you want to keep the comma as the separator, you can use the ESCAPED BY clause to define an escape character for the table
(usually ESCAPED BY '\\' to use the familiar backslash escape)
and then rewrite any commas within the field values in the data files as \,
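
As a rough sketch of the ESCAPED BY approach, the table definition might look like the following. The table name, column names, and HDFS path are hypothetical placeholders, not taken from your setup, and the data files would still need to be rewritten so that every comma inside a value appears as \, before Impala reads them.

```sql
-- Hypothetical table, columns, and location; adjust to match the real data.
CREATE EXTERNAL TABLE csv_escaped (
  id INT,
  description STRING   -- may contain commas, stored in the file as \,
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  ESCAPED BY '\\'       -- backslash is the escape character
STORED AS TEXTFILE
LOCATION '/user/hypothetical/csv_escaped';
```

With this definition, a line such as 1,Smith\, John should load as two columns, with the second holding the embedded comma.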
Or, to use | or \t or some other character that never appears in the data as the separator, you can specify it with FIELDS TERMINATED BY, as Gwen suggested.
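
A minimal sketch of the alternative-separator variant, again with hypothetical table, column, and path names. It assumes the files have already been regenerated with | between fields and that | does not occur inside any value.

```sql
-- Hypothetical table, columns, and location; assumes pipe-delimited files.
CREATE EXTERNAL TABLE csv_piped (
  id INT,
  description STRING   -- commas inside the value need no escaping here
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'   -- use '\t' instead for tab-separated files
STORED AS TEXTFILE
LOCATION '/user/hypothetical/csv_piped';
```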
Hope this helps.