
how to read csv have comma in cell and new line character as well

How can I read a CSV file that has commas and newline characters inside cells? For example, I have some columns with descriptions such as (hi,how are you).

I am trying to read it with Spark in Scala.

Please guide.

9 REPLIES
Re: how to read csv have comma in cell and new line character as well

Cloudera Employee

I would recommend not using CSV in your case. If you have commas in the fields, then you can't really delimit them with commas because, as you have noticed, you will get field breaks in the middle of a field.

Can you get the source data exported some other way?

Re: how to read csv have comma in cell and new line character as well

Right, and thank you. What format do you suggest?

Re: how to read csv have comma in cell and new line character as well

What do you think: would it work if I load the file directly into MySQL, replace the commas there, and then read the result with Spark?

Re: how to read csv have comma in cell and new line character as well

Cloudera Employee

Where is the data coming from? You could use a binary format like Avro or Parquet if your source system can export that way. If you MUST have a text file with a delimiter, you need a delimiter that is not anywhere in the data.
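
As a rough illustration of the last point, here is a minimal Scala sketch that checks which delimiter is safe for a given set of field values. The object name `DelimiterCheck`, the function `safeDelimiter`, and the list of candidate characters are all hypothetical choices made for this example, not part of any library.

```scala
object DelimiterCheck {
  // Candidate delimiters to try, in order of preference.
  // This list is an assumption for illustration; adjust it to your data.
  val candidates: Seq[Char] = Seq('|', '\t', ';', '\u0001')

  // Returns the first candidate that occurs in none of the field values,
  // or None if every candidate appears somewhere in the data.
  def safeDelimiter(values: Iterable[String]): Option[Char] =
    candidates.find(c => !values.exists(v => v.indexOf(c) >= 0))

  def main(args: Array[String]): Unit = {
    val descriptions = Seq("hi,how are you", "a|b", "plain text")
    safeDelimiter(descriptions) match {
      case Some(c) => println(s"Safe delimiter code point: ${c.toInt}")
      case None    => println("No candidate delimiter is safe for this data")
    }
  }
}
```

Note that a different field delimiter does not by itself help with line breaks inside a field, because most text readers still treat each line as one record.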

Re: how to read csv have comma in cell and new line character as well

It is coming from an Oracle database.

Re: how to read csv have comma in cell and new line character as well

Cloudera Employee

Can you use sqoop to retrieve the data directly from the database and dump it into Hive? That will solve your delimiter problem.

Re: how to read csv have comma in cell and new line character as well

Great option. Thank you very much.

Re: how to read csv have comma in cell and new line character as well

New Contributor

Have you tried using OpenCSVSerde in the DDL?

CREATE TABLE tablename (columns)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE;

Thanks,

Manoj

Re: how to read csv have comma in cell and new line character as well

Actually, I am getting the file via FTP and reading it with Spark. I didn't try to push it directly to Hive because I have to do some enrichment first. This is a big problem for me.

Thanks
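
For anyone reading the file directly with Spark, as in the post above, the following sketch may help. It assumes the file wraps any field containing commas or line breaks in double quotes and doubles embedded quotes (RFC 4180 style); the thread does not confirm how the file is actually quoted. It uses the CSV reader's `multiLine`, `quote`, and `escape` options, which are available from Spark 2.2 onward. The object name, the function name, and the sample data are made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadQuotedCsv {
  // Reads a CSV whose quoted fields may contain commas and line breaks.
  // Assumes a header row and double-quoted fields (an assumption, see above).
  def readQuotedCsv(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("multiLine", "true") // allow a quoted field to span several lines
      .option("quote", "\"")       // fields are wrapped in double quotes
      .option("escape", "\"")      // a doubled quote inside a field is a literal quote
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-quoted-csv")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical sample: one field with a comma, one with a line break.
    val sample =
      "id,description\n1,\"hi,how are you\"\n2,\"first line\nsecond line\"\n"
    val path = Files.createTempFile("sample", ".csv")
    Files.write(path, sample.getBytes(StandardCharsets.UTF_8))

    readQuotedCsv(spark, path.toString).show(truncate = false)
    spark.stop()
  }
}
```

If the file does not quote such fields at all, these options cannot recover the original columns, and the export itself would need to change.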
