CSV comma delimiter split in Spark RDD, but NOT splitting commas within double quotes

I have a CSV file with the following data:

id,name,comp_name
1,raj,"rajeswari,motors"
2,shiva,amber kings

My requirement is to read this file into a Spark RDD and then split each line on the comma delimiter with map. But the code below splits on every comma, including the one inside the quotes: val splitdata = data.map(_.split(","))

I do not want to split on commas that are within double quotes, and I do NOT want to use a regex. Is there a simple, efficient method to achieve this?
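One possible regex-free approach is to walk each line character by character and track whether the current position is inside double quotes. The sketch below is only an illustration: splitCsvLine and its keepQuotes parameter are hypothetical names, not part of Spark, and it assumes fields contain no escaped quotes ("") and no embedded newlines.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper (not a Spark API): splits one CSV line on commas
// that are outside double quotes.
// Assumption: no escaped quotes ("") and no newlines inside a field.
def splitCsvLine(line: String, keepQuotes: Boolean = false): Array[String] = {
  val fields = ArrayBuffer.empty[String]
  val current = new StringBuilder
  var inQuotes = false
  for (ch <- line) {
    ch match {
      case '"' =>
        // Toggle the quoted state; keep the quote character only if asked to.
        inQuotes = !inQuotes
        if (keepQuotes) current.append(ch)
      case ',' if !inQuotes =>
        // A comma outside quotes ends the current field.
        fields += current.toString
        current.clear()
      case other =>
        current.append(other)
    }
  }
  // Flush the last field, which has no trailing comma.
  fields += current.toString
  fields.toArray
}
```

With the RDD from the question (data, an RDD[String]), this could be applied as data.map(line => splitCsvLine(line)). Passing keepQuotes = true leaves the surrounding double quotes in the field value.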

 

My second requirement is to read the same CSV file into a Spark DataFrame and show it, but I need the double quotes to appear in the result. The output should look like this table:

id    name     comp_name
1     raj      "rajeswari,motors"
2     shiva    amber kings

Double quotes are normally not shown when you read a CSV into a DataFrame. Is there any way to keep them?
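One way this might be approached is to let Spark's CSV reader parse the quoted field as usual and then wrap the values that contain a comma in double quotes again before calling show. This is a sketch under assumptions, not a confirmed answer: it reconstructs the quotes rather than preserving them, so a field that was quoted in the file but has no comma would come out unquoted. The sample rows are built in memory to keep the snippet self-contained; for a real file, pass its path to csv(...) instead. It is written as a script for spark-shell or the body of a main method.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, lit, when}

val spark = SparkSession.builder()
  .appName("csv-quotes-sketch") // hypothetical application name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// The sample lines from the question, held in memory instead of a file.
val lines = Seq(
  "id,name,comp_name",
  "1,raj,\"rajeswari,motors\"",
  "2,shiva,amber kings"
).toDS()

// The default quote character (") keeps "rajeswari,motors" as one field,
// but the parser strips the quotes from the value.
val df = spark.read.option("header", "true").csv(lines)

// Assumption: only values containing a comma were quoted in the source.
val quoted = df.withColumn(
  "comp_name",
  when(col("comp_name").contains(","),
    concat(lit("\""), col("comp_name"), lit("\"")))
    .otherwise(col("comp_name"))
)

// truncate = false so long values are printed in full.
quoted.show(false)
```

If every originally quoted field has to keep its quotes regardless of content, reading the file as plain text and splitting each line by hand while retaining the quote characters may be the safer route.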

I am using Spark 2.4 / Scala 2.11 / Eclipse IDE.

1 Reply

Hi! Did you find the answer? I'm dealing with the same issue right now.