
Escape backslash (\) while writing a Spark dataframe to CSV

New Contributor

I am using Spark version 2.4.0. I know that backslash is the default escape character in Spark, but I am still facing the issue below.


I am reading a CSV file into a Spark dataframe (using PySpark) and then writing the dataframe back out as CSV.

I have some "\\" sequences in my source CSV file (as shown below), where the first backslash is the escape character and the second backslash is the actual value.

 

Test.csv (Source Data)
--------
Col1,Col2,Col3,Col4
1,"abc\\",xyz,Val2
2,"\\",abc,Val2
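For reference, this is how an escape-aware CSV parser is expected to interpret those source rows — a minimal sketch using Python's standard csv module (not Spark's parser, but the same backslash-escaping convention):

```python
import csv
import io

# Source rows as raw text: backslash is the escape character,
# so "abc\\" encodes the literal value abc\ (one trailing backslash).
raw = '1,"abc\\\\",xyz,Val2\n2,"\\\\",abc,Val2\n'

rows = list(csv.reader(io.StringIO(raw), escapechar='\\'))
print(rows[0])  # ['1', 'abc\\', 'xyz', 'Val2'] -> Col2 holds abc\
print(rows[1])  # ['2', '\\', 'abc', 'Val2']   -> Col2 holds a single \
```

So after the read, Col2 of row 2 should contain exactly one literal backslash.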

 

I am reading Test.csv and creating the dataframe with the following code:
df = sqlContext.read.format('com.databricks.spark.csv').schema(schema).option("escape", "\\").options(header='true').load("Test.csv")

 

Then I write the dataframe back out to Output.csv using the code below:
df.repartition(1).write.format('csv').option("emptyValue", empty).option("header", "true").option("escape", "\\").option("path", r'D:\TestCode\Output.csv').save()
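For comparison, this is the serialization being asked for: every field quoted, with the escape character itself escaped on output. A standalone sketch in plain Python (the `quote_all_row` helper is made up for illustration; it is not Spark's writer):

```python
def quote_all_row(fields, quote='"', escape='\\'):
    """Quote every field, escaping the escape char and the quote char."""
    out = []
    for value in fields:
        escaped = value.replace(escape, escape + escape)  # \  -> \\
        escaped = escaped.replace(quote, escape + quote)  # "  -> \"
        out.append(quote + escaped + quote)
    return ','.join(out)

# Row 2 of the source data: Col2 holds a single literal backslash.
print(quote_all_row(['2', '\\', 'abc', 'Val2']))  # "2","\\","abc","Val2"
```

In Spark itself, the closest writer knobs are `quoteAll` (quote every field) together with `escape`; whether they reproduce exactly this output on 2.4.0 would need to be verified.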

Output.csv
----------
Col1,Col2,Col3,Col4
1,"abc\\",xyz,Val2
2,\,abc,Val2

 

In the 2nd row of Output.csv, the escape character is lost along with the quotes ("").
My requirement is to retain the escape character in Output.csv as well. Any help would be much appreciated.
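The danger with the unquoted row 2 is that it no longer round-trips: on the next read, the bare backslash escapes the comma that follows it. A minimal sketch with Python's csv module (same escaping convention, not Spark's parser):

```python
import csv
import io

# Row 2 as it appears in Output.csv: the backslash is bare and unquoted.
broken = '2,\\,abc,Val2\n'

row = next(csv.reader(io.StringIO(broken), escapechar='\\'))
# The backslash escapes the comma, merging two fields into one:
print(row)  # ['2', ',abc', 'Val2'] -- 3 fields instead of 4
```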

 

Thanks in advance

1 REPLY

Master Collaborator

Can you show the output of a print/show of your dataframe df? That way we can tell how Spark interprets the read, and whether the problem is with the read or the write operation.