Created 07-22-2016 08:45 PM
How to save the data inside a dataframe to text file in csv format in HDFS?
I tried the following, but csv doesn't seem to be a supported format:
df.write.format("csv").save("/filepath")
Created 07-22-2016 08:54 PM
The best way to save a DataFrame to a CSV file is to use the spark-csv library provided by Databricks.
It supports almost all the features you will encounter when working with CSV files.
spark-shell --packages com.databricks:spark-csv_2.10:1.4.0
Then use the library API to save to CSV files:
df.write.format("com.databricks.spark.csv").option("header", "true").save("file.csv")
It also supports reading from CSV files with a similar API:
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("file.csv")
You could also write some custom code to build the output string using mkString, but it isn't safe if the data contains special characters (such as the delimiter itself), and it won't handle quoting, etc.
df.map(x => x.mkString("|")).saveAsTextFile("file.csv")
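If you do go the custom route, a minimal sketch of the quoting that a bare mkString skips might look like the following. CsvLine, escape and toLine are hypothetical names made up for this illustration, not part of Spark or spark-csv, and the quoting rule (wrap the field in double quotes and double any embedded quotes) is the usual RFC 4180 convention rather than anything spark-csv specific.

```scala
// Hypothetical helper for illustration only; not part of Spark or spark-csv.
object CsvLine {
  // Quote a field when it contains the separator, a double quote, or a line break.
  // Embedded double quotes are doubled (RFC 4180 style). Nulls become empty fields.
  def escape(field: Any, sep: Char = ','): String = {
    val s = if (field == null) "" else field.toString
    val needsQuoting = s.exists(c => c == sep || c == '"' || c == '\n' || c == '\r')
    if (needsQuoting) "\"" + s.replace("\"", "\"\"") + "\"" else s
  }

  // Join the escaped fields of one record into a single output line.
  def toLine(fields: Seq[Any], sep: Char = ','): String =
    fields.map(escape(_, sep)).mkString(sep.toString)
}
```

Assuming df is the DataFrame from the question, it could then be used as df.rdd.map(row => CsvLine.toLine(row.toSeq)).saveAsTextFile("/filepath"). This still writes no header line and does nothing about type-specific formatting, so treat it as a fallback rather than a replacement for the library.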
Created 12-06-2016 06:42 PM
@Qi Wang I don't think the Databricks CSV library is available in the exam.
Your approach with mkString() works well if there is no header required in the output csv file. Can I assume that in the exam tasks?
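In case a header does turn out to be required, here is a rough sketch of how one could be added without the library. HeaderLines and withHeader are hypothetical names I made up for this; the idea is just to put the column names in front of the lines of the first partition only, so that the header ends up at the top of part-00000.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object HeaderLines {
  // Prepend the header line to partition 0 and leave all other partitions unchanged.
  def withHeader(partitionIndex: Int,
                 columns: Seq[String],
                 lines: Iterator[String],
                 sep: String = ","): Iterator[String] =
    if (partitionIndex == 0) Iterator(columns.mkString(sep)) ++ lines
    else lines
}
```

Assuming df is the DataFrame and "|" is the separator, this could be wired in with val cols = df.columns.toSeq followed by df.rdd.map(_.mkString("|")).mapPartitionsWithIndex((i, it) => HeaderLines.withHeader(i, cols, it, "|")).saveAsTextFile("/filepath"). The header then appears only in the first part file, so anything that reads the part files separately has to account for that.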