Since the HDPCD-Spark exam runs in a restricted environment, we can't download any third-party jars (such as spark-csv).
We can save/load a DataFrame to/from the JSON, ORC, and Parquet file formats out of the box.
However, there is no built-in support for CSV files.
Hence, my question is:
How can I save a DataFrame to a CSV file, and load one back, using only the Spark Core or Spark SQL APIs?
There is no built-in CSV writer in this Spark version, so query the DataFrame and then write each Row out as a comma-joined line of text:
val myDFTable = sqlContext.sql("SELECT col1, col2, col3 FROM myTempTable WHERE col2 > 1000")
myDFTable.map(_.mkString(",")).saveAsTextFile("output.csv")
Note that saveAsTextFile produces a directory named output.csv containing part files, not a single file, and this naive approach does not quote fields that themselves contain commas.
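For the reverse direction (loading a CSV into a DataFrame), a minimal sketch using only Spark Core plus Spark SQL implicits, in the same Spark 1.x style as above. The file path input.csv, the case class Record, and its column names/types are assumptions for illustration, and the split on "," assumes no quoted fields containing commas:

case class Record(col1: String, col2: Int, col3: String)

// Read the raw lines, split each into fields, and map to the case class.
val recordsRDD = sc.textFile("input.csv")
  .map(_.split(","))
  .map(a => Record(a(0), a(1).trim.toInt, a(2)))

// toDF() infers the schema from the case class fields.
import sqlContext.implicits._
val df = recordsRDD.toDF()
df.registerTempTable("myTempTable")

If the file has a header row, filter it out before the map (for example by dropping lines equal to the header string).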