Any help is greatly appreciated. I am planning to take the exam again soon, but my question is: what are my options if I get the same error while writing a file, even though my inputs are correct?
From Spark 2.0.2 onward you can write CSV without the Databricks spark-csv package (the DataFrameWriter also supports it natively via df.write.csv("out")), or you can build the lines yourself:
df.rdd.map(line => line.mkString(",")).saveAsTextFile("out.csv")
Hi, I took the Spark certification exam and ran into the same issue.
We cannot download packages while launching Spark, as the environment is locked against any outside downloads.
--packages pulls the spark-csv package from the Maven repository.
In Python, you can use the csv package that ships as part of the standard library.
You do not have access to Pandas, scikit-learn, or any package managers such as pip and easy_install; they are locked out of the certification environment.
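Since the stdlib csv module is available, one workable pattern is to collect a (small) DataFrame to the driver and write it out with csv.writer. A minimal sketch; the rows list below stands in for the result of a hypothetical df.collect(), and the column names are made up:

```python
import csv

# Hypothetical rows, e.g. the result of df.collect() on a small DataFrame;
# each element behaves like a tuple of column values.
rows = [(1, "alice", 3.5), (2, "bob", 4.0)]

with open("out.csv", "w", newline="") as f:
    writer = csv.writer(f)  # handles quoting/escaping for you
    writer.writerow(["col1", "col2", "col3"])  # header row
    writer.writerows(rows)
```

Note this only makes sense for results that fit in driver memory; for large outputs you still want a distributed write.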
See the end of my comment above. If you can copy jar files onto the cluster manually, you can use --jars instead of --packages.
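If the jars are already on (or can be copied to) the machine, the launch would look something like this; the paths and versions here are hypothetical, so point --jars at wherever the spark-csv jar and its dependency jars actually live:

```shell
# Hypothetical local paths; --jars takes a comma-separated list and
# needs no network access, unlike --packages.
spark-shell --jars /path/to/spark-csv_2.10-1.5.0.jar,/path/to/commons-csv-1.1.jar
```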
Assuming the DataFrame has three columns (col1: Int, col2: String, col3: Float), the CSV output can be produced as:
df.map(x => x.getInt(0) + "," + x.getString(1) + "," + x.getFloat(2)).saveAsTextFile("out.csv")
--- OR ---
df.map(x => x(0) + "," + x(1) + "," + x(2)).saveAsTextFile("out.csv")
Note that saveAsTextFile creates a directory named out.csv containing part files, not a single file.
I hit the same problem today; how did you solve it? How do you write a CSV file from a Hive table without Databricks, which is obviously not present in the Spark 1.6 used on the exam? I'm discouraged that I could not complete the exam.