Support Questions
Find answers, ask questions, and share your expertise

I attempted the Spark Certification Exam today and found that I could not write a CSV file, though I executed my .py task as below.


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Explorer

Any help is greatly appreciated. I am planning to take the exam again soon, but my question is: what are my options if I get the same error while writing a file, even though my inputs are correct?


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

New Contributor

As of Spark 2.0.2, you can write CSV without the Databricks spark-csv package:

df.write.format("csv").save("test.csv")

Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Expert Contributor
df.rdd.map(line => line.mkString(",")).saveAsTextFile("out.csv")
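In PySpark (relevant here since the exam task is a .py file), the same idea is to map each Row to a comma-joined string and save it as text. A minimal sketch, assuming a DataFrame `df` already exists; note this naive join does not quote or escape fields:

```python
def row_to_csv(row):
    """Join a Row's values with commas; None becomes an empty field.
    No quoting/escaping, so fields containing commas will break the output."""
    return ",".join("" if c is None else str(c) for c in row)

# In a Spark session this would be applied as:
#   df.rdd.map(row_to_csv).saveAsTextFile("out_dir")
```

This writes one text file per partition into `out_dir`, the same behavior as `saveAsTextFile` on any RDD.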

Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Contributor

Hi, I took the Spark certification exam and had the same issue.

We cannot download packages while launching Spark, as the environment is locked down from any outside downloads.

--packages pulls the spark-csv package from the Maven repository.

Amit


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Contributor

In Python, you can use the csv package that comes as a standard library with the Python install.

You do not have access to Pandas, SKLearn, or any package managers such as pip and easy_install; they are locked out of the certification environment.
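For example, the standard csv module handles the quoting and escaping that a naive string join misses. A sketch of formatting rows per partition; the commented `mapPartitions` call assumes a DataFrame `df`:

```python
import csv
import io

def partition_to_csv(rows):
    """Format an iterator of row tuples as properly quoted CSV lines."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # default dialect quotes fields containing commas
    for row in rows:
        writer.writerow(row)
    buf.seek(0)
    return (line.rstrip("\r\n") for line in buf)

# Applied once per partition in Spark:
#   df.rdd.mapPartitions(partition_to_csv).saveAsTextFile("out_dir")
```

Using `mapPartitions` keeps the per-partition overhead of building the writer to once per partition rather than once per row.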


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Cloudera Employee

See the end of my comment above. If you can copy jar files manually, you can use --jars instead of --packages.
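A sketch of the difference on the command line, assuming the spark-csv jar and its commons-csv dependency have already been copied to the local filesystem (the exact jar names, versions, and paths here are assumptions):

```shell
# --packages downloads from Maven, which is blocked in the exam environment:
# spark-submit --packages com.databricks:spark-csv_2.10:1.5.0 my_task.py

# --jars uses jars already present on the local filesystem instead:
spark-submit --jars /path/to/spark-csv_2.10-1.5.0.jar,/path/to/commons-csv-1.1.jar my_task.py
```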


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

Explorer

Assuming the DataFrame has three columns (col1: Int, col2: String, col3: Float), the CSV file output can be achieved as:

df.map(x => (x.getInt(0) + "," + x.getString(1) + "," + x.getFloat(2))).saveAsTextFile("out.csv")

--- OR ---

df.map(x => (x(0) + "," + x(1) + "," + x(2))).saveAsTextFile("out.csv")


Re: I attempted Spark Certification Exam today and found, I could not write a CSV file though I executed my .py task as below.

New Contributor
I had the same problem today. How did you solve it? How do you write a CSV file from a Hive table without the Databricks package, which is obviously not present in the Spark 1.6 used in the exam? I feel discouraged; I could not complete the exam.