
HDPCD-Spark: For questions that require writing the output to a CSV file, is the requirement one file or multi-part files?


As you know, Spark's default behavior is to create multi-part files when using the .saveAsTextFile API. However, it is not clear to me whether questions in the HDPCD-Spark certification exam expect a single file or multi-part files as output. Can somebody shed some light on this? I typically write the output by concatenating the fields, separated by commas, and calling .saveAsTextFile, i.e. .map(x => x._1 + "," + x._2 + "," + ...).saveAsTextFile(file_name.csv). Is there anything wrong with this approach?
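
For reference, here is a minimal sketch of the two output shapes being asked about. It assumes a local Spark installation using the RDD API; the object name, the sample rows, and the output paths are hypothetical and only there for illustration. It does not say which shape the exam expects. It only shows that .saveAsTextFile writes a directory of part files, one per partition, and that calling coalesce(1) first yields a single part file inside that directory.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name; any driver program or spark-shell session would do.
object CsvOutputSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("csv-output-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical sample data standing in for the real RDD of tuples,
    // split into 2 partitions so the multi-part behavior is visible.
    val rows = sc.parallelize(Seq((1, "alice", 3.5), (2, "bob", 4.0)), numSlices = 2)

    // Join each tuple's fields with commas. Tuple fields are accessed as
    // _1, _2, ...; productIterator walks them in order.
    val lines = rows.map(_.productIterator.mkString(","))

    // Default behavior: "output_multi.csv" becomes a DIRECTORY containing
    // one part-NNNNN file per partition (two here).
    // Note: saveAsTextFile fails if the target directory already exists.
    lines.saveAsTextFile("output_multi.csv")

    // coalesce(1) merges everything into one partition first, so the
    // directory contains a single part-00000 file.
    lines.coalesce(1).saveAsTextFile("output_single.csv")

    sc.stop()
  }
}
```

Either way the path passed to .saveAsTextFile names a directory rather than a plain file, and this simple concatenation does not quote or escape fields that themselves contain commas.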