How to save each column of a Spark DataFrame to an individual file without using a for loop?

Hi,

I have a DataFrame with typically 100+ columns, and I need to save each column to a separate file.

I can do this by looping over the columns, but that does not take advantage of Spark's parallelism.

Can someone suggest a better approach that makes use of parallelism?

Input:

Name,Age,Gender
John,25,Male
Amy,29,Female
Doe,36,Male

Desired output:

The Name folder should contain a part file with the values John, Amy and Doe.

The Age folder should contain a part file with the values 25, 29 and 36.

The Gender folder should contain a part file with the values Male, Female and Male.
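
One possible direction, offered as a sketch rather than a confirmed answer: Spark's scheduler is thread-safe, so independent write jobs can be submitted from several driver threads at once instead of strictly one after another. The example below shows only the thread-pool pattern, in plain Python on the sample data above. The names `write_column_locally`, `save_columns_in_parallel` and `read_values` are hypothetical, and the local file write is an assumed stand-in for a per-column Spark write such as `df.select(column).write.mode("overwrite").csv(path)`.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Sample data from the question; with Spark this would be the DataFrame.
HEADER = ["Name", "Age", "Gender"]
ROWS = [
    ["John", "25", "Male"],
    ["Amy", "29", "Female"],
    ["Doe", "36", "Male"],
]


def write_column_locally(base_dir, column):
    """Hypothetical stand-in for a per-column Spark write.

    With a real DataFrame `df`, the body would instead be something like:
        df.select(column).write.mode("overwrite").csv(os.path.join(base_dir, column))
    """
    idx = HEADER.index(column)
    out_dir = os.path.join(base_dir, column)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "part-00000")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in ROWS:
            writer.writerow([row[idx]])
    return path


def save_columns_in_parallel(base_dir, columns, write_one, max_workers=8):
    """Submit one write per column to a thread pool and wait for all of them."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {c: pool.submit(write_one, base_dir, c) for c in columns}
        # result() re-raises any exception that occurred in the worker thread.
        return {c: f.result() for c, f in futures.items()}


def read_values(path):
    """Read back the single-column part file as a list of strings."""
    with open(path, newline="") as f:
        return [row[0] for row in csv.reader(f)]


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    for column, path in save_columns_in_parallel(base, HEADER, write_column_locally).items():
        print(column, path, read_values(path))
```

The columns are still enumerated on the driver, but the writes are submitted concurrently rather than waiting for each one to finish. With Spark, each column would still be its own job, so calling `df.cache()` before the writes may help avoid re-reading the input once per column; whether this is actually faster depends on the cluster and the data.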

Can someone help?

Thanks in advance :)