How to save each part file of a single object into separate directories

I have saved an RDD as four part files under a single directory.

However, my use case requires that each part file of this data set be saved in its own separate directory.

Re: How to save each part file of a single object into separate directories

@TIRUPATI CHELLARAPU

I think this should be possible using rdd.foreachPartition: inside the function you pass to it, you could write each partition to its own file or directory.
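A rough sketch of that idea in Python, not a tested Spark job. It swaps in mapPartitionsWithIndex for foreachPartition, because the former also passes the partition index, which is handy for naming the output. The function name write_partition, the partition_N directory names, and the part-NNNNN file names are made up for illustration. It also assumes the executors can write to base_dir (local mode or a shared filesystem); on a real cluster you would likely need an HDFS client instead of plain file I/O.

```python
from pathlib import Path


def write_partition(index, records, base_dir):
    """Write one partition's records to <base_dir>/partition_<index>/part-<index>.

    Takes (index, iterator) like the function expected by
    rdd.mapPartitionsWithIndex, and yields the path that was written.
    Directory and file names here are hypothetical.
    """
    target_dir = Path(base_dir) / f"partition_{index}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"part-{index:05d}"
    with target.open("w", encoding="utf-8") as out:
        for record in records:
            # One record per line; adjust serialization to your data.
            out.write(f"{record}\n")
    yield str(target)


# Hypothetical Spark usage (not executed here; needs pyspark and an existing `rdd`):
#   from functools import partial
#   writer = partial(write_partition, base_dir="/tmp/out")
#   written_paths = rdd.mapPartitionsWithIndex(writer).collect()
```

Because write_partition is a generator, nothing is written until Spark (or your own code) iterates over its result; the collect() call in the commented usage is what triggers the writes.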

A similar solution is described here:

https://stackoverflow.com/questions/30338213/writing-rdd-partitions-to-individual-parquet-files-in-i...

As a simpler alternative, they also suggest:

df.write.partitionBy("year", "month", "day").parquet("/path/to/output")

which creates a directory structure based on the partition columns of the DataFrame, with one subdirectory per distinct value.
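To make that layout concrete, here is a small plain-Python sketch (no Spark needed) of the column=value directory naming that partitionBy produces. The helper partition_dir and the sample values 2017, 5 and 3 are made up for illustration; the actual directories depend on the values present in your data.

```python
from pathlib import PurePosixPath


def partition_dir(base, **partition_values):
    """Build the column=value directory for one combination of partition values."""
    path = PurePosixPath(base)
    for column, value in partition_values.items():
        path = path / f"{column}={value}"
    return str(path)


# Hypothetical row with year=2017, month=5, day=3.
print(partition_dir("/path/to/output", year=2017, month=5, day=3))
# prints /path/to/output/year=2017/month=5/day=3
```

Note that the directories follow the data values, not the part files, so this only helps if some column already distinguishes the groups you want to separate.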

HTH

Re: How to save each part file of a single object into separate directories

Thank you for the information, but what I need is the following.

I already have a directory that contains four part files, for example:

merchant_table (main dir)

p00000

p00001

p00002

p00003

I need to save the above as four separate directories, each containing one of the part files, like below:

merchant_table1 (dir1)

p00000

merchant_table1 (dir2)

p00001

merchant_table1 (dir3)

p00002

merchant_table1 (dir4)

p00003
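Since the part files already exist, one possible approach is to skip Spark and simply copy each file into its own directory. The sketch below assumes the files sit on a local (or locally mounted) filesystem and that the target directories are numbered merchant_table1 through merchant_table4; both are assumptions, and the function name split_part_files is made up for illustration. For data on HDFS, the same steps could be done with `hdfs dfs -mkdir -p` and `hdfs dfs -cp` instead.

```python
import shutil
from pathlib import Path


def split_part_files(src_dir, dest_parent, prefix="merchant_table"):
    """Copy each part file in src_dir into its own directory under dest_parent.

    Returns the copied file paths in part-file order. The numbered directory
    names (<prefix>1, <prefix>2, ...) are an assumed naming scheme.
    """
    src = Path(src_dir)
    parent = Path(dest_parent)
    # Keep only files whose names start with "p" (p00000, part-00000, ...),
    # which skips marker files such as _SUCCESS and hidden .crc files.
    part_files = sorted(
        p for p in src.iterdir() if p.is_file() and p.name.startswith("p")
    )
    copied = []
    for number, part in enumerate(part_files, start=1):
        target_dir = parent / f"{prefix}{number}"
        target_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(part, target_dir / part.name)))
    return copied
```

For example, split_part_files("merchant_table", "output") would place p00000 in output/merchant_table1, p00001 in output/merchant_table2, and so on, leaving the original directory untouched.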