@TIRUPATI CHELLARAPU
I think this should be possible with rdd.foreachPartition: the function you pass is called once per partition, so it could write each partition to its own file or directory.
A similar solution is described here:
https://stackoverflow.com/questions/30338213/writing-rdd-partitions-to-individual-parquet-files-in-i...
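A rough sketch of the per-partition idea in PySpark, under a few assumptions: the RDD holds tuples, CSV is an acceptable output format, and `write_partition`, `save_partitions` and the `part-00000.csv` naming are hypothetical names made up for illustration. It uses mapPartitionsWithIndex rather than foreachPartition, because foreachPartition does not hand the partition index to your function and the index is a convenient way to name the files.

```python
import csv
import os


def write_partition(base_dir, index, rows):
    """Write one partition's rows to base_dir/part-<index>.csv and return the path."""
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, "part-%05d.csv" % index)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)
    return path


def save_partitions(rdd, base_dir):
    """Call write_partition once for every partition of an RDD of tuples."""
    # mapPartitionsWithIndex passes (partition index, iterator over rows).
    # collect() triggers the job and returns the list of written paths.
    return rdd.mapPartitionsWithIndex(
        lambda index, rows: [write_partition(base_dir, index, rows)]
    ).collect()
```

Note that each file is written by whichever executor processes the partition, so on a cluster base_dir would need to be a location every executor can reach (for example a shared mount); in local mode a plain local directory should be enough.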
As a simpler alternative, they also suggest
df.write.partitionBy("year", "month", "day").parquet("/path/to/output")
which creates a directory structure with one level per partition column of the DataFrame.
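To make the resulting layout concrete, here is a small sketch. `write_by_date` and `expected_partition_dir` are hypothetical helper names, and the DataFrame is assumed to already have year, month and day columns. Spark names each directory level column=value (Hive-style partitioning), which is what the second helper reproduces.

```python
import posixpath


def write_by_date(df, output_path):
    """Write a Spark DataFrame as Parquet, one directory level per date column."""
    # Assumes df has columns named year, month and day.
    df.write.partitionBy("year", "month", "day").parquet(output_path)


def expected_partition_dir(output_path, year, month, day):
    """Return the column=value directory that rows with this date end up in."""
    return posixpath.join(
        output_path,
        "year=%s" % year,
        "month=%s" % month,
        "day=%s" % day,
    )
```

For example, expected_partition_dir("/path/to/output", 2020, 1, 5) gives /path/to/output/year=2020/month=1/day=5, and the Parquet part files for that date are written inside that directory.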
HTH