I'm trying to write an ordered DataFrame/Dataset to multiple CSV files while preserving both the global and the local sort order.
I have the following code:
df.orderBy("date").coalesce(100).write.csv(...)
Does this code guarantee that:
- I will have 100 output files
- Each individual CSV file is locally sorted, i.e. its rows are in ascending order of the "date" column
- The files are globally sorted, i.e. every "date" in CSV part-0000 is lower than those in CSV part-0001, every "date" in CSV part-0001 is lower than those in CSV part-0002, and so on
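
To make the three properties concrete, here is a minimal sketch of how I would check them on an output directory after the job has run. It is plain Python, not Spark, and it rests on several assumptions of mine: the files have no header, "date" is the first column, the dates are ISO-8601 strings (so string order matches chronological order), and the output files are named part-<zero-padded index>-... so that sorting the names gives the partition order. `check_output` and `read_dates` are hypothetical helper names. Unlike the strict wording above, it accepts equal dates on either side of a file boundary.

```python
import csv
from pathlib import Path


def read_dates(path, column=0):
    """Return the values of one column of a headerless CSV file, in file order."""
    with open(path, newline="") as handle:
        return [row[column] for row in csv.reader(handle) if row]


def check_output(directory, expected_files=100, column=0):
    """Return (file_count_ok, locally_sorted, globally_sorted) for a directory."""
    # Assumption: sorting the file names lexicographically yields the
    # partition order, because the index in the name is zero-padded.
    parts = sorted(Path(directory).glob("part-*"))
    columns = [read_dates(part, column) for part in parts]

    # Property 1: the expected number of output files.
    file_count_ok = len(parts) == expected_files

    # Property 2: each file is sorted ascending on its own.
    locally_sorted = all(values == sorted(values) for values in columns)

    # Property 3: no value in a file is greater than any value in the next
    # non-empty file (empty files are skipped).
    non_empty = [values for values in columns if values]
    globally_sorted = all(
        max(prev) <= min(nxt) for prev, nxt in zip(non_empty, non_empty[1:])
    )
    return file_count_ok, locally_sorted, globally_sorted
```

A check like this only shows what happened in one particular run; it cannot prove that the code guarantees the ordering, which is what I am asking about.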