Hi dear experts!
I have a challenge: I have an unsorted set of CSV files and want to sort the output and distribute ranges of values across many files.
Example input file:
1 2 7 3 2 4 5 8 6
As output I would like to have a few files, like:
1 2 2 3
6 7 8
Could someone recommend a Hive function that could do this?
You can use SORT BY in Hive to get this output.
SORT BY: it runs multiple reducers and produces multiple sorted files, one per reducer. Each file is sorted, but the output as a whole is not.
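Note that SORT BY alone does not control which rows end up in which file, so the value ranges of the files can overlap. A minimal sketch of one way to get one range per file is to combine it with DISTRIBUTE BY. The table name `numbers`, the column `val`, the output path, and the range boundaries (3 and 5) are all assumptions made up for this example, and the reducer setting shown is the MapReduce one:

```sql
-- Hypothetical table: numbers(val INT), loaded from the CSV files.
-- Assumption: the range boundaries (3 and 5) are chosen by hand for this example.
SET mapreduce.job.reduces=3;   -- ask for three reducers, one per range

INSERT OVERWRITE DIRECTORY '/tmp/sorted_ranges'   -- hypothetical output path
SELECT val
FROM numbers
DISTRIBUTE BY CASE
                WHEN val <= 3 THEN 0
                WHEN val <= 5 THEN 1
                ELSE 2
              END              -- rows with the same bucket id go to the same reducer
SORT BY val;                   -- each reducer sorts only the rows it received
```

With the sample input above, the intent is one file holding 1 2 2 3, one holding 4 5, and one holding 6 7 8. Which bucket lands on which reducer depends on how Hive hashes the DISTRIBUTE BY expression, so it is worth checking the output files on your own cluster. If you need a single, fully sorted result instead, ORDER BY does that, at the cost of a single reducer.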
Hope this helps.
To read about SORT BY vs. ORDER BY vs. DISTRIBUTE BY vs. CLUSTER BY: