How do I merge reduce task output to produce a final output file?

I am using the Hive editor.

In a MapReduce job, the map task output is combined, and the reduce phase takes that output as its input. By default, each reducer generates a separate output file, like part-0000, and this output is stored in HDFS.

  • 1) After the reduce tasks complete, which process combines the reduce-phase output to produce the final output in the Hive editor, and where does it run?
1 REPLY

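You can set hive.merge.mapredfiles to true. Hive will then run an additional job after the original one to merge the small reducer output files into larger ones. Note that the merge only kicks in when the average output file size is below hive.merge.smallfiles.avgsize, and the result is not necessarily a single file: the merged file size is governed by hive.merge.size.per.task.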

https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration
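
As a rough sketch, the settings could be applied per session in the Hive editor before running the query. The numeric values below are what I believe to be the defaults, so check them against your Hive version; the output directory, table name, and column name are hypothetical placeholders.

```sql
-- Merge small files at the end of a map-reduce job (off by default).
SET hive.merge.mapredfiles=true;
-- Merge small files at the end of a map-only job.
SET hive.merge.mapfiles=true;
-- Launch the extra merge job only when the average output file size
-- is below this many bytes (assumed default).
SET hive.merge.smallfiles.avgsize=16000000;
-- Target size, in bytes, of each merged file (assumed default).
SET hive.merge.size.per.task=256000000;

-- Hypothetical query: the path, table, and column are placeholders.
INSERT OVERWRITE DIRECTORY '/tmp/merged_output'
SELECT col, COUNT(*) FROM example_table GROUP BY col;
```

If the output is small enough, this should leave fewer, larger files in the target directory instead of one file per reducer.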
