For a given MR job, I need to produce two output files. One file should be the output of the Mapper. The other should be the output of the Reducer (which is just an aggregation of the Mapper output above).
Can both the mapper output and the reducer output be written in a single job?
As of now I am using 2 jobs with the job chaining mechanism.
In Job 1 (Mapper phase only), each output row contains 20 fields and has to be written to HDFS (as file1). In Job 2 (Mapper and Reducer phases), the Mapper takes the Job 1 output as input, drops a few fields to bring each row into a standard format (only 10 fields), and passes it to the Reducer, which writes file2.
I need both file1 and file2 in HDFS. My question is: in the Job 1 mapper, can I write the data to HDFS as file1, then modify the same data and pass it to the reducer, so that the reducer generates file2?
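To make the intent concrete, here is a rough sketch of the data flow I am after, written in plain Java with no Hadoop classes. It is only an illustration under assumptions: the names (`SinglePassSketch`, `trim`, `run`), the comma delimiter, keeping the first 10 fields, and counting rows per first field as the "aggregation" are all made up for the example and are not my real job logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SinglePassSketch {
    // Assumption: the standard format keeps the first 10 fields.
    // The real job would keep whichever 10 fields it actually needs.
    static final int KEPT_FIELDS = 10;

    /** Holds the two outputs that the single job should produce. */
    static class Result {
        // file1: the full 20-field rows, i.e. the mapper output.
        final List<String> file1 = new ArrayList<>();
        // file2: the aggregated rows, i.e. the reducer output.
        final Map<String, Integer> file2 = new TreeMap<>();
    }

    // Mapper-side step: cut a full row down to the standard format.
    static String trim(String fullRow) {
        String[] fields = fullRow.split(",", -1);
        int keep = Math.min(KEPT_FIELDS, fields.length);
        return String.join(",", Arrays.copyOfRange(fields, 0, keep));
    }

    // One pass over the rows: write each full row to file1, then hand the
    // trimmed row to the aggregation that stands in for the reducer.
    static Result run(List<String> fullRows) {
        Result result = new Result();
        for (String row : fullRows) {
            result.file1.add(row);                       // side output (file1)
            String trimmed = trim(row);                  // 20 fields -> 10 fields
            String key = trimmed.split(",", -1)[0];      // assumed grouping key
            result.file2.merge(key, 1, Integer::sum);    // assumed aggregation: count per key
        }
        return result;
    }

    public static void main(String[] args) {
        String row = "k1,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t"; // 20 fields
        Result result = run(Arrays.asList(row, row));
        System.out.println("file1 rows: " + result.file1);
        System.out.println("file2 aggregates: " + result.file2);
    }
}
```

In other words, the mapper would emit the same record twice: once unchanged to file1, and once trimmed as the input to the reducer that produces file2.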