
How to implement ChainMapper and ChainReducer using Oozie?



I am using Hadoop 2.6.0 and Oozie 4.1.0. I have to use ChainMapper and ChainReducer in my MapReduce jobs.


When I build the MapReduce job, I package only my Mapper and Reducer classes into the .jar, since Oozie takes care of the driver functionality. I looked for an example of setting up ChainMapper and ChainReducer in an Oozie workflow but could not find one. Could you please point me to documentation or a post that explains how to do this?



Cloudera Employee

Re: How to implement ChainMapper and ChainReducer using Oozie?

ChainMapper and ChainReducer are extensions of the normal Mapper and Reducer. Oozie only needs to reference the one job in which you use the Chain* classes; the rest of the setup happens in your code.

Check the ChainMapper and/or ChainReducer API docs for the code, then use them in Oozie as you would a normal mapper and reducer.




Re: How to implement ChainMapper and ChainReducer using Oozie?

Technically, Oozie requires a set of XML properties describing your job, and can then run it the way you want.

For your MR job, you can get this XML descriptor from the job.xml file of a previously submitted job, or by having your driver call writeXml(System.out) on its JobConf instead of submitting the job. You can then extract the relevant properties from the printed XML and use them in Oozie as the full description of what your driver sets up.
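To illustrate the extraction step, here is a minimal sketch in plain Java (no Hadoop dependency) that reads a Hadoop-style configuration XML, such as a saved job.xml, and prints the properties whose names start with a given prefix as `<property>` elements, the same name/value shape an Oozie action's `<configuration>` block uses. The class name `JobXmlProperties`, the sample property names, and the prefix are all made up for the example; which properties your chain setup actually writes is something to check in your own job.xml.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop or Oozie.
public class JobXmlProperties {

    /** Reads name/value pairs whose name starts with the given prefix. */
    public static Map<String, String> read(InputStream in, String prefix) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = text(p, "name");
            if (name != null && name.startsWith(prefix)) {
                String value = text(p, "value");
                result.put(name, value == null ? "" : value);
            }
        }
        return result;
    }

    /** Returns the trimmed text of the first child element with this tag, or null. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Renders the pairs as <property> elements for pasting into a workflow. */
    public static String toPropertyXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("<property>\n")
              .append("  <name>").append(escape(e.getKey())).append("</name>\n")
              .append("  <value>").append(escape(e.getValue())).append("</value>\n")
              .append("</property>\n");
        }
        return sb.toString();
    }

    /** Escapes the characters that are unsafe inside XML element text. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws Exception {
        // Invented sample input; real job.xml files contain many more properties.
        String sample =
            "<configuration>"
          + "<property><name>example.chain.mapper.count</name><value>2</value></property>"
          + "<property><name>example.unrelated</name><value>x</value></property>"
          + "</configuration>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.print(toPropertyXml(read(in, "example.chain.")));
    }
}
```

To run it against a real descriptor, you would swap the sample string for a `FileInputStream` over your job.xml and pass whatever prefix matches the properties your driver sets.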

This is a fairly lengthy process, which Oozie recognises, so it offers a better approach: you write a little more Java code that runs your configuration logic (the driver; in your case, the main class that sets up the ChainMapper/ChainReducer descriptions) directly as part of an MR action. I passed this info along in a previous thread of yours, but here it is again: