07-01-2016
04:16 AM
MultipleInputs was designed for this kind of thing, but your requirement is unusual in that you need to process the same input twice, with different parameters each time. That seems a bit redundant to me, since you could apply both parameter sets in a single task run instead of paying twice the I/O cost. If you do need two passes, I believe you can solve your identifier problem by writing your own InputFormat that wraps the existing InputFormat and generates a special type of InputSplit (a wrapper over the regular FileSplit class). Each of these input splits carries your identifier as an extra field. On the map side, you can cast the result of context.getInputSplit() to your wrapper type, read the identifier, and use it to differentiate the input.
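To make the wrapper idea a bit more concrete, here is a rough sketch of the pattern in plain Java. It is deliberately not written against the Hadoop classes: `SimpleFileSplit`, `TaggedSplit`, `TaggingSplitter`, the tag values and the file path are all hypothetical stand-ins so the example runs with only the JDK. In a real job I would expect the wrapper to extend `org.apache.hadoop.mapreduce.InputSplit` and implement `Writable` (which uses the same `write(DataOutput)` / `readFields(DataInput)` pair shown here), and the mapper to cast `context.getInputSplit()` to the wrapper type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Hadoop's FileSplit: a path plus a byte range.
class SimpleFileSplit {
    final String path;
    final long start;
    final long length;

    SimpleFileSplit(String path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }
}

// Wrapper split that carries the extra identifier alongside the original split.
class TaggedSplit {
    private SimpleFileSplit delegate;
    private String tag;

    TaggedSplit() { }  // no-arg constructor so the split can be rebuilt from bytes

    TaggedSplit(SimpleFileSplit delegate, String tag) {
        this.delegate = delegate;
        this.tag = tag;
    }

    String getTag() { return tag; }
    SimpleFileSplit getDelegate() { return delegate; }

    // The identifier must be serialized with the split, otherwise it is lost
    // when the split is shipped to the task.
    void write(DataOutput out) throws IOException {
        out.writeUTF(tag);
        out.writeUTF(delegate.path);
        out.writeLong(delegate.start);
        out.writeLong(delegate.length);
    }

    void readFields(DataInput in) throws IOException {
        tag = in.readUTF();
        String path = in.readUTF();
        long start = in.readLong();
        long length = in.readLong();
        delegate = new SimpleFileSplit(path, start, length);
    }
}

// Plays the role of the wrapping InputFormat's getSplits(): every base split
// is emitted once per tag, so the same input is scheduled once per parameter set.
class TaggingSplitter {
    static List<TaggedSplit> tagSplits(List<SimpleFileSplit> base, List<String> tags) {
        List<TaggedSplit> result = new ArrayList<>();
        for (String tag : tags) {
            for (SimpleFileSplit split : base) {
                result.add(new TaggedSplit(split, tag));
            }
        }
        return result;
    }
}

class TaggedSplitDemo {
    // Simulates shipping a split to a task: serialize, then rebuild a fresh copy.
    static TaggedSplit roundTrip(TaggedSplit split) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            split.write(new DataOutputStream(bytes));
            TaggedSplit copy = new TaggedSplit();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<SimpleFileSplit> base =
                Arrays.asList(new SimpleFileSplit("/data/in/part-0", 0L, 128L));
        for (TaggedSplit split : TaggingSplitter.tagSplits(base, Arrays.asList("paramsA", "paramsB"))) {
            // "Map side": read the tag back from the split the task received.
            TaggedSplit seen = roundTrip(split);
            System.out.println(seen.getTag() + " -> " + seen.getDelegate().path);
        }
    }
}
```

The part that is easy to miss is the serialization: if the identifier is not written and read back together with the wrapped split, the map side will only ever see an empty field.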