Member since 07-01-2016 · 3 Posts · 0 Kudos Received · 0 Solutions
07-10-2016 03:42 AM
I'm using Hadoop 2.7.1. I just checked the files: one of them is 4.2 GB, another comes close at 3.6 GB, and the rest are below 3 GB. If it is this bug, maybe I could upgrade to 2.7.2? Would that solve it?
07-09-2016 01:35 PM
My program was running fine for smaller inputs, but when I increase the size of the input, line 210 (context.nextKeyValue();) throws an IndexOutOfBoundsException. Below is the setup method of the mapper. I call nextKeyValue() there once because the first line of each file is a header; splitting of files is disabled because of those headers. Does this have to do with memory? How can I solve it?

@Override
protected void setup(Context context) throws IOException, InterruptedException
{
    Configuration conf = context.getConfiguration();
    DupleSplit fileSplit = (DupleSplit) context.getInputSplit();

    // First line is a header. It indicates the first digit of the solution.
    context.nextKeyValue();    // <---- LINE 210

    URI[] uris = context.getCacheFiles();
    int num_of_colors = Integer.parseInt(conf.get("num_of_colors"));
    int order = fileSplit.get_order();
    int first_digit = Integer.parseInt(context.getCurrentValue().toString());

    //perm_path = conf.get(Integer.toString(num_of_colors - order - 1));
    int offset = Integer.parseInt(conf.get(Integer.toString(num_of_colors - order - 1)));
    uri = uris[offset];
    Path perm_path = new Path(uri.getPath());
    perm_name = perm_path.getName().toString();

    String pair_variables = "";
    for (int i = 1; i <= num_of_colors; i++)
        pair_variables += "X_" + i + "_" + (num_of_colors - order) + "\t";
    for (int i = 1; i < num_of_colors; i++)
        pair_variables += "X_" + i + "_" + (num_of_colors - order - first_digit) + "\t";
    pair_variables += "X_" + num_of_colors + "_" + (num_of_colors - order - first_digit);

    context.write(new Text(pair_variables), null);
}

Here's the error log:

Error: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Buffer.java:559)
at java.nio.ByteBuffer.get(ByteBuffer.java:668)
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:279)
at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:168)
at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:59)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:144)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:184)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at produce_data_hdfs$input_mapper.setup(produce_data_hdfs.java:210)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
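Regardless of what is causing the HDFS-level read failure, one defensive change to the header read seems worth making: nextKeyValue() returns a boolean, and the snippet above never checks it, so an empty or truncated split would also fail later at getCurrentValue(). A minimal sketch of the guard, using the same variable names as the snippet (it does not address the RemoteBlockReader2 error itself):

if (!context.nextKeyValue()) {
    // No header line in this split; fail with a clear message instead of a downstream exception.
    throw new IOException("Expected a header line as the first record of this split");
}
int first_digit = Integer.parseInt(context.getCurrentValue().toString());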
Tags: 2.7.1
07-01-2016 04:09 AM
I know that you can spawn two mappers over the same file by calling the addInputPath function twice with the same path, but I'd like the file to be processed slightly differently each time. Specifically, I want each pass to use different parameters, which I pass through the Job class (configuration.set/get). When the files are different, I get the path/name of the file from the InputSplit via the context and use that to pick the right parameters, but now that the paths are the same I can't differentiate the mappers. Any thoughts? Each mapper runs as a different map task, but I have no idea whether any per-task information can be used here, and I don't know in what order the framework assigns input splits to map tasks; that could be useful. Alternatively, I could duplicate the file under a different name, but that would be a waste of resources.
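One way to tell the two passes apart without duplicating the file is a custom InputFormat whose getSplits() returns every split twice, with each copy carrying a pass index that the mapper reads in setup(). A minimal sketch, assuming a plain TextInputFormat-based job; TaggedTextInputFormat and TaggedFileSplit are hypothetical names introduced here, not the DupleSplit class from the other post:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class TaggedTextInputFormat extends TextInputFormat {

    /** A FileSplit plus a pass index (0 or 1), so the mapper knows which copy it is handling. */
    public static class TaggedFileSplit extends FileSplit {
        private int pass;

        public TaggedFileSplit() { }   // no-arg constructor needed for deserialization

        public TaggedFileSplit(FileSplit s, int pass) throws IOException {
            super(s.getPath(), s.getStart(), s.getLength(), s.getLocations());
            this.pass = pass;
        }

        public int getPass() { return pass; }

        @Override
        public void write(DataOutput out) throws IOException {
            super.write(out);
            out.writeInt(pass);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            super.readFields(in);
            pass = in.readInt();
        }
    }

    @Override
    public List<InputSplit> getSplits(JobContext job) throws IOException {
        // Return each underlying split twice, tagged with the pass number.
        List<InputSplit> doubled = new ArrayList<>();
        for (InputSplit s : super.getSplits(job)) {
            FileSplit fs = (FileSplit) s;
            doubled.add(new TaggedFileSplit(fs, 0));   // first pass over this split
            doubled.add(new TaggedFileSplit(fs, 1));   // second pass over this split
        }
        return doubled;
    }
}

The driver would then call job.setInputFormatClass(TaggedTextInputFormat.class), and the mapper's setup() would cast context.getInputSplit() to TaggedFileSplit and branch on getPass() to pick the corresponding configuration values. If the files must stay unsplit because of header lines, isSplitable() would also be overridden to return false, as in the other job.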