Map reduce Java programming reading the 1st column

I have two columns, id (int) and a string, and I need to find duplicate ids using MapReduce. I am not allowed to read the complete line. What should the input key and value be, and what changes do I have to make in the code so that I read only the 1st column?

Re: Map reduce Java programming reading the 1st column

You need to implement a custom RecordReader to achieve this. Here's the source of the default full-line record reader, which you can use as a starting point: https://github.com/cloudera/hadoop-common/blob/cdh5.8.0-release/hadoop-mapreduce-project/hadoop-mapr...
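
To make the idea concrete, here is a minimal sketch of the core logic in plain Java, with no Hadoop dependency so it can be run on its own. `FirstColumnReader`, `next()`, `currentId()` and `countIds()` are hypothetical names made up for illustration, and the comma delimiter is an assumption about the input. The reader keeps only the bytes of the first column and discards the rest of each line without buffering it. In a real job this logic would presumably go into `nextKeyValue()` of a class extending `org.apache.hadoop.mapreduce.RecordReader`, returned from a custom InputFormat's `createRecordReader()`. One reasonable choice for the mapper input would then be the id as key (for example an `IntWritable`) with a `NullWritable` value. Split-boundary handling, which the default line reader takes care of, is left out here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, NOT part of Hadoop: yields only the first column of each line.
class FirstColumnReader {
    private final InputStream in;
    private final char delimiter; // assumed column separator, e.g. ','
    private int currentId;

    FirstColumnReader(InputStream in, char delimiter) {
        this.in = in;
        this.delimiter = delimiter;
    }

    // Plays the role of RecordReader.nextKeyValue(): advance to the next
    // record and return false once the input is exhausted.
    boolean next() throws IOException {
        while (true) {
            int b = in.read();
            if (b == -1) {
                return false; // end of input
            }
            // Keep bytes only until the delimiter or the end of the line.
            StringBuilder id = new StringBuilder();
            while (b != -1 && b != delimiter && b != '\n') {
                if (b != '\r') {
                    id.append((char) b);
                }
                b = in.read();
            }
            // Discard the remaining columns byte by byte; they are never stored.
            if (b == delimiter) {
                do {
                    b = in.read();
                } while (b != -1 && b != '\n');
            }
            String text = id.toString().trim();
            if (!text.isEmpty()) {
                // Throws NumberFormatException if the first column is not an int.
                currentId = Integer.parseInt(text);
                return true;
            }
            // Blank line: loop around and try the next one.
        }
    }

    // Plays the role of RecordReader.getCurrentKey().
    int currentId() {
        return currentId;
    }

    // Stand-in for the map and reduce phases: count occurrences per id.
    static Map<Integer, Integer> countIds(InputStream in, char delimiter) {
        FirstColumnReader reader = new FirstColumnReader(in, delimiter);
        Map<Integer, Integer> counts = new TreeMap<>();
        try {
            while (reader.next()) {
                counts.merge(reader.currentId(), 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "1,apple\n2,banana\n1,cherry\n3,date\n";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        // Report every id that occurs more than once.
        countIds(in, ',').forEach((id, n) -> {
            if (n > 1) {
                System.out.println("duplicate id " + id + " (" + n + " occurrences)");
            }
        });
    }
}
```

In the MapReduce version the mapper would emit (id, 1) for each record and the reducer would sum the counts and output the ids whose total is greater than 1. When reading from a file, wrapping the stream in a `BufferedInputStream` avoids one system call per byte.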