
MapReduce Java programming: reading only the 1st column

Contributor

I have two columns, id (int) and a string, and I need to find duplicate ids using MapReduce. I am not allowed to read the complete line. What should the input key and value be, and what changes do I have to make in the code so that only the 1st column is read?

1 REPLY

Master Guru
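You need to implement a custom RecordReader to achieve this. The default reader hands the whole line to the mapper as the value, so the column has to be cut out inside the reader itself. Here's the source of the default full-line record reader, which you can use as a starting point: https://github.com/cloudera/hadoop-common/blob/cdh5.8.0-release/hadoop-mapreduce-project/hadoop-mapr...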
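To make the idea concrete, here is a minimal sketch of the parsing logic such a reader could use, written in plain Java without any Hadoop dependencies so it can be run on its own. The class names `FirstColumnReader` and `FirstColumnDemo`, the tab delimiter, and the ASCII ids are all assumptions for illustration, not part of Hadoop. It keeps only the bytes before the first delimiter and discards the rest of the line without ever building it into a string. Note that the remaining bytes still have to be scanned to find the line end. They just never reach the mapper.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: yields only the first column of each line. */
class FirstColumnReader {
    private final InputStream in;
    private final int delimiter;

    FirstColumnReader(InputStream in, char delimiter) {
        this.in = new BufferedInputStream(in);
        this.delimiter = delimiter;
    }

    /** Returns the first column of the next line, or null at end of input. */
    String nextId() throws IOException {
        int b = in.read();
        if (b == -1) {
            return null; // no more lines
        }
        StringBuilder id = new StringBuilder();
        // Keep bytes up to the first delimiter or line end (ASCII ids assumed).
        while (b != -1 && b != delimiter && b != '\n') {
            if (b != '\r') {
                id.append((char) b);
            }
            b = in.read();
        }
        // Skip the remainder of the line without storing it.
        while (b != -1 && b != '\n') {
            b = in.read();
        }
        return id.toString();
    }
}

class FirstColumnDemo {
    /** Counts how often each id occurs, looking at the first column only. */
    static Map<String, Integer> countIds(InputStream in) throws IOException {
        FirstColumnReader reader = new FirstColumnReader(in, '\t');
        Map<String, Integer> counts = new TreeMap<>();
        String id;
        while ((id = reader.nextId()) != null) {
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "1\tapple\n2\tbanana\n1\tcherry\n"
                .getBytes(StandardCharsets.US_ASCII);
        Map<String, Integer> counts = countIds(new ByteArrayInputStream(data));
        // Report only the ids that occur more than once.
        counts.forEach((id, n) -> {
            if (n > 1) {
                System.out.println("id " + id + " appears " + n + " times");
            }
        });
    }
}
```

In an actual custom RecordReader, logic like `nextId()` would presumably live in `nextKeyValue()`. One plausible choice for the input types is the id as the key (for example `IntWritable` or `Text`) with `NullWritable` as the value, so the mapper can emit `(id, 1)` and the reducer can flag any id whose count exceeds 1. Split-boundary handling, which the default line reader takes care of, is left out of this sketch and would still need to be carried over.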