Member since
10-12-2014
10-26-2014 10:17 AM
Gautam, you are right. The TreeSet is treated like any other collection object within MR. The following Mapper code worked for me:

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.TreeSet;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class KPWordCountMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        int count = 0;

        @Override
        public void map(LongWritable inputKey, Text inputVal, Context context)
                throws IOException, InterruptedException {
            // A fresh TreeSet per call: sorts and de-duplicates the words of this one line.
            TreeSet<String> ts = new TreeSet<>();
            String line = inputVal.toString();
            String[] splits = line.split("\\W+");
            for (String outputKey : splits) {
                // Skip the empty token that split() yields when the line starts with a non-word character.
                if (outputKey.length() > 0) {
                    ts.add(outputKey);
                }
            }
            // Emit each distinct word once, in sorted order, with no value.
            Iterator<String> itr = ts.iterator();
            while (itr.hasNext()) {
                context.write(new Text(itr.next()), NullWritable.get());
            }
        }
    }
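If you want to check what the mapper emits for a single line without starting a job, the TreeSet part can be tried with plain JDK classes. This is only a rough sketch: the class name `LineTokens` and the method `uniqueSortedTokens` are made up for illustration and are not part of Hadoop or of the mapper above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper (not from the original post): mirrors the per-line
// logic of the map() method using only the JDK, so it runs without Hadoop.
class LineTokens {

    // Returns the distinct words of one line in natural String order,
    // i.e. the keys the mapper would write for that line.
    static List<String> uniqueSortedTokens(String line) {
        TreeSet<String> ts = new TreeSet<>();
        for (String token : line.split("\\W+")) {
            // split() yields an empty first token when the line starts
            // with a non-word character, so filter it out.
            if (token.length() > 0) {
                ts.add(token);
            }
        }
        return new ArrayList<>(ts);
    }

    public static void main(String[] args) {
        // prints [and, cat, end, hat, the]
        System.out.println(uniqueSortedTokens("the cat and the hat, the end"));
    }
}
```

Note that the set is rebuilt for every call, just as in the mapper, so duplicates are only removed within one line. A word that appears on several lines is still emitted once per line, and TreeSet ordering is case-sensitive, so "Zebra" sorts before "apple".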