Member since: 10-12-2014
Posts: 8
Kudos Received: 0
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1522 | 10-28-2014 07:58 PM
 | 2143 | 10-26-2014 10:17 AM
10-27-2014 03:54 PM
What is correct about Oozie? (Select one)

A. Oozie server runs in the namenode.
B. Oozie server must run as a process in the hadoop cluster.
C. Oozie server runs as a separate hadoop client process outside the hadoop cluster.
D. Oozie server can run in all the Datanodes but not in the namenode as a hadoop process.
Labels:
- Apache Hadoop
- Apache Oozie
10-26-2014 10:17 AM
Gautam, you are right. The TreeSet is treated like any other collection object within MapReduce. The following Mapper code worked for me:

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.TreeSet;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class KPWordCountMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    public void map(LongWritable inputKey, Text inputVal, Context context)
            throws IOException, InterruptedException {
        // Collect the unique words of the line; TreeSet dedups and sorts them.
        TreeSet<String> ts = new TreeSet<>();
        String line = inputVal.toString();
        String[] splits = line.split("\\W+");
        for (String outputKey : splits) {
            if (outputKey.length() > 0) {
                ts.add(outputKey);
            }
        }
        // Emit each unique word once, with a NullWritable value.
        Iterator<String> itr = ts.iterator();
        while (itr.hasNext()) {
            context.write(new Text(itr.next()), NullWritable.get());
        }
    }
}
```
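The per-record logic of that mapper (split on non-word characters, drop empty tokens, dedup and sort via TreeSet) can be exercised outside Hadoop with plain Java. A minimal sketch; the class and method names here are just for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TreeSetDemo {
    // Mirrors the mapper's per-record logic: split on non-word chars,
    // drop empty tokens, then let TreeSet dedup and sort the words.
    static List<String> uniqueSortedWords(String line) {
        TreeSet<String> ts = new TreeSet<>();
        for (String w : line.split("\\W+")) {
            if (w.length() > 0) {
                ts.add(w);
            }
        }
        return new ArrayList<>(ts);
    }

    public static void main(String[] args) {
        // Duplicates ("the") collapse, punctuation is dropped, output is sorted.
        System.out.println(uniqueSortedWords("the quick brown fox, the lazy dog"));
        // prints [brown, dog, fox, lazy, quick, the]
    }
}
```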
10-20-2014 02:50 PM
Gautam, here is the sample code I tried. I set the number of reducers to 0 to check the map output. I am getting the errors shown at the end.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.TreeSet;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class KPWordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable inputKey, Text inputVal, Context context)
            throws IOException, InterruptedException {
        TreeSet<String> ts = new TreeSet<>();
        String line = inputVal.toString();
        String[] splits = line.split("\\W+");
        for (String outputKey : splits) {
            if (outputKey.length() > 0) {
                ts.add(outputKey);
            }
        }
        Iterator<String> itr = ts.iterator();
        while (itr.hasNext()) {
            // System.out.println(itr.next());
            // Note: itr.next() is called twice per loop pass here.
            context.write(new Text(itr.next()), new IntWritable(itr.next().length()));
        }
    }
}
```

```
14/10/21 03:14:12 INFO input.FileInputFormat: Total input paths to process : 5
14/10/21 03:14:12 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/10/21 03:14:12 WARN snappy.LoadSnappy: Snappy native library not loaded
14/10/21 03:14:12 INFO mapred.JobClient: Running job: job_201410120206_0080
14/10/21 03:14:13 INFO mapred.JobClient: map 0% reduce 0%
14/10/21 03:14:28 INFO mapred.JobClient: Task Id : attempt_201410120206_0080_m_000000_0, Status : FAILED
java.util.NoSuchElementException
    at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1113)
    at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
    at KPWordCountMapper.map(KPWordCountMapper.java:51)
    at KPWordCountMapper.map(KPWordCountMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
```

Any ideas?
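For what it's worth, the stack trace points at the write call inside the loop: itr.next() is invoked twice per hasNext() check, so the second call can advance past the end of the set. The buggy and fixed iteration patterns can be reproduced in plain Java (class and method names here are made up for illustration):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeSet;

public class IteratorBugDemo {
    // Buggy pattern from the mapper: two next() calls guarded by one
    // hasNext(), so an odd-sized set throws NoSuchElementException.
    static boolean doubleNextThrows(TreeSet<String> ts) {
        Iterator<String> itr = ts.iterator();
        try {
            while (itr.hasNext()) {
                String key = itr.next();            // first next(): guarded
                int valueLen = itr.next().length(); // second next(): unguarded
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // Fixed pattern: call next() once per pass and reuse the value
    // for both the key and the length.
    static String fixedIteration(TreeSet<String> ts) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> itr = ts.iterator();
        while (itr.hasNext()) {
            String key = itr.next();
            sb.append(key).append(':').append(key.length()).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        TreeSet<String> ts = new TreeSet<>();
        ts.add("hadoop");
        ts.add("oozie");
        ts.add("pig");
        System.out.println(doubleNextThrows(ts)); // odd element count: true
        System.out.println(fixedIteration(ts));   // hadoop:6 oozie:5 pig:3
    }
}
```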