Created 05-09-2019 07:40 PM
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.mapreduce.api.WordCountDriver$WordCountMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2349)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:196)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.mapreduce.api.WordCountDriver$WordCountMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2255)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2347)
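For anyone puzzled by the $ in the missing class name: the JVM's binary name for a nested class joins the outer and inner class names with '$', so the error refers to the static mapper class nested inside the driver. A minimal, self-contained sketch (class names here are illustrative, not from the job):

```java
public class NestedNameDemo {
    public static class NestedMapper { }

    public static void main(String[] args) throws Exception {
        // The binary name of a nested class is Outer$Nested, which is why the
        // error message names com.mapreduce.api.WordCountDriver$WordCountMapper.
        System.out.println(NestedMapper.class.getName()); // NestedNameDemo$NestedMapper

        // Class.forName (and Hadoop's Configuration.getClassByName) must be
        // given exactly this binary name, and the class must be on the classpath.
        Class.forName("NestedNameDemo$NestedMapper");
    }
}
```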
I face the above issue when running the program below on a Cloudera instance in AWS.
package com.mapreduce.api;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int returnStatus = ToolRunner.run(new Configuration(), new WordCountDriver(), args);
        System.exit(returnStatus);
    }

    @Override
    public int run(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration configuration = new Configuration();
        String[] files = new GenericOptionsParser(configuration, args).getRemainingArgs();

        Job job = Job.getInstance(configuration, "WordCount MR");
        job.setJarByClass(WordCountDriver.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        FileInputFormat.addInputPath(job, new Path(files[0]));
        FileOutputFormat.setOutputPath(job, new Path(files[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] records = value.toString().split(" ");
            for (String record : records) {
                context.write(new Text(record), ONE);
            }
        }
    }

    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            for (IntWritable value : values) {
                count += value.get();
            }
            context.write(key, new IntWritable(count));
        }
    }
}
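As an aside, unrelated to the ClassNotFoundException: value.toString().split(" ") emits empty tokens for consecutive spaces, and those empty strings then get counted as "words". A minimal sketch of the difference, assuming nothing beyond the JDK:

```java
import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        String line = "hello  world"; // two spaces between the words

        // Splitting on a single space keeps an empty token between the spaces.
        System.out.println(Arrays.toString(line.split(" ")));    // [hello, , world]

        // Splitting on runs of whitespace avoids the empty token.
        System.out.println(Arrays.toString(line.split("\\s+"))); // [hello, world]
    }
}
```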
Created 05-09-2019 09:17 PM
Created 05-10-2019 10:37 AM
I used these commands, as specified in
https://www.cloudera.com/documentation/other/tutorial/CDH5/topics/ht_usage.html#topic_5_2
javac -cp /opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/* WordCountDriver.java -d work -Xlint
jar -cvf wordcount.jar -C work/ .
When I run jar tf wordcount.jar, it lists these entries:
META-INF/
META-INF/MANIFEST.MF
home/ec2-user/work/
home/ec2-user/work/WordCountDriver.java
home/ec2-user/work/com/
home/ec2-user/work/com/mapreduce/
home/ec2-user/work/com/mapreduce/api/
home/ec2-user/work/com/mapreduce/api/WordCountDriver$WordCountMapper.class
home/ec2-user/work/com/mapreduce/api/WordCountDriver$WordCountReducer.class
home/ec2-user/work/com/mapreduce/api/WordCountDriver.class
home/ec2-user/work/wordcount.jar
WordCountDriver.java
com/
com/mapreduce/
com/mapreduce/api/
com/mapreduce/api/WordCountDriver$WordCountMapper.class
com/mapreduce/api/WordCountDriver$WordCountReducer.class
com/mapreduce/api/WordCountDriver.class
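One thing the listing shows is that the class files appear both under home/ec2-user/work/… and at the jar root. The classloader can only resolve a class from an entry whose path, relative to the jar root, matches the class's binary name. Since a jar is just a zip archive, this lookup can be sketched with java.util.zip alone (the file and entry names below are illustrative):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class JarEntryDemo {
    // Maps a binary class name to the jar entry a classloader looks for.
    static String entryFor(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) throws IOException {
        File jar = File.createTempFile("wordcount-demo", ".jar");
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(jar))) {
            // Entry as produced by `jar -C work .` over a directory holding
            // only compiled classes:
            out.putNextEntry(new ZipEntry("com/mapreduce/api/WordCountDriver$WordCountMapper.class"));
            out.closeEntry();
            // Entry as produced when the archived directory still contains a
            // stale build tree:
            out.putNextEntry(new ZipEntry("home/ec2-user/work/com/mapreduce/api/WordCountDriver$WordCountMapper.class"));
            out.closeEntry();
        }
        String wanted = entryFor("com.mapreduce.api.WordCountDriver$WordCountMapper");
        try (ZipFile zf = new ZipFile(jar)) {
            // Only the root-relative entry satisfies the lookup; the prefixed
            // copy is invisible to the classloader.
            System.out.println(zf.getEntry(wanted) != null); // true
        }
        jar.delete();
    }
}
```

So a jar whose only copies of the classes sit under a home/ec2-user/work/ prefix would produce exactly this ClassNotFoundException; rebuilding from a directory that contains nothing but the com/ tree keeps the entries root-relative.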
Created 05-12-2019 07:47 PM