Created 12-09-2015 03:41 PM
I am trying to run a MapReduce job (using the new API) on Hadoop 2.7.1 from the command line.
I have followed the steps below.
javac -cp `hadoop classpath` MaxTemperatureWithCompression.java -d /Users/gangadharkadam/hadoopdata/build
jar -cvf MaxTemperatureWithCompression.jar /Users/gangadharkadam/hadoopdata/build
hadoop jar MaxTemperatureWithCompression.jar org.myorg.MaxTemperatureWithCompression user/ncdc/input /user/ncdc/output
There are no errors when compiling and creating the JAR file, but on execution I am getting the following error.
Error messages:
Exception in thread "main" java.lang.ClassNotFoundException: org.myorg.MaxTemperatureWithCompression
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:274)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Java code:
package org.myorg;

// Standard Java classes
import java.io.IOException;
import java.util.regex.Pattern;

// Extend the Configured class and implement the Tool interface
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.GenericOptionsParser;

// Send debugging messages from inside the mapper and reducer classes
import org.apache.log4j.Logger;

// Job class to create, configure, and run an instance of your MapReduce job
import org.apache.hadoop.mapreduce.Job;

// Extend the Mapper class with your own Map class and add your own processing instructions
import org.apache.hadoop.mapreduce.Mapper;

// Extend it to create and customize your own Reduce class
import org.apache.hadoop.mapreduce.Reducer;

// Path class to access files in HDFS
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FileSystem;

// Pass required paths using the FileInputFormat and FileOutputFormat classes
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Writable objects for writing, reading, and comparing values during map and reduce processing
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;

public class MaxTemperatureWithCompression extends Configured implements Tool {

    private static final Logger LOG = Logger.getLogger(MaxTemperatureWithCompression.class);

    // main method invokes ToolRunner to create an instance of MaxTemperatureWithCompression
    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new MaxTemperatureWithCompression(), args);
        System.exit(res);
    }

    // the run method configures the job
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperatureWithCompression <input path> <output path>");
            System.exit(-1);
        }

        Job job = Job.getInstance(getConf(), "MaxTemperatureWithCompression");

        // set the jar to use based on the class
        job.setJarByClass(MaxTemperatureWithCompression.class);

        // set the input and output paths
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // set the output key and value classes
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // set the compression format
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

        // set the mapper, combiner, and reducer classes
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    // mapper
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            if (line.charAt(87) == '+') {
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

    // reducer
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }
}
I checked the JAR file and the folder structure org/myorg/MaxTemperatureWithCompression.class is present. What could be the reason for this error? Any help in resolving this is highly appreciated. Thanks.
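(For reference, a listing along these lines is how I checked the JAR's contents; the grep filter is just one way to narrow the output:

jar -tf MaxTemperatureWithCompression.jar | grep MaxTemperature)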
Created 12-10-2015 03:35 AM
Hi @Gangadhar Kadam,
You've got everything almost right. When you build the JAR, you need to move into the build directory and then run the jar -cvf command, so that the "build" part of the directory hierarchy doesn't get put into the JAR. So, the following should work:
javac -cp `hadoop classpath` MaxTemperatureWithCompression.java -d /Users/gangadharkadam/hadoopdata/build
cd /Users/gangadharkadam/hadoopdata/build
jar -cvf MaxTemperatureWithCompression.jar .
hadoop jar MaxTemperatureWithCompression.jar org.myorg.MaxTemperatureWithCompression user/ncdc/input /user/ncdc/output
Try it out and compare the results of jar -tf MaxTemperatureWithCompression.jar. You should see:
[root@sandbox build]# jar -tf MaxTemperatureWithCompression.jar
META-INF/
META-INF/MANIFEST.MF
org/
org/myorg/
org/myorg/MaxTemperatureWithCompression.class
org/myorg/MaxTemperatureWithCompression$Map.class
org/myorg/MaxTemperatureWithCompression$Reduce.class
Whereas currently your steps result in:
[root@sandbox test]# jar -tf MaxTemperatureWithCompression.jar
META-INF/
META-INF/MANIFEST.MF
build/org/
build/org/myorg/
build/org/myorg/MaxTemperatureWithCompression.class
build/org/myorg/MaxTemperatureWithCompression$Map.class
build/org/myorg/MaxTemperatureWithCompression$Reduce.class
This works for me on my HDP 2.3 Sandbox.
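If you'd rather not cd into the build directory at all, jar's -C option should do the same thing by changing into the given directory before adding files. A sketch using the same build path as above:

jar -cvf MaxTemperatureWithCompression.jar -C /Users/gangadharkadam/hadoopdata/build .

Either way, the classes end up under org/myorg/ inside the JAR, which is what hadoop jar needs to resolve org.myorg.MaxTemperatureWithCompression.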
Created 12-10-2015 04:34 AM
Thanks Brandon. It worked.