HBase MapReduce job

Rising Star

Here is the driver code for my MapReduce job:

public int run(String[] args) throws Exception {
    // Expect exactly one argument: the input path.
    if (args.length != 1) {
        System.out.println("usage: [input]");
        System.exit(-1);
    }
    Job job = Job.getInstance(getConf());
    job.setJarByClass(HBaseDriver.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    // Write the mapper output directly to the "students" HBase table.
    job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "students");
    job.setInputFormatClass(StudentInputFormat.class);
    job.setMapperClass(HBaseStudentsMapper.class);
    job.setOutputFormatClass(TableOutputFormat.class);
    // Map-only job: no reducers.
    job.setNumReduceTasks(0);
    return job.waitForCompletion(true) ? 0 : 1;
}

It looks like the job fails on this line:

job.setOutputFormatClass(TableOutputFormat.class);

The message is:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/protobuf/generated/MasterProtos$MasterService$BlockingInterface

The interesting thing is that if I comment out this line and do some HBase work in the mapper instead, it works.

Moreover, I have the hbase-protocol library and I supply it with the -libjars option.

Here is how I run the MapReduce job:

hadoop jar mo.jar HBaseDriver -libjars hbase-client.jar,hbase-common.jar,hbase-protocol.jar,htrace-core-2.00.jar,hbase-server.jar input
1 ACCEPTED SOLUTION

Rising Star

Hi,

I managed to solve it.

Two things need to be done for this to work:

First, add a few jars to -libjars; second, add the HBase lib directory to the Hadoop classpath. Then it should work.

Here are the libjars needed:

hbase-client.jar,hbase-server.jar,hbase-protocol.jar,hbase-common.jar,htrace-core-2.00.jar

Pay attention to htrace-core-2.00.jar, which is a Cloudera jar, and I am using the Hortonworks sandbox 2.2.4.2-2.

You also need to add the HBase lib to hadoop-env.sh; otherwise you won't be able to run the main thread.

Thank you for your help, hope it helps!
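
For anyone assembling these two settings by hand, here is a minimal sketch of the shape each one takes: -libjars expects one comma-separated list of jar paths, while the classpath is a list joined by the platform path separator and may use a directory wildcard. The class HBaseJobArgs, its methods, and the /path/to/hbase/lib directory are hypothetical placeholders, not part of Hadoop or HBase; substitute the lib directory of your own install.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not a Hadoop or HBase API) that formats the two settings. */
final class HBaseJobArgs {

    /** Builds the value for -libjars: jar paths joined by commas. */
    static String libJars(String libDir, List<String> jarNames) {
        return jarNames.stream()
                .map(name -> libDir + "/" + name)
                .collect(Collectors.joining(","));
    }

    /** Appends libDir/* (every jar in that directory) to an existing classpath string. */
    static String appendToClasspath(String existing, String libDir) {
        String entry = libDir + "/*";
        if (existing == null || existing.isEmpty()) {
            return entry;
        }
        return existing + File.pathSeparator + entry;
    }

    public static void main(String[] args) {
        // Assumption: the HBase lib directory is passed as the first argument.
        String libDir = args.length > 0 ? args[0] : "/path/to/hbase/lib";
        // Jar names copied from the list in this post.
        List<String> jars = Arrays.asList(
                "hbase-client.jar", "hbase-server.jar", "hbase-protocol.jar",
                "hbase-common.jar", "htrace-core-2.00.jar");
        System.out.println("-libjars " + libJars(libDir, jars));
        System.out.println("export HADOOP_CLASSPATH="
                + appendToClasspath("$HADOOP_CLASSPATH", libDir));
    }
}
```

The first printed line is the option to pass on the hadoop jar command line; the second is the kind of line that could go into hadoop-env.sh, assuming the directory really contains the HBase jars.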


10 REPLIES

Master Mentor

@Avraha Zilberman have you looked at these examples? Can you show your pom file as well? You'd usually call a built-in HBase utility like so:

TableMapReduceUtil.initTableMapperJob(
  tableName,        // input HBase table name
  scan,             // Scan instance to control CF and attribute selection
  MyMapper.class,   // mapper
  null,             // mapper output key
  null,             // mapper output value
  job);
job.setOutputFormatClass(NullOutputFormat.class);

Rising Star

OK, I tried the examples and I get the same exception.

I think it means that the problem is with the deployment.

I don't have a pom; I just build a jar, put it alongside the relevant jars, and then run it with hadoop jar.

Rising Star

It is almost the same exception from the same package:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/protobuf/generated/ClientProtos

Master Mentor

Please add hbase-client and hadoop-client to your pom.

Master Mentor

@Avraha Zilberman run the job specifying only hbase-client and hadoop-client; if it still doesn't work, add hbase-server, but that's it.

Rising Star

Do you mean these files?

hbase-client.jar

hadoop-mapreduce-client-core.jar

Master Mentor

hbase-client and hadoop-client. If you're running version 2.7.1 of Hadoop and HBase 1.1.2 there are two distinct versions of both jars available. @Avraha Zilberman

avatar

Check out this URL:

https://hbase.apache.org/book.html#hbase.mapreduce.classpath

You need to add certain jars to HADOOP_CLASSPATH when you launch your MapReduce job using either the hadoop or the yarn command.
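
To confirm whether those jars are actually visible to the JVM that launches the job, a small diagnostic can help. This is a minimal sketch using only the JDK; ClasspathCheck and its methods are hypothetical names, not part of Hadoop or HBase, and the default class names are the ones from the exceptions in this thread.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of Hadoop or HBase. */
final class ClasspathCheck {

    /** Returns true if the named class is visible to this JVM's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or a note if unknown. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default to the classes named in the NoClassDefFoundError messages above.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hadoop.hbase.protobuf.generated.MasterProtos",
            "org.apache.hadoop.hbase.protobuf.generated.ClientProtos"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

If this is packaged into the job jar and launched the same way as the driver, a "(not on classpath)" result would suggest the launching JVM cannot see the HBase jars, which is the situation HADOOP_CLASSPATH is meant to address.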
