04-01-2014
02:59 AM
Hi, just to follow up on this: I have now solved the problem. There were two things that I needed to do:

1. In addition to adding oozie.libpath to my job.properties, I also needed to include oozie.use.system.libpath=true

2. Before, I was using the following code to add files to the DistributedCache:

    FileStatus[] status = fs.listStatus(new Path("/application/lib"));
    if (status != null) {
        for (int i = 0; i < status.length; ++i) {
            if (!status[i].isDir()) {
                DistributedCache.addFileToClassPath(status[i].getPath(), job.getConfiguration(), fs);
            }
        }
    }

This appeared to be causing a classpath issue because the paths being added had hdfs://hostname in front of the HDFS path. Now I am using the following to remove that prefix and add only the absolute HDFS path:

    FileStatus[] status = fs.listStatus(new Path("/application/lib"));
    if (status != null) {
        for (int i = 0; i < status.length; ++i) {
            if (!status[i].isDir()) {
                // Keep only the path component, dropping the hdfs://hostname prefix
                Path distCachePath = new Path(status[i].getPath().toUri().getPath());
                DistributedCache.addFileToClassPath(distCachePath, job.getConfiguration(), fs);
            }
        }
    }

Thank you to those who replied to my original query for pointing me in the right direction.

Andrew
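To illustrate what the toUri().getPath() step does, here is a minimal sketch that uses only java.net.URI, so it runs without Hadoop on the classpath. It assumes the Hadoop Path behaves like a plain URI for this purpose; the class name, host name, port, and jar name are hypothetical and not taken from the thread.

```java
import java.net.URI;

public class StripHdfsPrefix {
    // Returns only the path component of a URI string,
    // dropping the scheme and authority (e.g. "hdfs://hostname:8020").
    static String pathOnly(String qualified) {
        return URI.create(qualified).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical fully qualified location of a jar on HDFS
        String qualified = "hdfs://hostname:8020/application/lib/foo.jar";
        // prints "/application/lib/foo.jar"
        System.out.println(pathOnly(qualified));
    }
}
```

A path that is already absolute and has no scheme passes through unchanged, so the same call is safe to apply to every entry returned by the listing.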
03-25-2014
08:07 AM
Hi,

> Can you try adding "oozie.libpath=/application/lib" to your job.properties and see if that helps?

I added the "oozie.libpath" property to my job.properties file (in the format "hdfs://fqdn:8020/application/lib"), and the MapReduce job ran successfully.

However, I had previously added local copies of my jars to the HADOOP_CLASSPATH using the "MapReduce Service Environment Safety Valve" property in Cloudera Manager (i.e. HADOOP_CLASSPATH=/localpath/foo.jar:/localpath/bar.jar). Upon removing this, my MapReduce job failed with the same CNF error as before.

Any ideas why I can only get this working with the jars on both HDFS and the local file system? Surely I should only need the jars on HDFS?

Thanks for your help,
Andrew
03-21-2014
09:58 AM
Thank you for the responses.

> Where are you storing the jars that you need in the distributed cache?

The jars are stored on HDFS under "/application/lib".

> Are they in the "${oozie.wf.application.path}/lib" or another location in HDFS?

I'm using ${oozie.coord.application.path} in my job.properties file. If I use ${oozie.wf.application.path} instead, I get a CNF error because it can't find my ToolRunner class. Should I be using ${oozie.wf.application.path} but adding my ToolRunner class to the HADOOP_CLASSPATH?

> Also, you're not trying to access HBase from the MR job, are you?

I'm not using HBase. The MapReduce job ingests into Accumulo.

> Which class are you getting the CNF exception on?

The CNF exception is on a class within one of my jar files. It's the superclass of my Mapper.

Thanks,
Andrew
03-21-2014
09:11 AM
Hi,

I'm getting a ClassNotFoundException when running a MapReduce job using Oozie. I'm using CDH4.2.1.

The command I am using to start my job is:

    oozie job -oozie http://localhost:11000/oozie -config job.properties -DstartDateTime=`date +%FT%RZ`

I have multiple jars which I am adding to the classpath using DistributedCache.addArchiveToClassPath(). However, these jars do not appear to be on the classpath of my MapReduce job.

I have two workarounds:

1. Using the 'hadoop jar -libjars' command.
2. Packaging all the jars as a jar-with-dependencies.

I suspect that the second works because the jar-with-dependencies is getting added to the classpath by job.setJarByClass(). This would imply that there is a problem with the DistributedCache.

Does anyone have any ideas how I can get this working through Oozie with multiple jars?

Thanks,
Andrew
Labels:
- Apache Hadoop
- Apache Oozie
- MapReduce