YARN: Client config for remote submission

New Contributor

(cross posting this from the scm-user ML, which I find easier to use... sorry if this is bad form)

 

Hello folks,

I recently upgraded one of my clusters from MR1 to YARN.

In the past, I could download the client configs from CM onto a development machine and install them as a hadoop-conf alternative, which let me access the cluster remotely (i.e. from the dev machine).
 
However, I'm having trouble doing the same thing now that the YARN service is active.
 
I download the YARN client config and unzip it, and I notice there are some new files.
 
In particular, I see a file "mrapp-generated-classpath" which contains the content:
 
{{CDH_HADOOP_HOME}}/*:{{CDH_HADOOP_HOME}}/lib/*

 

I'm not too familiar with this syntax, but:
  * I cannot source this file in a shell
  * I am not sure where CDH_HADOOP_HOME is supposed to be defined.
 
 
In any event, in attempting to run the YARN example wordcount I get:
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/partition/InputSampler$Sampler
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2521)
    at java.lang.Class.getMethod0(Class.java:2764)
    at java.lang.Class.getMethod(Class.java:1653)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:60)
    at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:103)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:51)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.lib.partition.InputSampler$Sampler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more

 

 
which I expect is a classpath/env issue.
 
I tried manually exporting CDH_HADOOP_HOME and HADOOP_MAPRED_HOME, both set to /usr/lib/hadoop, but that didn't work either.
 
Anyone have any ideas?
 
My dev machine runs Ubuntu 12.10 (Quantal) with CDH 4.4.
 
Thanks
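
One way to check the classpath suspicion raised above is to look for the jar that should provide the missing class. The following is only a sketch and not part of the original post: find_jars_on_classpath is a hypothetical helper, and the jar prefix hadoop-mapreduce-client-core is an assumption about where InputSampler is packaged in a stock Hadoop 2 layout. It expands the "dir/*" wildcard entries that `hadoop classpath` prints and lists the matching jars that exist on disk.

```shell
#!/bin/sh
# Hypothetical helper: list jars on a colon-separated classpath whose file
# name starts with a given prefix. Wildcard entries ("dir/*") are expanded.
find_jars_on_classpath() {
  fj_classpath=$1
  fj_prefix=$2
  printf '%s\n' "$fj_classpath" | tr ':' '\n' | while IFS= read -r fj_entry; do
    case $fj_entry in
      *'/*')
        # Wildcard entry: check the jars in that directory.
        fj_dir=${fj_entry%'/*'}
        for fj_jar in "$fj_dir"/"$fj_prefix"*.jar; do
          if [ -f "$fj_jar" ]; then printf '%s\n' "$fj_jar"; fi
        done
        ;;
      *.jar)
        # Explicit jar entry: compare its base name with the prefix.
        case ${fj_entry##*/} in
          "$fj_prefix"*)
            if [ -f "$fj_entry" ]; then printf '%s\n' "$fj_entry"; fi
            ;;
        esac
        ;;
    esac
  done
}
```

Assuming the hadoop command is on the PATH, it could be called as `find_jars_on_classpath "$(hadoop classpath)" hadoop-mapreduce-client-core`. Empty output would suggest that the MapReduce libraries are not reachable from the client classpath.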
1 ACCEPTED SOLUTION

New Contributor

Posting Darren Lo's response from the ML:

 

Hi Joe,

 
CM had a bug where we used {{VARIABLE}} placeholders in some files that could be downloaded, as you discovered. This is not normal shell script syntax. CM commands that deal with these files perform the substitution on the destination machine, but nothing does this for downloaded client configs. This is fixed in CDH5 beta 1 (OPSAPS-16086).
 
To get around this, you can do one of the following:
1) Manually replace the {{CDH_HADOOP_HOME}} in hadoop-env.sh with the correct value for your machine. For YARN installed by packages to the default location, the HADOOP_MAPRED_HOME should be /usr/lib/hadoop-mapreduce/.
2) Add the target machine as a CM-managed host, assign gateway roles for all services that need client config (YARN in your case), and run the Deploy Client Configuration command, which will correctly update /etc/hadoop/conf and substitute the variables for you.
3) Wait for CM 5, which will fall back to standard environment variables or the default package install location if the {{VARIABLE}} isn't substituted correctly.
 
Thanks,
Darren
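
Workaround 1 above (manual replacement) could be scripted roughly as follows. This is a sketch rather than anything CM provides: substitute_placeholder is a hypothetical helper, and the directory name and install path in the usage note are assumptions to adjust for the machine at hand. It rewrites every file under a downloaded client config directory that still contains a given {{NAME}} placeholder.

```shell
#!/bin/sh
# Hypothetical helper: replace {{NAME}} with VALUE in every file under DIR
# that contains the placeholder. VALUE must not contain "|", "&" or "\",
# because it is passed to sed as the replacement text.
substitute_placeholder() {
  sp_dir=$1
  sp_name=$2
  sp_value=$3
  # Collect matching files first so the rewrite does not race the search.
  sp_files=$(grep -rlF "{{${sp_name}}}" "$sp_dir") || sp_files=""
  printf '%s\n' "$sp_files" | while IFS= read -r sp_file; do
    if [ -n "$sp_file" ]; then
      # Write to a temp file, then copy back to keep the file's permissions.
      sed "s|{{${sp_name}}}|${sp_value}|g" "$sp_file" > "$sp_file.tmp" &&
        cat "$sp_file.tmp" > "$sp_file" &&
        rm -f "$sp_file.tmp"
    fi
  done
}
```

For example, `substitute_placeholder ./yarn-conf CDH_HADOOP_HOME /usr/lib/hadoop` would rewrite mrapp-generated-classpath and hadoop-env.sh in a config unzipped to ./yarn-conf. Both the directory and the value are assumptions here. Running `grep -rn '{{' ./yarn-conf` afterwards shows whether any other placeholders remain.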

Contributor

I fixed this by adding the following line to .bash_profile:

export CDH_MR2_HOME=$HADOOP_HOME
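
If the export is applied from a setup script rather than by hand, it helps to avoid appending the same line twice. A minimal sketch, where ensure_line is a hypothetical helper and not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
  el_file=$1
  el_line=$2
  # -x matches whole lines only, -F treats the line as a fixed string.
  if ! grep -qxF -- "$el_line" "$el_file" 2>/dev/null; then
    printf '%s\n' "$el_line" >> "$el_file"
  fi
}
```

It could be used as `ensure_line "$HOME/.bash_profile" 'export CDH_MR2_HOME=$HADOOP_HOME'`. The single quotes keep $HADOOP_HOME unexpanded, so it is resolved at login time; this assumes HADOOP_HOME is already set by the time .bash_profile runs.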