
YARN: Client config for remote submission


New Contributor

(cross posting this from the scm-user ML, which I find easier to use... sorry if this is bad form)

 

Hello folks,

I recently upgraded one of my clusters from MR1 to YARN.

In the past, I could download the client configs from CM onto a development machine and create a hadoop-conf alternative from them, which let me access the cluster remotely (i.e. from the dev machine).
 
However, I'm having trouble doing the same thing now that the YARN service is active.
 
I download the YARN client config and unzip it, and I notice there are some new files.
 
In particular, I see a file "mrapp-generated-classpath" which contains the content:
 
{{CDH_HADOOP_HOME}}/*:{{CDH_HADOOP_HOME}}/lib/*

 

I'm not too familiar with this syntax, but:
  * I cannot source this file in a shell
  * I am not sure where CDH_HADOOP_HOME is supposed to be defined.
 
 
In any event, in attempting to run the YARN example wordcount I get:
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/partition/InputSampler$Sampler
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2521)
at java.lang.Class.getMethod0(Class.java:2764)
at java.lang.Class.getMethod(Class.java:1653)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:60)
at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:103)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.lib.partition.InputSampler$Sampler
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 12 more

 

 
which I expect is a classpath/env issue.
 
I tried manually exporting CDH_HADOOP_HOME and HADOOP_MAPRED_HOME as /usr/lib/hadoop, but that did not work either.
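
For anyone debugging a similar NoClassDefFoundError, here is a rough sketch of a check that reports classpath entries pointing at nothing on the local machine. The helper name report_missing_classpath_dirs is made up for illustration; it only tests whether each path exists and does not look inside jars.

```shell
# Hypothetical helper: print every classpath entry whose directory or file
# does not exist locally. A trailing "/*" wildcard is stripped before the
# existence test, so "/usr/lib/hadoop/*" is checked as "/usr/lib/hadoop".
report_missing_classpath_dirs() {
  # $1 = a colon-separated classpath string
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    path="${entry%/\*}"
    [ -z "$path" ] && continue
    [ -e "$path" ] || printf 'missing: %s\n' "$entry"
  done
}

# Possible usage on a machine with the hadoop CLI installed:
#   report_missing_classpath_dirs "$(hadoop classpath)"
```

An entry that still contains a literal {{CDH_HADOOP_HOME}} would show up as missing, which points at the placeholder rather than at the exported variables.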
 
Anyone have any ideas?
 
My dev machine runs Ubuntu 12.10 (Quantal) with CDH 4.4.
 
Thanks
1 ACCEPTED SOLUTION


Re: YARN: Client config for remote submission

New Contributor

Posting Darren Lo's response from the ML:

 

Hi Joe,

 
CM had a bug where we used {{VARIABLE}} in some files that could be downloaded, as you discovered. This is not normal shell script syntax. CM commands that deal with these files do the substitution on the destination machine, but nothing does it for you in downloaded client configs. This is fixed in CDH5 beta 1 (OPSAPS-16086).
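
To see which placeholders a downloaded client config still contains, something like the following could be used. This is only a sketch: list_placeholders is a made-up helper name, it assumes GNU grep, and it assumes placeholder names consist of uppercase letters, digits and underscores.

```shell
# Hypothetical helper: list the distinct {{NAME}} tokens left in a directory.
list_placeholders() {
  # $1 = directory holding the unzipped client config
  # -r recurse, -h omit file names, -o print only the matching text
  grep -rhoE '\{\{[A-Z0-9_]+\}\}' "$1" | sort -u
}

# Possible usage (the directory name is an assumption):
#   list_placeholders ./yarn-conf
```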
 
To get around this, you can do one of the following:
1) Manually replace the {{CDH_HADOOP_HOME}} in hadoop-env.sh with the correct value for your machine. For YARN installed by packages to the default location, the HADOOP_MAPRED_HOME should be /usr/lib/hadoop-mapreduce/.
2) Add the target machine as a CM-managed host, assign gateway roles for all services that need client config (YARN in your case), and run the Deploy Client Configuration command, which will correctly update /etc/hadoop/conf and substitute the variables for you.
3) Wait for CM 5, which will fall back to standard environment variables or the default package install location if the {{VARIABLE}} isn't substituted correctly.
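
Option 1 can be scripted rather than done by hand. A minimal sketch, assuming GNU sed (for in-place editing with -i) and a replacement value that contains no "|" or "&" characters; substitute_placeholder is an illustrative name, and the right value depends on where Hadoop is installed on your machine.

```shell
# Hypothetical helper: replace every literal {{NAME}} under a config directory.
substitute_placeholder() {
  # $1 = config directory, $2 = placeholder name, $3 = replacement value
  # grep -rlF lists the files that contain the literal token; sed then
  # rewrites each of those files in place.
  grep -rlF -- "{{$2}}" "$1" | while IFS= read -r file; do
    sed -i "s|{{$2}}|$3|g" "$file"
  done
}

# Possible usage (directory and value are assumptions for a package install):
#   substitute_placeholder ./yarn-conf CDH_HADOOP_HOME /usr/lib/hadoop
```

Running this on a copy of the unzipped config first makes it easy to diff the result against the original.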
 
Thanks,
Darren


Re: YARN: Client config for remote submission

Explorer

I fixed this by adding the following line:

export CDH_MR2_HOME=$HADOOP_HOME

to .bash_profile
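
If the line is added from a setup script, it may be worth appending it only once so repeated runs do not duplicate it. A small sketch; add_export_once is a hypothetical helper name, not part of CDH or CM.

```shell
# Hypothetical helper: append a line to a profile file unless an identical
# line is already present.
add_export_once() {
  # $1 = profile file, $2 = exact line to add
  touch "$1"
  # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Possible usage; single quotes keep $HADOOP_HOME unexpanded in the file:
#   add_export_once "$HOME/.bash_profile" 'export CDH_MR2_HOME=$HADOOP_HOME'
```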
