YARN: Client config for remote submission
New Contributor
Created on 10-25-2013 06:26 AM - edited 09-16-2022 01:49 AM
(cross posting this from the scm-user ML, which I find easier to use... sorry if this is bad form)
Hello folks,
I recently upgraded one of my clusters from MR1 to YARN.
In the past, I could download the client configs from CM onto a development machine and create a hadoop-conf alternative from them, which let me access the cluster remotely (i.e. from the dev machine).
However, I'm having trouble doing the same thing now that the YARN service is active.
I download the YARN client config and unzip it, and I notice there are some new files.
In particular, I see a file "mrapp-generated-classpath" which contains the content:
{{CDH_HADOOP_HOME}}/*:{{CDH_HADOOP_HOME}}/lib/*
I'm not too familiar with this syntax, but:
* I cannot source this file in a shell
* I am not sure where CDH_HADOOP_HOME is supposed to be defined.
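For anyone hitting the same thing: the {{...}} tokens are template placeholders, not shell syntax, which is why the file cannot be sourced. A quick way to see which downloaded files still carry them is a recursive grep. This is only a sketch; list_placeholders is a hypothetical helper name, and the directory you pass is wherever you unzipped the client config.

```shell
# Sketch only: list_placeholders is a hypothetical helper, not part of CM or CDH.
# Prints path:line:text for every line that still contains a {{NAME}} token.
# Returns non-zero when no placeholder is found (grep's normal behaviour).
list_placeholders() {
  grep -rn '{{[A-Za-z0-9_]*}}' "$1"
}
```

Running it against the unzipped directory should list mrapp-generated-classpath along with any other file that still needs substitution.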
In any event, in attempting to run the YARN example wordcount I get:
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/lib/partition/InputSampler$Sampler
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2521)
    at java.lang.Class.getMethod0(Class.java:2764)
    at java.lang.Class.getMethod(Class.java:1653)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:60)
    at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:103)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:51)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.lib.partition.InputSampler$Sampler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more
which I expect is a classpath/env issue.
I tried manually exporting CDH_HADOOP_HOME and HADOOP_MAPRED_HOME as /usr/lib/hadoop, but that didn't work either.
Anyone have any ideas?
My dev machine is Quantal with CDH 4.4.
Thanks
1 ACCEPTED SOLUTION
New Contributor
Created 10-25-2013 09:19 AM
Posting Darren Lo's response from the ML:
Hi Joe,
CM had a bug where we were using {{VARIABLE}} in some files that could be downloaded, as you discovered. This is not normal shell script syntax. CM commands that deal with these files do the substitution on the destination machine, but nothing does it for downloaded client configs. This is fixed in CDH5 beta 1 (OPSAPS-16086).
To get around this, you can do one of the following:
1) Manually replace the {{CDH_HADOOP_HOME}} in hadoop-env.sh with the correct value for your machine. For YARN installed by packages to the default location, the HADOOP_MAPRED_HOME should be /usr/lib/hadoop-mapreduce/.
2) Add the target machine as a CM-managed host, assign gateway roles for all services that need client config (YARN in your case), and run the Deploy Client Configuration command, which will correctly update /etc/hadoop/conf and substitute the variables for you.
3) Wait for CM 5, which will fall back to standard environment variables or the default package install location if the {{VARIABLE}} isn't substituted correctly.
Thanks,
Darren
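Option 1 above can be scripted rather than done by hand. What follows is a minimal sketch, not something CM ships: substitute_placeholder is a hypothetical helper, and both the directory and the replacement value are whatever is correct on your machine.

```shell
# Sketch only: substitute_placeholder is a hypothetical helper, not part of CM or CDH.
# Usage: substitute_placeholder <conf_dir> <NAME> <value>
# The value must not contain '|' or '&', which are special in the sed expression below.
substitute_placeholder() {
  conf_dir=$1
  name=$2
  value=$3
  tmp=$(mktemp) || return 1
  # List files that contain the literal {{NAME}} token, then rewrite each one.
  # Writing back with cat keeps the original file's permissions.
  grep -rlF "{{${name}}}" "$conf_dir" | while IFS= read -r file; do
    sed "s|{{${name}}}|${value}|g" "$file" > "$tmp" && cat "$tmp" > "$file"
  done
  rm -f "$tmp"
}
```

For example, `substitute_placeholder ./yarn-conf CDH_HADOOP_HOME /usr/lib/hadoop` would rewrite every file under ./yarn-conf (a hypothetical unzip location), assuming /usr/lib/hadoop is the right path on that machine.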
2 REPLIES
Contributor
Created 01-04-2016 03:39 AM
I fixed this by adding the following line to .bash_profile:
export CDH_MR2_HOME=$HADOOP_HOME
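If HADOOP_HOME might be unset on the dev machine, the export can be guarded so that CDH_MR2_HOME is never set to an empty string. A small sketch, assuming a POSIX-style shell; set_cdh_mr2_home is a hypothetical function name:

```shell
# Sketch only: set_cdh_mr2_home is a hypothetical name used for illustration.
# Exports CDH_MR2_HOME from HADOOP_HOME, or warns and returns 1 if HADOOP_HOME is empty.
set_cdh_mr2_home() {
  if [ -n "${HADOOP_HOME:-}" ]; then
    export CDH_MR2_HOME="$HADOOP_HOME"
  else
    echo "HADOOP_HOME is not set; leaving CDH_MR2_HOME unchanged" >&2
    return 1
  fi
}
```

Define it in .bash_profile and call `set_cdh_mr2_home` after HADOOP_HOME has been exported.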
