Support Questions

prasannasaraf18 · ‎04-30-2018

I am trying to set up Spark on a YARN cluster using the Hadoop cookbook with the HDP distribution. As part of this, I am using yarn-env.sh to configure the YARN ResourceManager.

I have the following in yarn-env.sh:

export YARN_RESOURCEMANAGER_OPTS="-Dyarn.resourcemanager.hostname=192.168.33.33"

I can see the cluster at http://192.168.33.33:8088/cluster/nodes, which also shows 2 NodeManagers connected to it. But when I run

yarn node --list

it tries to connect to 0.0.0.0 and logs the following:

18/04/30 06:09:29 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

18/04/30 06:09:31 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

After some retries it fails.

I get the same error when I run spark-submit with the following command

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1G /usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar

since Spark uses $HADOOP_CONF_DIR to get the YARN configuration.

What is the cause? Does the yarn command read only from yarn-site.xml and not from yarn-env.sh?

TarunParimi · ‎05-01-2018

yarn-env.sh is used when you run any yarn command, so it works if you use the yarn command to submit a MapReduce job, as below.

yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 5 5

But the spark-submit command doesn't invoke yarn-env.sh, so it reads yarn-site.xml from $HADOOP_CONF_DIR and gets the ResourceManager address from there.
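A quick way to confirm what spark-submit will see is to check that the client config directory actually contains a yarn-site.xml. The sketch below is only an illustration: has_yarn_site is a hypothetical helper name, and it assumes the directory comes from HADOOP_CONF_DIR or, failing that, YARN_CONF_DIR.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Spark): succeeds only if the
# given directory is non-empty and contains a yarn-site.xml file.
has_yarn_site() {
  [ -n "$1" ] && [ -f "$1/yarn-site.xml" ]
}

# Assumption: the client config directory is taken from HADOOP_CONF_DIR,
# falling back to YARN_CONF_DIR when the former is unset.
conf_dir="${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"

if has_yarn_site "$conf_dir"; then
  echo "yarn-site.xml found in $conf_dir"
else
  echo "no yarn-site.xml found (conf dir: '$conf_dir')"
fi
```

If the file is missing or has no ResourceManager address in it, the client falls back to the 0.0.0.0:8032 address shown in the log above.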

prasannasaraf18 · ‎05-03-2018

@Tarun Parimi But yarn commands are also trying to connect to 0.0.0.0 instead of the ResourceManager IP address. This happens on both the master and slave machines.

TarunParimi · ‎05-03-2018

I didn't notice that you were only setting YARN_RESOURCEMANAGER_OPTS. This environment variable is used only for the ResourceManager daemon. To specify options for all hadoop and yarn client commands, you can use HADOOP_CLIENT_OPTS in hadoop-env.sh.

export HADOOP_CLIENT_OPTS="-Dyarn.resourcemanager.hostname=192.168.33.33"
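If your hadoop-env.sh already sets HADOOP_CLIENT_OPTS (for example a heap size), a plain assignment like the one above would overwrite it. A minimal sketch of appending instead, assuming you want to keep whatever value is already there:

```shell
#!/bin/sh
# Append the ResourceManager hostname to any existing client options
# instead of replacing them. ${VAR:-} keeps this safe when the variable is unset.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} -Dyarn.resourcemanager.hostname=192.168.33.33"

# Show the resulting value so it can be checked by eye.
echo "HADOOP_CLIENT_OPTS=$HADOOP_CLIENT_OPTS"
```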

But I am not sure why you would need to do this when you can just set it in yarn-site.xml, which is the recommended approach.
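For reference, the yarn-site.xml route could look roughly like the sketch below. It writes a minimal file into a scratch directory rather than your real $HADOOP_CONF_DIR, then reads the value back. The scratch directory and the get_hadoop_prop helper are made up for illustration; the helper does naive text matching and is not a real XML parser.

```shell
#!/bin/sh
# Sketch: write a minimal yarn-site.xml to a scratch directory
# (not the real $HADOOP_CONF_DIR) and read the hostname back out.
conf_dir="$(mktemp -d)"   # hypothetical scratch location

cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.33.33</value>
  </property>
</configuration>
EOF

# Hypothetical helper: print the <value> that directly follows a given <name>.
# It flattens the file to one line and uses a sed pattern, so it only handles
# the simple name-then-value layout used above.
get_hadoop_prop() {
  tr -d '\n' < "$1" |
    sed -n "s|.*<name>[[:space:]]*$2[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

rm_host="$(get_hadoop_prop "$conf_dir/yarn-site.xml" yarn.resourcemanager.hostname)"
echo "ResourceManager host in yarn-site.xml: $rm_host"
```

With the property in the yarn-site.xml under $HADOOP_CONF_DIR, both the yarn client commands and spark-submit should pick up the same ResourceManager host without any extra environment variables.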

Yarn commands not reading resource manager hostname from yarn-env.sh