
Spark action always submits to 0.0.0.0:8032


I'm trying to submit a Spark action using Oozie with master=yarn-cluster and mode=cluster, but the job hangs while trying to submit to ResourceManager 0.0.0.0:8032.

The application's YARN logs contain the following lines:

org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at hdp24node00/192.168.178.30:8050
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
...
...
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

It first tries to connect to the correct ResourceManager, but then defaults to 0.0.0.0:8032 until it fails.

yarn.resourcemanager.address is set to hdp24node00:8050, and the jobTracker property is also set to hdp24node00:8050.
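
For reference, the relevant settings look roughly like this (a sketch of the layout, not copied verbatim from my files):

In yarn-site.xml:

<property>
  <name>yarn.resourcemanager.address</name>
  <value>hdp24node00:8050</value>
</property>

In job.properties:

jobTracker=hdp24node00:8050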

Where is the problem?

1 ACCEPTED SOLUTION

Master Mentor

@Nicola Marangoni

@David Tam @Breandán Mac Parland

Oozie Spark action is not supported in HDP 2.4 or earlier. You may find workarounds on this site, but they are not supported by Hortonworks: you can get it to work, but you cannot open support tickets about it. Can you set jobTracker to port 8032 instead of 8050 and try again? This question is similar: https://community.hortonworks.com/questions/11599/how-to-change-resourcemanager-port-on-oozie.html#a...
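
In job.properties that would look roughly like this (a sketch, keeping the host name from the question; everything else stays the same):

# before
jobTracker=hdp24node00:8050
# after
jobTracker=hdp24node00:8032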

Hortonworks builds from the Apache Oozie source code, and Apache Oozie internally expects port 8032 for yarn.resourcemanager.address, not 8050. See the unit test example: https://github.com/apache/oozie/search?utf8=%E2%9C%93&q=8032

There is an identical unit test in our private repos. Since you all have similar questions, I will raise an internal discussion on how to proceed with this.

Here's a note on the Spark action in HDP: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_spark-guide/content/ch_introduction-spark...


11 REPLIES


Master Mentor

It's a bug; the correct port is 8032. We filed a fix, thanks for letting us know!

Explorer

Hi,

I also faced the same issue, but when I copied the *-site.xml files from /etc/hadoop/conf to the hadoop spark lib directory (the one referenced in job.properties), the Oozie Spark job ran successfully.
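
Roughly, the copy can be done like this (a sketch; it assumes the "lib directory" is the lib/ folder of the workflow application in HDFS, i.e. under oozie.wf.application.path from job.properties, and the destination path shown is only an example):

hdfs dfs -put /etc/hadoop/conf/*-site.xml /user/oozie/apps/spark-wf/lib/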

My RM is IP:8032 in Ambari and in yarn-site.xml

Thanks

Master Mentor

@Arun Sundar Selvan thanks for confirming that. We knew it worked with port 8032; just to be clear, the Oozie Spark action in HDP is not supported at this time. There are workarounds, but nothing we have put a stamp of approval on yet.


@Arun Sundar Selvan: with a symlink

ln -s /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/secure/yarn-site.xml

it works even with port 8050 or whatever.

Sadly, there are 2 new issues (see my comment below), but many thanks for the tip!


With the following symlink (see the answer of @Arun Sundar Selvan):

ln -s /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/secure/yarn-site.xml

the port number is no longer an issue; however, I get 2 new issues:

  • /ats/active is not found; I have to create it manually after restarting YARN:
mkdir /ats
mkdir /ats/active
mkdir /ats/done
chmod -R 777 /ats
  • Localization doesn't work: I have 3 nodes, but on every node the spark-assembly library is in a different folder:
cd /hadoop/yarn/local
find . -name '*assembly*'

On node 0:

./filecache/493/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/493/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
./filecache/466/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/466/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

On node 1:

./filecache/452/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/452/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

On node 2:

./filecache/265/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/265/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

However, since the YARN job is submitted by the RM on node 00 to another node, it searches for the libraries using the path from node 00, assuming the directory number under "filecache" is always the same.

Best regards,

Nicola

Explorer

Can you please share the stdout log from the JobTracker?

Contributor

Did anyone manage to get any further with this?

New Contributor

@Nicola Marangoni

@Arun Sundar Selvan

I am also facing the issue reported by Nicola:

"However, as the YARN job is submitted by the RM on node 00 to another node, it search the libraries with the path on node 00 assuming the directory number under "filecache" is always the same."

Any pointers on how to get this resolved?

Regards,

Mahadevan


I am also observing the same issue as Nicola. Can somebody help me resolve this?