Spark action always submit to 0.0.0.0:8032

I'm trying to submit a Spark action using Oozie with master=yarn-cluster and mode=cluster, but the job hangs while trying to submit to the ResourceManager at 0.0.0.0:8032.

In the YARN logs of the application there are the following lines:

org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at hdp24node00/192.168.178.30:8050
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
...
...
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

It first tries to connect to the right ResourceManager, but then falls back to 0.0.0.0:8032 and retries until it fails.

yarn.resourcemanager.address is set to hdp24node00:8050, and the job-tracker property is also set to hdp24node00:8050.

Where is the problem?
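For reference, a quick way to check which RM address a client config actually resolves to is to read the value out of yarn-site.xml. The sketch below runs against a sample file; on a real HDP node you would point it at /etc/hadoop/conf/yarn-site.xml instead.

```shell
# Create a sample yarn-site.xml mirroring the setting described above
# (on a cluster node, use /etc/hadoop/conf/yarn-site.xml instead).
cat > /tmp/yarn-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hdp24node00:8050</value>
  </property>
</configuration>
EOF

# Pull out the value that follows the yarn.resourcemanager.address name.
rm_addr=$(grep -A1 'yarn.resourcemanager.address' /tmp/yarn-site-sample.xml \
  | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "ResourceManager address: $rm_addr"
```

If the value printed here differs from the address the job actually connects to, the action is not picking up this config file.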

1 ACCEPTED SOLUTION

Mentor

@Nicola Marangoni

@David Tam @Breandán Mac Parland

Oozie Spark action is not supported in HDP 2.4 or earlier. You may find workarounds on this site, but they are not supported by Hortonworks: you can get it to work, but you cannot open support tickets about it. Can you set jobTracker to port 8032 instead of 8050 and try again? Look at this question, as it's similar: https://community.hortonworks.com/questions/11599/how-to-change-resourcemanager-port-on-oozie.html#a...

Hortonworks builds on the Apache Oozie source code, and Apache Oozie internally expects port 8032 for yarn.resourcemanager.address, not 8050. Look at the unit test examples: https://github.com/apache/oozie/search?utf8=%E2%9C%93&q=8032

An identical unit test exists in our private repos. Since you all have similar questions, I will raise an internal discussion on how to proceed with this.

Here's a note on Spark action in HDP http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_spark-guide/content/ch_introduction-spark...
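The suggested change can be sketched as a one-line edit over a job.properties file. The file below is a hypothetical example (host names and the extra properties are illustrative, not from the original post); only the jobTracker port rewrite is the point.

```shell
# Hypothetical job.properties for illustration; adjust host names
# and properties to your own workflow.
cat > /tmp/job.properties <<'EOF'
nameNode=hdfs://hdp24node00:8020
jobTracker=hdp24node00:8050
oozie.use.system.libpath=true
EOF

# Point jobTracker at port 8032, which Apache Oozie expects,
# instead of the HDP default 8050.
sed -i 's/^\(jobTracker=.*\):8050$/\1:8032/' /tmp/job.properties
grep '^jobTracker=' /tmp/job.properties
```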


11 REPLIES


Mentor

It's a bug; the correct port is 8032. We have filed a fix. Thanks for letting us know!

Explorer

Hi,

I also faced the same issue, but when I copied the *-site.xml files from /etc/hadoop/conf to the Spark lib directory referenced in job.properties, the Oozie Spark job ran successfully.

My RM address is IP:8032 in Ambari and in yarn-site.xml.
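A sketch of that workaround, assuming the standard HDP client config path; the Spark lib location is illustrative, and the demo runs on temporary directories rather than a real install:

```shell
# Stand-in directories; on a cluster node, src would be
# /etc/hadoop/conf and dst the Spark lib dir named in job.properties.
src=/tmp/demo/etc/hadoop/conf
dst=/tmp/demo/spark/lib
mkdir -p "$src" "$dst"
touch "$src/core-site.xml" "$src/yarn-site.xml" "$src/hdfs-site.xml"

# Copy every *-site.xml client config into the Spark lib directory.
cp "$src"/*-site.xml "$dst"/
ls "$dst"
```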

Thanks

Mentor

@Arun Sundar Selvan thanks for confirming that. We knew it worked with port 8032; just to be clear, the Oozie Spark action in HDP is not supported at this time. There are workarounds, but nothing we have put a stamp of approval on yet.

@Arun Sundar Selvan: with a symlink

ln -s /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/secure/yarn-site.xml

it works even with port 8050 or whatever.

Sadly there are 2 new issues (see comment below), but many thanks for the tip!

With the following symlink (see the answer of @Arun Sundar Selvan):

ln -s /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/secure/yarn-site.xml

the port number is no longer an issue; however, I get 2 new issues:

  • /ats/active is not found; I have to create it manually after restarting YARN:
mkdir /ats
mkdir /ats/active
mkdir /ats/done
chmod -R 777 /ats
  • Localization doesn't work: I have 3 nodes, but on every node the spark-assembly library is in a different folder:
cd /hadoop/yarn/local
find -name '*assembly*'

On node 0:

./filecache/493/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/493/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
./filecache/466/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/466/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

On node 1:

./filecache/452/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/452/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

On node 2:

./filecache/265/.tmp_spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar.crc
./filecache/265/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

However, as the YARN job is submitted by the RM on node 00 to another node, it searches for the libraries using the path from node 00, assuming the directory number under "filecache" is always the same.

Best regards,

Nicola
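The mismatch can be illustrated without a cluster: the numeric localization ids under filecache/ are assigned independently by each NodeManager, so a path captured on one node need not exist on another. A minimal sketch using the ids from the listings above:

```shell
# Simulate the filecache layout of two NodeManagers that localized
# the same jar under different ids (493 on node 0, 452 on node 1).
root=/tmp/demo-cache
mkdir -p "$root/node0/filecache/493" "$root/node1/filecache/452"
touch "$root/node0/filecache/493/spark-assembly.jar"
touch "$root/node1/filecache/452/spark-assembly.jar"

# The fixed path recorded on node 0 does not resolve on node 1:
if [ ! -f "$root/node1/filecache/493/spark-assembly.jar" ]; then
  echo "fixed path from node 0 is missing on node 1"
fi
# Looking the jar up by name works on either node:
find "$root/node1" -name 'spark-assembly.jar'
```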

Explorer

Can you please share the stdout log from the JobTracker?

Did anyone manage to get any further with this?

New Contributor

@Nicola Marangoni

@Arun Sundar Selvan

I am also facing the issue reported by Nicola:

"However, as the YARN job is submitted by the RM on node 00 to another node, it search the libraries with the path on node 00 assuming the directory number under "filecache" is always the same."

Any pointers on this please to get this resolved?

Regards,

Mahadevan


I am also observing the issue Nicola described. Can somebody help me resolve it?

New Contributor

I am running HDP 2.4.2, and here is the stack trace from the YARN NodeManager logs:

2016-06-30 12:17:23,376 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(711)) - Created localizer for container_e11_1467282483822_0010_02_000001
2016-06-30 12:17:23,378 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(866)) - Failed to download resource { { file:/hadoop/yarn/local/filecache/117/spark-assembly-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar, 1467269880000, FILE, null },pending,[(container_e11_1467282483822_0010_02_000001)],22592100725522,DOWNLOADING}
java.io.FileNotFoundException: File file:/hadoop/yarn/local/filecache/117/spark-assembly-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

New Contributor

We have a three-node cluster on the HDP 2.4 stack. We are trying to run Spark jobs using an Oozie workflow and have applied all the workarounds mentioned in this post, as well as those posted elsewhere in the Hortonworks community, but we still get this error:

"Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Call From ip-172-63-10-15.eu-west-1.compute.internal/172.63.10.15 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused java.net.ConnectException: Call From ip-172-63-10-15.eu-west-1.compute.internal/172.63.10.15 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused"

Any suggestion on what we are missing and what settings need to be done?
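One quick check for a "Connection refused" to 0.0.0.0:8032 is to probe the port directly from the node running the Oozie launcher: if nothing answers there, the action is still picking up the default RM address rather than the one in yarn-site.xml. A small sketch using bash's /dev/tcp (the host and port are the ones from the error above):

```shell
# Probe a TCP port; prints "open host:port" or "closed host:port".
probe() {
  if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null; then
    echo "open $1:$2"
  else
    echo "closed $1:$2"
  fi
}

# The fallback address from the error message; nothing should be
# listening there unless a ResourceManager runs locally on 8032.
probe 0.0.0.0 8032
```

Run the same probe against the real RM host and port from yarn-site.xml to confirm which address is actually reachable.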
