
JobTracker job.properties Setting Not Being respected in Oozie

I'm having an issue with Oozie: I set the jobTracker setting to one value (port 8050), but the error logs show the workflow failing because it tries to use another value (8032). The jobTracker port in my workflow's job.properties is set to 8050 (to match the YARN setting), and I can see in the Oozie UI (click on job > action > action configuration) that 8050 is being used:

job.properties

nameNode=hdfs://myDomain:8020
jobTracker=myOtherDomain:8050
queueName=default
# have also tried yarn-cluster and yarn-client
master=yarn

oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/bmp/
# I've added the updated spark libs I need in here
oozie.action.sharelib.for.spark=spark2


workflow

<workflow-app xmlns='uri:oozie:workflow:0.5' name='MyWorkflow'>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/bmp/output"/>
            </prepare>
            <master>${master}</master>
            <name>My Workflow</name>
            <class>uk.co.bmp.drivers.MyDriver</class>
            <jar>${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar</jar>
            <spark-opts>--conf spark.yarn.historyServer.address=http://myDomain:18088 --conf spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
            <arg>${nameNode}/bmp/input/input_file.csv</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]
        </message>
    </kill>
    <end name='end' />
</workflow-app>

But when I drill down into the Hadoop job history logs, I see this error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused. For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Where is it pulling 8032 from? Why does it not use the port configured in the job.properties?

The workflow only started to work once I changed both of these (the jobTracker setting and the YARN setting it matches) to 8032.

I'd rather not do this, since it is a workaround rather than a fix or an explanation of the issue, and the change could have repercussions for other tools. Is there a way to configure this just for Oozie and get it to respect the port in job.properties?

Re: JobTracker job.properties Setting Not Being respected in Oozie

Mentor

@Breandán Mac Parland Oozie expects port 8032 for the YARN application manager interface. I will raise an internal question about why our docs point to 8050 while Oozie expects 8032. In the meantime, here is the unit test from the Oozie project; our repo uses the same code, which means that internally Oozie expects 8032. https://github.com/hortonworks/oozie-monarch/search?utf8=%E2%9C%93&q=8032

Re: JobTracker job.properties Setting Not Being respected in Oozie

Mentor

It's a bug; the correct port is 8032. We have filed a fix. Thanks for letting us know!

Re: JobTracker job.properties Setting Not Being respected in Oozie

Cheers Artem for the confirmation. Is this a bug in Oozie, HDP or Spark? I have raised this issue here; is this the correct place? Is there a way I can track the issue or fix you have filed, such as a Jira page or GitHub issue?

Re: JobTracker job.properties Setting Not Being respected in Oozie

Mentor

It's a bug in our documentation, not in any of the products. The Spark Oozie action is not supported in HDP but does work in Apache; there are workarounds available, although they are not supported by us. If you don't mind, please accept the answer. Thanks.

Re: JobTracker job.properties Setting Not Being respected in Oozie

This question is essentially the same as this one. So I'm happy if this one is closed and the discussion continued there. However I don't see how this is a documentation issue. From the other question:

The default port for the YARN resource manager is 8032. However, this is configurable, and a default HDP set-up sets it to port 8050. Since it is configurable, Oozie should read this value from a config file or allow it to be passed in as a setting. It looks like it does allow this through the job-tracker property of the Spark action, but when you point that at 8050, it is ignored while the workflow runs and 8032 is used instead. So this looks like an issue with Oozie not using the settings provided in the workflow configuration.

Since the issue looks to be with Oozie's handling of its settings, then changing the address of the yarn resource manager from the default HDP setting would be a workaround to an Oozie problem. I've raised this here in the Oozie issue tracker.
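To illustrate where an address like 0.0.0.0:8032 can come from, here is a minimal sketch of how a YARN client falls back to the stock defaults when its configuration does not set the resource manager address. It is not Hadoop's own Configuration class: the function names are hypothetical, only the one `${yarn.resourcemanager.hostname}` substitution is handled, and the sample XML strings are made up for the example. Whether this fallback is what actually happens inside the Oozie launcher in this thread is an assumption, although the address in the error does match these defaults.

```python
import xml.etree.ElementTree as ET

# Stock values from Hadoop's yarn-default.xml:
#   yarn.resourcemanager.hostname = 0.0.0.0
#   yarn.resourcemanager.address  = ${yarn.resourcemanager.hostname}:8032
DEFAULT_RM_HOSTNAME = "0.0.0.0"
DEFAULT_RM_PORT = 8032


def read_hadoop_conf(xml_text):
    """Parse a Hadoop-style *-site.xml string into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            conf[name.strip()] = value.strip()
    return conf


def effective_rm_address(conf):
    """Return the resource manager address a client would end up using.

    Hypothetical helper: falls back to the stock defaults when the
    address (or hostname) is missing from the supplied configuration.
    """
    hostname = conf.get("yarn.resourcemanager.hostname", DEFAULT_RM_HOSTNAME)
    address = conf.get("yarn.resourcemanager.address")
    if address is None:
        return f"{hostname}:{DEFAULT_RM_PORT}"
    return address.replace("${yarn.resourcemanager.hostname}", hostname)


# Made-up yarn-site.xml contents for illustration only.
hdp_site = """<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>myOtherDomain:8050</value>
  </property>
</configuration>"""
empty_site = "<configuration/>"

print(effective_rm_address(read_hadoop_conf(hdp_site)))    # myOtherDomain:8050
print(effective_rm_address(read_hadoop_conf(empty_site)))  # 0.0.0.0:8032
```

If the process that submits the Spark job cannot see the cluster's yarn-site.xml, it would behave like the second call regardless of what job.properties says, which would be consistent with the error above.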

Re: JobTracker job.properties Setting Not Being respected in Oozie

Mentor

I confirmed this with the documentation and engineering teams, specifically for the Spark Oozie action. I cannot post the bug Jira as it's internal. We have a documentation inconsistency between document versions. The correct port is 8032; port 8050 in our docs is incorrect. I also provided you with a unit test that explicitly tests for 8032, not 8050. I will close the question.
