
Spark History Logs Are Not Enabled with Oozie Spark Action

I am trying to follow these instructions to enable history logs with the Oozie Spark action:
https://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SparkActionExtension.html

To ensure that your Spark job shows up in the Spark History Server, make sure to specify these three Spark configuration properties either in spark-opts with --conf or from oozie.service.SparkConfigurationService.spark.configurations

1. spark.yarn.historyServer.address=http://SPH-HOST:18088
2. spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory
3. spark.eventLog.enabled=true
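
For reference, this is roughly how I read the documented form when the three properties go into spark-opts. It is only a sketch: SPH-HOST and NN are the documentation's placeholders, not real hosts, and each value keeps the http:// or hdfs:// scheme exactly as listed above.

```xml
<!-- Sketch only: SPH-HOST and NN are placeholders taken from the documentation -->
<spark-opts>--conf spark.yarn.historyServer.address=http://SPH-HOST:18088 --conf spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
```

The same three key=value pairs could presumably be set through oozie.service.SparkConfigurationService.spark.configurations instead, as the documentation mentions; I have not tried that route.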

The workflow definition looks like this:

<action name="spark-9e7c">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <mode>cluster</mode>
        <name>Correlation Engine</name>
        <class>Main Class</class>
        <jar>hdfs://<MACHINE IP>:8020/USER JAR</jar>
        <spark-opts> --conf spark.eventLog.dir=<MACHINE IP>:8020/user/spark/applicationHistory --conf spark.eventLog.enabled=true --conf spark.yarn.historyServer.address=<MACHINE IP>:18088/</spark-opts>
    </spark>
    <ok to="email-f5d5"/>
    <error to="email-a687"/>
</action>

When I test from a shell script, the history logs are written correctly, but with the Oozie action they are not. I have set all three properties.
