Spark History Logs Are Not Enabled with Oozie Spark Action

I am trying to follow these instructions to enable Spark history logs for an Oozie Spark action:

To ensure that your Spark job shows up in the Spark History Server, make sure to specify these three Spark configuration properties either in spark-opts with --conf or from oozie.service.SparkConfigurationService.spark.configurations

1. spark.yarn.historyServer.address=http://SPH-HOST:18088
2. spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory
3. spark.eventLog.enabled=true
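As a rough sketch of what those instructions imply, the three properties could be passed through a spark-opts element as shown here. NN and SPH-HOST are the placeholders from the quoted instructions, not real hosts, and the ports are the ones quoted above. Each value keeps its full scheme (hdfs:// for the event log directory, http:// for the History Server address).

```xml
<!-- Sketch only: NN and SPH-HOST are placeholders taken from the quoted instructions -->
<spark-opts>--conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf spark.yarn.historyServer.address=http://SPH-HOST:18088</spark-opts>
```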

The workflow definition looks like this:

<action name="spark-9e7c">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <name>Correlation Engine</name>
        <class>Main Class</class>
        <jar>hdfs://<MACHINE IP>:8020/USER JAR</jar>
        <spark-opts>--conf spark.eventLog.dir=<MACHINE IP>:8020/user/spark/applicationHistory --conf spark.eventLog.enabled=true --conf spark.yarn.historyServer.address=<MACHINE IP>:18088/</spark-opts>
    </spark>
    <ok to="email-f5d5"/>
    <error to="email-a687"/>
</action>

When I submit the same job from a shell script, the history logs are written correctly, but when it runs as an Oozie action they are not. I have set all three properties.
