Oozie Spark (2.x) action is always getting stuck at accepted state

New Contributor

When I run the Spark job through Oozie, it always gets stuck in the ACCEPTED state. I followed the Hortonworks doc to set up the Spark2 libraries (roughly the steps sketched below).
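
For reference, setting up the spark2 sharelib per that doc boils down to something like the commands below. This is only a sketch: the sharelib timestamp and jar paths are the ones from my cluster, and OOZIE_HOST is a placeholder for the Oozie server.

# Create a spark2 directory in the current Oozie sharelib and copy the Spark2 client jars into it
hdfs dfs -mkdir /user/oozie/share/lib/lib_20180116141700/spark2
hdfs dfs -put /usr/hdp/2.6.3.0-235/spark2/jars/* /user/oozie/share/lib/lib_20180116141700/spark2/

# Copy the Oozie Spark sharelib jar from the existing spark directory so the action can launch
hdfs dfs -cp /user/oozie/share/lib/lib_20180116141700/spark/oozie-sharelib-spark*.jar /user/oozie/share/lib/lib_20180116141700/spark2/

# Tell Oozie to pick up the updated sharelib
oozie admin -oozie http://OOZIE_HOST:11000/oozie -sharelibupdate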

When I use an Oozie shell action for the same Spark job it works perfectly fine, and it also works through spark-submit from the edge node with the same spark-opts (roughly the command shown below).
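
For context, the working spark-submit looks approximately like this; the app name, class, jar, and arguments are placeholders standing in for the real values:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --name APP_NAME \
  --class MAIN_CLASS \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 2G \
  --driver-memory 2g \
  --driver-cores 2 \
  --conf spark.dynamicAllocation.enabled=false \
  UBER_JAR SOURCE_FILE_PATH SOURCE_FILE_NAME OUTPUT_FILE_PATH OUTPUT_FILE_DIR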

Below is my workflow.xml

<global>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <property>
            <name>mapreduce.job.queuename</name>
            <value>${queueName}</value>
        </property>
    </configuration>
</global>
<credentials>
    <credential name="HiveCreds" type="hive2">
        <property>
            <name>hive2.jdbc.url</name>
            <value>jdbc:hive2://${hive2_server}:${hive2_port}/default</value>
        </property>
        <property>
            <name>hive2.server.principal</name>
            <value>hive/${hive2_server}@DOMAIN</value>
        </property>
    </credential>
</credentials>
<!--spark action using spark 2 libraries -->
<start to="SPARK2JOB" />
<action name="SPARK2JOB" cred="HiveCreds">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>${master}</master>
        <mode>${mode}</mode>
        <name>${appName}</name>
        <class>${mainClass}</class>
        <jar>${hdfsJarLoc}${uberJar}</jar>
        <spark-opts>--num-executors ${noOfExec}
        --executor-cores ${execCores}
        --executor-memory ${execMem}
        --driver-memory ${drivMem}
        --driver-cores ${drivCores}
        --conf spark.dynamicAllocation.enabled=${dynamicAllocation}</spark-opts>
        <arg>${sourceFilePath}</arg>
        <arg>${sourceFileName}</arg>
        <arg>${outputFilePath}</arg>
        <arg>${outputFileDir}</arg>
    </spark>
    <ok to="end" />
    <error to="errorHandler" />
</action>

My job.properties

jobTracker=HOST:8050
nameNode=hdfs://HOST:8020
hive2_server=HOSTNAME
hive2_port=10000
queueName=default

# Standard useful properties
oozie.use.system.libpath=true
#oozie.wf.rerun.failnodes=true

ooziePath=/path/

#oozie.coord.application.path=${ooziePath}

## Oozie path & Standard properties
oozie.wf.application.path=${ooziePath}
oozie.libpath = ${ooziePath}/Lib
oozie.action.sharelib.for.spark=spark2

master=yarn-cluster
mode=cluster
appName=APP_NAME
mainClass=MAIN_CLASS
uberJar=UBER_JAR
noOfExec=2
execCores=2
execMem=2G
drivMem=2g
drivCores=2
dynamicAllocation=false

I checked the Oozie spark2 sharelib; it has all the jars that are in /usr/hdp/2.6.3.0-235/spark2/jars/ (the commands I used to verify this are below).
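
This is roughly how I verified it; the Oozie URL is a placeholder for my server:

# List the jars Oozie resolves for the spark2 sharelib
oozie admin -oozie http://OOZIE_HOST:11000/oozie -shareliblist spark2

# Compare against the jars shipped with the HDP Spark2 client
ls /usr/hdp/2.6.3.0-235/spark2/jars/

# And against what is actually on HDFS under the current sharelib timestamp
hdfs dfs -ls /user/oozie/share/lib/lib_20180116141700/spark2/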

My Oozie sharelib (oozie directory):

/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-core-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-kms-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/aws-java-sdk-s3-1.10.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-data-lake-store-sdk-2.1.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-keyvault-core-0.8.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/azure-storage-5.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/commons-lang3-3.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/guava-11.0.2.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-aws-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-azure-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/hadoop-azure-datalake-2.7.3.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-annotations-2.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-core-2.4.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/jackson-databind-2.4.4.jar
/user/oozie/share/lib/lib_20180116141700/oozie/joda-time-2.9.6.jar
/user/oozie/share/lib/lib_20180116141700/oozie/json-simple-1.1.jar
/user/oozie/share/lib/lib_20180116141700/oozie/okhttp-2.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/okio-1.4.0.jar
/user/oozie/share/lib/lib_20180116141700/oozie/oozie-hadoop-utils-hadoop-2-4.2.0.2.6.3.0-235.jar
/user/oozie/share/lib/lib_20180116141700/oozie/oozie-sharelib-oozie-4.2.0.2.6.3.0-235.jar

Below is the ERROR stack:

It will get stuck in the ACCEPTED state (something like below) for an hour or so

 [main] INFO  org.apache.spark.deploy.yarn.Client  - Application report for application_1537404298109_2008 (state: ACCEPTED)

STDOUT:

INFO  org.apache.spark.deploy.yarn.Client  - Application report for application_1537404298109_2008 (state: ACCEPTED)
2018-09-20 14:49:15,158 [main] INFO  org.apache.spark.deploy.yarn.Client  - Application report for application_1537404298109_2008 (state: FAILED)
2018-09-20 14:49:15,158 [main] INFO  org.apache.spark.deploy.yarn.Client  - 
     client token: N/A
     diagnostics: Application application_1537404298109_2008 failed 2 times due to AM Container for appattempt_1537404298109_2008_000002 exited with  exitCode: -1000
For more detailed output, check the application tracking page: http://hostname:8088/cluster/app/application_1537404298109_2008 Then click on links to logs of each attempt.
Diagnostics: org.apache.hadoop.security.authorize.AuthorizationException: User:yarn not allowed to do 'DECRYPT_EEK' on 'testkey1'
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1537468694601
     final status: FAILED
     tracking URL: http://hostname:8088/cluster/app/application_1537404298109_2008
     user: username
2018-09-20 14:49:16,189 [main] INFO  org.apache.spark.deploy.yarn.Client  - Deleted staging directory

<<< Invocation of Spark command completed <<<

Hadoop Job IDs executed by Spark: job_1537404298109_2008
<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1537404298109_2008 finished with failed status
org.apache.spark.SparkException: Application application_1537404298109_2008 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:314)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:235)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)

Oozie Launcher failed, finishing Hadoop job gracefully

STDERR:

 org.apache.spark.SparkException: Application application_1537404298109_2008 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:314)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:235)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

I tried all the possible solutions I found in the Hortonworks community and on Stack Overflow for this type of issue, but nothing worked for me. If you need any other information to help, I'm more than happy to add it to the question.

Thanks in advance!

4 REPLIES

Re: Oozie Spark (2.x) action is always getting stuck at accepted state

It seems the user 'yarn' doesn't have the required permission on the encryption zone key ('DECRYPT_EEK' on 'testkey1' per your diagnostics). Grant the yarn user the necessary permission on that key and re-run the job.
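
If you are on plain Hadoop KMS (rather than Ranger KMS), the per-key ACL lives in kms-acls.xml and would look something like the sketch below; the key name is taken from your diagnostics, while the users and group shown are only an illustration for your environment:

<property>
    <name>key.acl.testkey1.DECRYPT_EEK</name>
    <value>yarn,username hadoop</value>
    <description>Users (comma-separated), then groups, allowed to decrypt EEKs for key 'testkey1'</description>
</property>

With Ranger KMS, the equivalent is a key policy that grants the yarn user the Decrypt EEK permission on 'testkey1'.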

Hope this helps.

Re: Oozie Spark (2.x) action is always getting stuck at accepted state

New Contributor

@Sandeep Nemuri But how come it works through the shell action and through spark-submit with the same spark-opts (cluster mode)?

Re: Oozie Spark (2.x) action is always getting stuck at accepted state

Are you submitting the app with the same user across all the tests?
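
One quick way to compare is the 'User' field YARN records for each run, for example (use whichever application IDs your tests produced):

# Prints the application report, including the submitting user, queue and final status
yarn application -status application_1537404298109_2008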

Re: Oozie Spark (2.x) action is always getting stuck at accepted state

New Contributor

Yes, I'm using the same user.
