Created 06-11-2018 01:13 PM
Hi all, I want to create an Oozie workflow for a Spark action. I have created the workflow, but I am getting the following error:
Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]
I searched on the error and saw that the most common cause is a problem with the Oozie sharelib. So I installed all the new jars and updated the sharelib by running the following command:
su oozie
oozie admin -sharelibupdate
I verified that the sharelib is installed properly, but none of this has stopped the error from occurring.
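For reference, the Spark sharelib contents can be listed with the following command (assuming the default Oozie endpoint used below):
oozie admin -oozie http://127.0.0.1:11000/oozie -shareliblist spark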
My workflow files are as follows:
job.properties
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
queueName=default
projectRoot=user/root/oozie/sparkoozie
master=local[2]
mode=cluster
class=org.apache.TransformationOper
hiveSite=hive-site.xml
workflowAppUri=${nameNode}/${projectRoot}/lib/TransformationOper.jar
oozie.use.system.libpath=true
oozie.action.sharelib.for.spark=spark,hive
oozie.wf.application.path=${nameNode}/${projectRoot}/
workflow.xml
<workflow-app name="SparkAction" xmlns="uri:oozie:workflow:0.4">
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${projectRoot}/output"/>
            </prepare>
            <job-xml>${nameNode}/${projectRoot}/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>Testing Spark Action</name>
            <class>${class}</class>
            <jar>${nameNode}/${projectRoot}/lib/TransformationOper.jar</jar>
            <arg>INPUT=${nameNode}/${projectRoot}/input/error.log</arg>
            <arg>OUTPUT=${projectRoot}/output</arg>
        </spark>
        <ok to="end"/>
        <error to="error"/>
    </action>
    <kill name="error">
        <message>Spark Test WF failed. [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
command:
oozie job -oozie http://127.0.0.1:11000/oozie -config job.properties -run
I also checked the YARN logs:
yarn logs -applicationId application_1528173243110_0007
The following is the error log:
LogType:stderr
Log Upload Time:Tue Jun 05 16:31:27 +0000 2018
LogLength:2329
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/685/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/25/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Using properties file: null
Parsed arguments:
  master                  local[2]
  deployMode              cluster
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  -Dlog4j.configuration=spark-log4j.properties
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.TransformationOper
  primaryResource         hdfs://sandbox.hortonworks.com:8020/user/root/oozie/sparkoozie/lib/TransformationOper.jar
  name                    Testing Spark Action
  childArgs               [INPUT=hdfs://sandbox.hortonworks.com:8020/user/root/oozie/sparkoozie/input/error.log OUTPUT=user/root/oozie/sparkoozie/output]
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true
Spark properties used, including those specified through --conf and those from the properties file null:
  spark.yarn.security.tokens.hive.enabled -> false
  spark.executor.extraJavaOptions -> -Dlog4j.configuration=spark-log4j.properties
  spark.yarn.security.tokens.hbase.enabled -> false
  spark.driver.extraJavaOptions -> -Dlog4j.configuration=spark-log4j.properties
Error: Cluster deploy mode is not compatible with master "local"
Run with --help for usage help or --verbose for debug output
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]
End of LogType:stderr
I went through it, but I cannot figure out what the exact error is. I would be very grateful if you could help me solve this issue.
Regards,
Jay.
Created 06-11-2018 05:33 PM
The error message you see is very generic. When dealing with this type of error, if possible, set yarn.nodemanager.delete.debug-delay-sec=600; this will give you some time to go to the node where the job is failing and dig into the YARN local dir to hopefully find the actual cause of the failure. Check under /hadoop/yarn/local/usercache for the application id and any log files that could lead to a better understanding of the problem.
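For example, the property can be added to yarn-site.xml like this (a minimal sketch; the NodeManagers need a restart to pick it up):
<property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
</property>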
HTH
Created 06-12-2018 07:22 AM
Hi,
I have updated yarn-site.xml with the yarn.nodemanager.delete.debug-delay-sec=600 property, but now I am facing the following error:
18/06/12 06:25:32 ERROR ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
And yes, I have changed master=local[2] to master=yarn in job.properties, as the error above called for.
Regards,
Jay.
Created 06-12-2018 11:59 AM
@JAy PaTel The above means that the SparkContext was not initialized during startup of the Spark application. I've seen this particular error in cases where the application code does some work before creating the SparkContext/SparkSession, and that pre-creation code is delayed (for whatever reason), leading to this issue. I recommend you review your code in detail. Take Oozie out of the picture: also try running the application in yarn-cluster and/or yarn-client mode directly; perhaps it will fail there as well, which will simplify the troubleshooting.
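As an illustration only (a minimal sketch, not your actual code): in yarn-cluster mode the ApplicationMaster waits a limited time for the SparkContext to appear (spark.yarn.am.waitTime, 100 seconds by default, which matches the 100000 ms in your error), so any slow setup should happen after the context is created:
import org.apache.spark.{SparkConf, SparkContext}

object TransformationOper {
  def main(args: Array[String]): Unit = {
    // Create the SparkContext first, before any slow setup work,
    // so the YARN ApplicationMaster sees it within its wait time.
    val conf = new SparkConf().setAppName("TransformationOper")
    val sc = new SparkContext(conf)

    // Slow initialization (reading config files, remote lookups, etc.)
    // belongs here, after the context is up.

    sc.stop()
  }
}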
HTH
Created 06-13-2018 08:00 AM
Thanks for the response.
I have verified my code, and it is working well. I can execute spark-submit from the command line:
./bin/spark-submit --class com.apache.<ClassName> --master local[2] /root/<my_jar.jar> /<input_path_of_HDFS> /<output_path_of_HDFS>
But with the Spark action in Oozie it gives me errors.
Regards,
Jay.
Created 06-13-2018 12:02 PM
@JAy PaTel Try running in yarn-cluster mode and see if your application runs fine then:
./bin/spark-submit --class com.apache.<ClassName> --master yarn-cluster /root/<my_jar.jar> /<input_path_of_HDFS> /<output_path_of_HDFS>
Created 06-13-2018 12:55 PM
No, I meant to say that I have reviewed my code and taken Oozie out of the picture. After that, I was able to run the spark-submit command, and it works well. The problem is with the Spark action in Oozie.
Regards,
Jay.
Created 06-12-2018 07:22 PM
It says:
Error: Cluster deploy mode is not compatible with master "local"
The correct parameters would be:
master=yarn
mode=cluster
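In job.properties that amounts to the following (the workflow already picks these up through ${master} and ${mode}):
master=yarn
mode=cluster
which corresponds to launching with spark-submit --master yarn --deploy-mode cluster.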