Created 12-17-2015 07:36 AM
I'm currently exploring Oozie's SparkAction, but I'm running into errors.
The code is pretty straightforward: it's just a simple select from a Hive table, followed by a count of the resulting DataFrame. It's simple dummy code I'm using while I learn how to work with Oozie:
```scala
val tbl = sqlContext.sql("SELECT * FROM tbl")
val count = tbl.count
log.info(s"The table has ${count} records.")
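For context, here is a minimal sketch of what the full application around that snippet might look like. This is an assumption, not the poster's actual code: the package/object name is taken from the `class` property in job.properties, and it assumes Spark 1.x with `HiveContext` (consistent with the `sqlContext.sql` call against a Hive table):

```scala
package com.myCompany

import org.apache.log4j.LogManager
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SparkActionTest {
  def main(args: Array[String]): Unit = {
    val log = LogManager.getLogger(getClass)

    // App name matches the <name> element in workflow.xml
    val conf = new SparkConf().setAppName("Testing Spark Action")
    val sc = new SparkContext(conf)

    // HiveContext needs a reachable hive-site.xml to find the metastore
    val sqlContext = new HiveContext(sc)

    val tbl = sqlContext.sql("SELECT * FROM tbl")
    val count = tbl.count
    log.info(s"The table has ${count} records.")

    sc.stop()
  }
}
```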
It works as expected when using `spark-submit` but when trying to run it as an Oozie SparkAction, I get the following error in the logs:
```
Main class:
  org.apache.spark.deploy.yarn.Client
Arguments:
  --name
  Testing Spark Action
  --jar
  hdfs://myhost.com:8020/user/bigdata/workflows/sparkaction-test/lib/sparkaction-test_2.10-1.0.jar
  --class
  com.myCompany.SparkActionTest
System properties:
  SPARK_SUBMIT -> true
  spark.app.name -> Testing Spark Action
  spark.submit.deployMode -> cluster
  spark.master -> yarn-cluster
Classpath elements:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1454025267777_0681 finished with failed status
org.apache.spark.SparkException: Application application_1454025267777_0681 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:974)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1020)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:685)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
```
The project directory is arranged as follows:
```
sparkaction-test
 -workflow.xml
 -hive-site.xml
 -job.properties
 -lib/
   -sparkaction-test_2.10-1.0.jar
```
The content of job.properties:
```
nameNode=hdfs://myhost.com:8020
jobTracker=myhost.com:8032
queueName=default
projectRoot=user/${user.name}/workflows/sparkaction-test
master=yarn-cluster
mode=cluster
class=com.myCompany.SparkActionTest
hiveSite=hive-site.xml
jars=${nameNode}/${projectRoot}/lib/sparkaction-test_2.10-1.0.jar
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/${projectRoot}
spark.yarn.historyServer.address=http://myhost.com:18080/
spark.eventLog.dir=${nameNode}/user/spark/applicationHistory
spark.eventLog.enabled=true
```
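One detail worth double-checking in these properties (an assumption on my part, not a confirmed diagnosis): `jobTracker=myhost.com:8032` uses the stock Apache Hadoop ResourceManager port, but HDP clusters typically expose the ResourceManager on 8050 (`yarn.resourcemanager.address`). If this is an HDP cluster, the property would need to be:

```
# HDP default ResourceManager RPC port; 8032 is the vanilla Apache/CDH default
jobTracker=myhost.com:8050
```

The authoritative value is whatever `yarn.resourcemanager.address` is set to in the cluster's yarn-site.xml.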
workflow.xml:
```xml
<workflow-app name="spark-test-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="spark-test"/>
    <action name="spark-test">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>Testing Spark Action</name>
            <class>${class}</class>
            <jar>${jars}</jar>
        </spark>
        <ok to="end"/>
        <error to="errorcleanup"/>
    </action>
    <kill name="errorcleanup">
        <message>Spark Test WF failed. [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```
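One thing that stands out: job.properties defines `hiveSite=hive-site.xml` and the file sits in the project directory, but the workflow never references it, so the `HiveContext` running in cluster mode may have no metastore configuration. A hedged sketch of one possible adjustment (not a confirmed fix from this thread) is to ship the file to the executors with `--files` via `<spark-opts>`, which the spark-action 0.1 schema does support:

```xml
<spark xmlns="uri:oozie:spark-action:0.1">
    ...
    <class>${class}</class>
    <jar>${jars}</jar>
    <!-- distribute hive-site.xml so HiveContext can locate the metastore -->
    <spark-opts>--files ${nameNode}/${projectRoot}/${hiveSite}</spark-opts>
</spark>
```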
These are the jars in the Oozie sharelib:
Environment:
What could be the problem?
Created 01-28-2016 08:26 AM
Regarding port 8032: absolutely! See this thread.
Created 08-05-2016 05:39 AM
Can you please share the jar below?