Created 12-17-2015 07:36 AM
I'm currently exploring Oozie's SparkAction, but I'm running into errors.
The code is pretty straightforward: it just does a simple SELECT from a Hive table and then counts the records of the resulting DataFrame. It's dummy code I'm using while I learn how to work with Oozie:
val tbl = sqlContext.sql("SELECT * FROM tbl")
val count = tbl.count
log.info(s"The table has ${count} records.")It works as expected when using `spark-submit` but when trying to run it as an Oozie SparkAction, I get the following error in the logs:
Main class:
org.apache.spark.deploy.yarn.Client
Arguments:
--name Testing Spark Action
--jar hdfs://myhost.com:8020/user/bigdata/workflows/sparkaction-test/lib/sparkaction-test_2.10-1.0.jar
--class com.myCompany.SparkActionTest
System properties:
SPARK_SUBMIT -> true
spark.app.name -> Testing Spark Action
spark.submit.deployMode -> cluster
spark.master -> yarn-cluster
Classpath elements:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1454025267777_0681 finished with failed status
org.apache.spark.SparkException: Application application_1454025267777_0681 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:974)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1020)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:685)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
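For reference, the full application class looks roughly like this; the packaging, logger, and SparkContext/HiveContext setup below are a sketch, and only the three lines shown earlier are taken verbatim from my code:

package com.myCompany

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.slf4j.LoggerFactory

object SparkActionTest {
  private val log = LoggerFactory.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    // The master is not set here; in yarn-cluster mode it comes from
    // spark-submit or, in this case, from the Oozie Spark action.
    val conf = new SparkConf().setAppName("Testing Spark Action")
    val sc = new SparkContext(conf)

    // A HiveContext (Spark 1.x) is needed to query Hive tables; it picks up
    // hive-site.xml from the classpath.
    val sqlContext = new HiveContext(sc)

    val tbl = sqlContext.sql("SELECT * FROM tbl")
    val count = tbl.count
    log.info(s"The table has ${count} records.")

    sc.stop()
  }
}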
The project directory is arranged as follows:
sparkaction-test
  - workflow.xml
  - hive-site.xml
  - job.properties
  - lib/
    - sparkaction-test_2.10-1.0.jar
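For completeness, this is roughly how I upload the directory and launch the workflow; the Oozie server URL (default port 11000) is a placeholder, and the HDFS path matches the one in the error above:

hdfs dfs -put -f sparkaction-test /user/bigdata/workflows/
oozie job -oozie http://myhost.com:11000/oozie -config sparkaction-test/job.properties -run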
The content of job.properties:
nameNode=hdfs://myhost.com:8020
jobTracker=myhost.com:8032
queueName=default
projectRoot=user/${user.name}/workflows/sparkaction-test
master=yarn-cluster
mode=cluster
class=com.myCompany.SparkActionTest
hiveSite=hive-site.xml
jars=${nameNode}/${projectRoot}/lib/sparkaction-test_2.10-1.0.jar
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/${projectRoot}
spark.yarn.historyServer.address=http://myhost.com:18080/
spark.eventLog.dir=${nameNode}/user/spark/applicationHistory
spark.eventLog.enabled=true

workflow.xml:
<workflow-app name="spark-test-wf" xmlns="uri:oozie:workflow:0.4">
<start to="spark-test"/>
<action name="spark-test">
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
</configuration>
<master>${master}</master>
<mode>${mode}</mode>
<name>Testing Spark Action</name>
<class>${class}</class>
<jar>${jars}</jar>
</spark>
<ok to="end"/>
<error to="errorcleanup" />
</action>
<kill name="errorcleanup">
<message>Spark Test WF failed. [${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name ="end"/>
</workflow-app>These are the jars in the Oozie sharelib:
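The spark sharelib contents can be listed with the Oozie admin CLI (the server URL is again a placeholder for my Oozie host):

oozie admin -oozie http://myhost.com:11000/oozie -shareliblist spark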
Environment:
What could be the problem?
Created 01-28-2016 08:26 AM
- Regarding port 8032: absolutely! See this thread.
Created 08-05-2016 05:39 AM
Can you please share the jar below?