Member since 09-20-2016 | 2 Posts | 0 Kudos Received | 0 Solutions
09-20-2016 06:52 AM
Thank you for the response! I'm sorry, I forgot to mention that I already have YARN on my cluster. The Spark job runs fine when I submit it from a terminal with spark-submit --master yarn --deploy-mode cluster. However, when I run it through an Oozie workflow, Oozie fails with the error above. Do you mean that I need to move my Oozie Launcher to MRv2 / YARN?
09-20-2016 06:34 AM
Hello, I'm currently learning to use the Spark Action with Oozie on CDH 5.8. The workflow runs fine with master=local[*] and mode=client. However, it seems to behave very differently with YARN client/cluster. When I run the job, I get:

2016-09-20 06:04:14,028 WARN org.apache.oozie.action.hadoop.SparkActionExecutor: SERVER[master.meshiang] USER[root] GROUP[-] TOKEN[] APP[CSV] JOB[0000007-160920052847518-oozie-oozi-W] ACTION[0000007-160920052847518-oozie-oozi-W@spark-2bab] Launcher exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:251)
at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:228)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:109)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:256)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:207)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.Child.main(Child.java:262)

I know I have to specify HADOOP_CONFIG_DIR and YARN_CONFIG_DIR, but how and where?

What I have already tried:

1. Following the spark-opts configuration from http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SparkActionExtension.html#Spark_on_YARN. In the Spark Action > Options tab in Hue, I put the following configuration:
--conf spark.yarn.historyServer.address=http://datanode1.meshiang:18088
--conf spark.eventLog.dir=${nameNode}/user/spark/applicationHistory
--conf spark.eventLog.enabled=true
I don't know whether this is necessary, since this feature is already included in CDH 5.7.2 [OOZIE-2170].

2. Specifying HADOOP_CONFIG_DIR and YARN_CONFIG_DIR on the Oozie server node using:
export HADOOP_CONFIG_DIR=/etc/hadoop/conf
export YARN_CONFIG_DIR=/etc/hadoop/conf

3. Specifying HADOOP_CONFIG_DIR and YARN_CONFIG_DIR in the Spark Action spark-opts:
--conf spark.yarn.appMasterEnv.HADOOP_CONFIG_DIR=/etc/hadoop/conf
--conf spark.yarn.appMasterEnv.YARN_CONFIG_DIR=/etc/hadoop/conf

PS: I'm using Oozie, Spark and MRv1 (for running the Oozie Launcher) from CDH 5.8 with their default configuration.
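
One thing worth checking in the attempts above: the exception names HADOOP_CONF_DIR and YARN_CONF_DIR, while the export commands and spark-opts in this post use HADOOP_CONFIG_DIR and YARN_CONFIG_DIR. The stack trace also shows the check running in SparkSubmitArguments inside the Oozie launcher map task, so the variable has to be visible in that task's environment; spark.yarn.appMasterEnv.* only affects the YARN application master, which has not been started at that point. Below is a minimal shell sketch of the same kind of presence check, which can help confirm which names an environment actually exposes. check_yarn_conf_env is a hypothetical helper, not part of Spark or Oozie, and it treats an empty value as missing, which may be stricter than what spark-submit does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or Oozie): mimics the idea behind the
# "either HADOOP_CONF_DIR or YARN_CONF_DIR must be set" check for a yarn master.
# Assumption: an empty value is treated the same as an unset variable.
check_yarn_conf_env() {
    if [ -n "${HADOOP_CONF_DIR:-}" ] || [ -n "${YARN_CONF_DIR:-}" ]; then
        # At least one of the two names from the error message is set.
        echo "ok"
    else
        # Neither name is set; similarly named variables such as
        # HADOOP_CONFIG_DIR are not consulted.
        echo "missing"
        return 1
    fi
}
```

Run in a shell where only HADOOP_CONFIG_DIR is exported, the helper prints "missing"; once HADOOP_CONF_DIR (or YARN_CONF_DIR) is exported, it prints "ok". Where exactly the variable must be set for the Oozie launcher depends on the cluster setup and is not shown here.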
Labels: Apache Oozie