Member since: 04-03-2019
Posts: 962
Kudos Received: 1743
Solutions: 146
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 17724 | 03-08-2019 06:33 PM |
| | 7164 | 02-15-2019 08:47 PM |
06-08-2018
12:09 AM
Follow the steps in the documentation below to run a spark2 action via Oozie on HDP clusters.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch_oozie-spark-action.html

Your Oozie job may fail with the error below because of jar conflicts between the 'oozie' sharelib and the 'spark2' sharelib.

Error:

2018-06-04 13:27:32,652 WARN SparkActionExecutor:523 - SERVER[XXXX] USER[XXXX] GROUP[-] TOKEN[] APP[XXXX] JOB[0000000-<XXXXX>-oozie-oozi-W] ACTION[0000000-<XXXXXX>-oozie-oozi-W@spark2] Launcher exception: Attempt to add (hdfs://XXXX/user/oozie/share/lib/lib_XXXXX/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://XXXXX/user/oozie/share/lib/lib_20170727191559/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache.
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:632)
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:623)
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:623)
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:622)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:622)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:895)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1231)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1290)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:750)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:311)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:232)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:237)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)

Run the commands below to fix this error.

Note - You may want to take a backup before running the rm commands. If lib_<ts>/oozie.old does not exist yet, create it first (for example with hadoop fs -mkdir), because moving several files requires the destination to be a directory.

hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/aws*
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/azure*
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-aws*
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-azure*
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/ok*
hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson* /user/oozie/share/lib/lib_<ts>/oozie.old

Then run the command below to update the Oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

Please comment if you have any feedback/questions/suggestions.

Happy Hadooping!!
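
To see which jars overlap before removing anything, a check like the one below may help. It is a minimal sketch and not part of the original steps: it compares saved `hadoop fs -ls` listings of the oozie and spark2 sharelib directories and prints the jar file names that appear in both. The function names and the /tmp paths are hypothetical, and exact-name matching will not catch the same library shipped at two different versions.

```shell
# Sketch only: function names and temp paths are assumptions, not Oozie tooling.

# Read `hadoop fs -ls` output on stdin and print the unique jar file names.
jar_names() {
  awk '{print $NF}' | grep '\.jar$' | sed 's|.*/||' | sort -u
}

# Print jar names that occur in both listing files ($1 and $2).
common_jars() {
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  jar_names < "$1" > "$tmp_a"
  jar_names < "$2" > "$tmp_b"
  comm -12 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}

# Example usage on a host with an HDFS client (replace lib_<ts> first):
# hadoop fs -ls /user/oozie/share/lib/lib_<ts>/oozie  > /tmp/oozie.lst
# hadoop fs -ls /user/oozie/share/lib/lib_<ts>/spark2 > /tmp/spark2.lst
# common_jars /tmp/oozie.lst /tmp/spark2.lst
```

Any name printed here exists in both directories and is a candidate for the cleanup described above.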
04-19-2018
09:40 PM
Thanks @Chad Woodhead - Updated! 🙂
04-09-2018
06:10 PM
1 Kudo
@Harendra Sinha - No, the Oozie web UI is read-only. You can have a look at Workflow Manager in Ambari, which has great features to design, run, and re-run Oozie workflows.
12-29-2017
02:35 AM
Tip! 🙂 Make sure to add the line below to hbase-indexer-env.sh in order to avoid the org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase-secure/blah blah error:

HBASE_INDEXER_OPTS="$HBASE_INDEXER_OPTS -Djava.security.auth.login.config=<path-of-indexer-jaas-file>"
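
If the JAAS file path is wrong, the option is silently useless, so a small guard can be worthwhile. The helper below is a hypothetical sketch (the name `with_jaas_opt` is not part of hbase-indexer); it appends the option only when the file is readable and otherwise warns on stderr and leaves the options unchanged.

```shell
# Hypothetical helper, not shipped with hbase-indexer.
# $1 = existing options string, $2 = path to the JAAS file.
with_jaas_opt() {
  opts=$1
  jaas_file=$2
  if [ -r "$jaas_file" ]; then
    # File is readable: append the JAAS login config option.
    printf '%s -Djava.security.auth.login.config=%s\n' "$opts" "$jaas_file"
  else
    # File missing or unreadable: warn and return the options unchanged.
    echo "JAAS file not readable: $jaas_file" >&2
    printf '%s\n' "$opts"
  fi
}

# Example usage inside hbase-indexer-env.sh (the path is a placeholder):
# HBASE_INDEXER_OPTS=$(with_jaas_opt "$HBASE_INDEXER_OPTS" "<path-of-indexer-jaas-file>")
```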
10-31-2017
09:28 PM
@amarnath reddy pappu - I believe we will have to regenerate the war file in secure mode and then restart the Oozie service:

su -l oozie -c "/usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war -secure"
10-27-2017
10:27 PM
Thank you so much @Mridul M 🙂
10-27-2017
06:23 PM
@Mridul M Can you please update point number 2 in the Ambari-managed section?

Current point - Add DataNucleus jars to the Spark Thrift Server classpath. Navigate to the “Advanced spark-hive-site-override” section and add:

Modification - Add DataNucleus jars to the Spark Thrift Server classpath. Navigate to the “Custom spark-thrift-sparkconf” section and add:

Thanks,
Kuldeep
10-16-2017
10:02 PM
1 Kudo
Follow the steps below to run a SparkR script via Oozie.

1. Install the R packages on all the NodeManager hosts:

yum -y install R R-devel libcurl-devel openssl-devel

2. Keep your R script ready. Here is a sample script:

library(SparkR)
sc <- sparkR.init(appName="SparkR-sample")
sqlContext <- sparkRSQL.init(sc)
localDF <- data.frame(name=c("ABC", "blah", "blah"), age=c(39, 32, 81))
df <- createDataFrame(sqlContext, localDF)
printSchema(df)
sparkR.stop()

3. Create workflow.xml. Here is a working example:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
    <global>
        <configuration>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
                <value>SPARK_HOME=/usr/hdp/2.5.3.0-37/spark</value>
            </property>
            <property>
                <name>oozie.launcher.mapred.child.env</name>
                <value>SPARK_HOME=/usr/hdp/2.5.3.0-37/spark</value>
            </property>
        </configuration>
    </global>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
            </prepare>
            <master>${master}</master>
            <name>SparkR</name>
            <jar>${nameNode}/user/${wf:user()}/spark.R</jar>
            <spark-opts>--driver-memory 512m --conf spark.driver.extraJavaOptions=-Dhdp.version=2.5.3.0</spark-opts>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>

4. Make sure that you don't have sparkr.zip in the workflow/lib directory, in the Oozie sharelib, or in a <file> tag in the workflow, or else it will cause conflicts.

5. Upload the workflow to HDFS and run it. It should work.

This has been successfully tested on HDP-2.5.X & HDP-2.6.X.

Please comment if you have any feedback/questions/suggestions.

Happy Hadooping!!

Reference - https://developer.ibm.com/hadoop/2017/06/30/scheduling-spark-job-written-pyspark-sparkr-yarn-oozie
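
The workflow above references ${nameNode}, ${jobTracker}, ${master} and ${examplesRoot}, which have to be defined at submission time. The sketch below shows one way to generate a job.properties for that and submit the workflow. The function name, host names, ports, the HDFS application path and the yarn-cluster master value are all assumptions to be replaced with values from your cluster.

```shell
# Sketch only: write a job.properties defining the variables used in workflow.xml.
# $1 = NameNode URI, $2 = ResourceManager address, $3 = Spark master,
# $4 = HDFS directory holding workflow.xml, $5 = output file.
write_job_properties() {
  name_node=$1
  job_tracker=$2
  master=$3
  app_path=$4
  out=$5
  cat > "$out" <<EOF
nameNode=$name_node
jobTracker=$job_tracker
master=$master
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=\${nameNode}$app_path
EOF
}

# Example usage (all values are placeholders):
# write_job_properties hdfs://<namenode-host>:8020 <resourcemanager-host>:8050 \
#   yarn-cluster /user/$USER/sparkr-wf job.properties
# hadoop fs -mkdir -p /user/$USER/sparkr-wf
# hadoop fs -put -f workflow.xml /user/$USER/sparkr-wf/
# hadoop fs -put -f spark.R /user/$USER/
# oozie job -oozie http://<oozie-server-hostname>:11000/oozie -config job.properties -run
```

The script is uploaded to /user/<user>/spark.R because that is the path the <jar> element in the workflow points to.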
10-06-2017
10:31 PM
Follow the steps below to modify the quicklinks for the Oozie service in Ambari.

Note - This tutorial has been successfully tried and tested on Ambari 2.4.2.0 and Ambari 2.5.2.0.

1. Make sure that your /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/OOZIE/metainfo.xml looks like the example below.

<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<metainfo>
    <schemaVersion>2.0</schemaVersion>
    <services>
        <service>
            <name>OOZIE</name>
            <extends>common-services/OOZIE/4.0.0.2.0</extends>
            <quickLinksConfigurations>
                <quickLinksConfiguration>
                    <fileName>quicklinks.json</fileName>
                    <default>true</default>
                </quickLinksConfiguration>
            </quickLinksConfigurations>
        </service>
    </services>
</metainfo>

2. Edit /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/OOZIE/quicklinks/quicklinks.json and set the "url" field to your load balancer's URL, e.g.

"url" : "https://<load-balancer-hostname>:<port-number>/oozie",

3. Execute the command below:

cp /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/OOZIE/quicklinks/quicklinks.json /var/lib/ambari-server/resources/common-services/OOZIE/4.2.0.2.3/quicklinks/quicklinks.json

Note - Modify the version numbers for Oozie if required.

4. Restart the Ambari server.

5. Try to access the quicklinks for Oozie; they should point to the load balancer URL.

Please comment if you have any feedback/questions/suggestions.

Happy Hadooping!!
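
Since the copy overwrites the quicklinks.json under common-services, it may be safer to keep the previous file around. The helper below is a hypothetical sketch (the name `copy_with_backup` is an assumption, not an Ambari tool) that saves a timestamped copy of the destination before overwriting it.

```shell
# Sketch only: copy $1 over $2, keeping a timestamped backup of $2 if it exists.
copy_with_backup() {
  src=$1
  dst=$2
  if [ -f "$dst" ]; then
    # Preserve the current destination file next to the original.
    cp -p "$dst" "$dst.bak.$(date +%Y%m%d%H%M%S)" || return 1
  fi
  cp "$src" "$dst"
}

# Example usage on the Ambari server host (adjust the Oozie version if required):
# copy_with_backup \
#   /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/OOZIE/quicklinks/quicklinks.json \
#   /var/lib/ambari-server/resources/common-services/OOZIE/4.2.0.2.3/quicklinks/quicklinks.json
# ambari-server restart
```

To roll back, copy the .bak file over quicklinks.json again and restart the Ambari server.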
07-17-2017
07:20 PM
@Julius Lerm - Not sure if there is any. Tagging one of our Ambari engineers. @Alejandro Fernandez