
Oozie Spark Integration giving FAILED for SLA error

Rising Star
(Attachment: img-22062016-154852.png)

Hi Team,

We are using HDP 2.4.0.0-169 which is installed on Ubuntu 14.04.

There is a Spark-Mongo application which extracts data from MongoDB into HDFS. This application is executed using 3 Spark modes: local[*], yarn-client, and yarn-cluster. All 3 spark-submit commands work from the command prompt.
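For reference, a sketch of the three invocations (the class, jar, and argument names are taken from the workflow below; memory settings and other flags are omitted, so treat this as illustrative rather than the exact commands used):

spark-submit --master "local[*]" --class com.snapfish.spark.etl.MongoETL original-etl-0.0.1-SNAPSHOT.jar orders
spark-submit --master yarn-client --class com.snapfish.spark.etl.MongoETL original-etl-0.0.1-SNAPSHOT.jar orders
spark-submit --master yarn-cluster --class com.snapfish.spark.etl.MongoETL original-etl-0.0.1-SNAPSHOT.jar orders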

We have written an Oozie workflow to execute the job on an hourly basis. While executing the Oozie job, the job stays in RUNNING state but then fails with an SLA error.
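For context, an hourly schedule (and the SLA definition that can trigger an SLA error) normally lives in an Oozie coordinator rather than in the workflow itself. A minimal coordinator sketch, assuming the application path from the job.properties shown below (the start/end times are placeholders):

<coordinator-app name="sparkmongo-coord" frequency="${coord:hours(1)}"
                 start="2016-06-01T00:00Z" end="2017-06-01T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <app-path>${nameNode}/user/hadoop/Sparkmongo/</app-path>
        </workflow>
    </action>
</coordinator-app>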

I have followed the steps mentioned below.

  1. I have changed the following YARN properties in Ambari:

yarn.nodemanager.resource.memory-mb: increased from 67436288 to 200999168
yarn.scheduler.minimum-allocation-mb: increased from 1024 to 2048
yarn.scheduler.maximum-allocation-mb: decreased from 8192 to 6144

2. I have also added the below changes in custom spark-defaults:

spark.authenticate = false
spark.driver.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.dynamicAllocation.executorIdleTimeout = 60
spark.dynamicAllocation.schedulerBacklogTimeout = 1
spark.executor.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.yarn.am.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.yarn.config.gatewayPath = /usr/hdp
spark.yarn.config.replacementPath = {{HADOOP_COMMON_HOME}}/../../..
spark.yarn.jar = local:/usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

3. Added the below property in advanced hadoop-env:

export SPARK_HOME=/usr/hdp/2.4.0.0-169/spark
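A quick sanity check that the export points at a real Spark installation (assuming the stock HDP client config location /etc/hadoop/conf):

source /etc/hadoop/conf/hadoop-env.sh && echo "$SPARK_HOME"
ls "$SPARK_HOME"/lib/spark-assembly-*.jar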

4. I have added the Spark jars to the Oozie lib folder with the required permissions, trying both 777 and 755 for all jars, but no luck:

spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
spark-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169-yarn-shuffle.jar
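For reference, a sketch of how such jars are typically staged and given permissions (the lib path is inferred from oozie.wf.application.path in the job.properties below):

hdfs dfs -put /usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar /user/hadoop/Sparkmongo/lib/
hdfs dfs -chmod 755 /user/hadoop/Sparkmongo/lib/*.jar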

5. Also added the above-mentioned jars and the below-mentioned jars to the Oozie share folder with the required permissions, but no luck:

/user/oozie/share/lib/lib_20160420150601/oozie
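A sketch of staging a jar into that sharelib path and telling Oozie to pick it up (the Oozie server host is a placeholder):

hdfs dfs -put /usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar /user/oozie/share/lib/lib_20160420150601/oozie/
oozie admin -oozie http://<oozie_server>:11000/oozie -sharelibupdate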

Code is defined as follows:

workflow.xml:

<?xml version="1.0"?>
<workflow-app name="sparkmongo" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-mongo-ETL"/>
    <action name="spark-mongo-ETL">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <mode>client</mode>
            <name>SparkMongoLoading</name>
            <class>com.snapfish.spark.etl.MongoETL</class>
            <jar>/user/hadoop/Sparkmongo/lib/original-etl-0.0.1-SNAPSHOT.jar</jar>
            <spark-opts>
                spark.driver.extraClassPath=hdfs://namenode1:8020/user/oozie/share/lib/lib_20150711021244/spark/*
                spark.yarn.historyServer.address=http://yarnNM:19888/
                spark.eventLog.dir=hdfs://namenode1:8020/spark-history
                spark.eventLog.enabled=true
            </spark-opts>
            <arg>orders</arg>
        </spark>
        <ok to="End"/>
        <error to="killAction"/>
    </action>
        <kill name="killAction">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="End"/>
</workflow-app>
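As a quick schema check, independent of the SLA issue, the workflow definition can be validated with the Oozie CLI before submission:

oozie validate workflow.xml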

Tried removing the below properties, but no luck:

spark.driver.extraClassPath=hdfs://namenode1:8020/user/oozie/share/lib/lib_20150711021244/spark/*
spark.yarn.historyServer.address=http://yarnNM:19888/
spark.eventLog.dir=hdfs://namenode1:8020/spark-history
spark.eventLog.enabled=true

Jars in the lib folder:

mongo-hadoop-spark-1.5.2.jar
mongo-java-driver-3.2.2.jar
mongodb-driver-3.2.2.jar
original-etl-0.0.1-SNAPSHOT.jar
spark-1.6.0.2.4.0.0-169-yarn-shuffle.jar
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
spark-core_2.10-1.6.0.jar
spark-mongodb_2.10-0.11.2.jar

Tried removing all jars except the below two, but no luck:

original-etl-0.0.1-SNAPSHOT.jar
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar

job.properties :

nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.libpath=/user/oozie/share/lib/
user.name=hadoop
mapreduce.job.username=yarn
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/
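For reference, the job is submitted with the standard Oozie CLI against this properties file (the server host is a placeholder):

oozie job -oozie http://<oozie_server>:11000/oozie -config job.properties -run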

Also tried the below property combinations, but no luck.

Without oozie.libpath and mapreduce.job.username:

nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
user.name=hadoop
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/

With oozie.libpath but without mapreduce.job.username:

nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.libpath=/user/oozie/share/lib/
user.name=hadoop
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/

After trying all the combinations, I am still facing the issue shown in the attached screenshots (capture1.jpg, capture2.jpg).
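To surface the underlying error behind the SLA miss, the Oozie job status and the YARN container logs can be pulled as follows (the job and application IDs are placeholders):

oozie job -oozie http://<oozie_server>:11000/oozie -info <job_id>
oozie job -oozie http://<oozie_server>:11000/oozie -log <job_id>
yarn logs -applicationId <application_id>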

1 ACCEPTED SOLUTION

Master Mentor

The Spark action is not supported in HDP 2.4.0; it is available in HDP 2.4.2. So before you can investigate this issue with it, consider upgrading HDP.


7 REPLIES

Rising Star

Also note that there is no security implemented on the cluster.

Please help me. Thanks in advance!

Rising Star

Please help me. Thanks in advance!

Master Guru

I think YARN is low on memory; it needs extra overhead for the 3 Spark jobs.

Have you tried NiFi?

Rising Star

@Timithy I haven't tried NiFi. Please find the below details.

yarn.nodemanager.resource.memory-mb=200GB

yarn.scheduler.minimum-allocation-mb=2GB

yarn.scheduler.maximum-allocation-mb=6GB

yarn.scheduler.maximum-allocation-vcores=8

yarn.scheduler.minimum-allocation-vcores=1

yarn.nodemanager.resource.cpu-vcores=16

yarn.nodemanager.resource.percentage-physical-cpu-limit=80%
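Worth checking against these numbers: in Spark 1.6 on YARN, each executor container requests the executor memory plus an overhead of max(384 MB, 10% of executor memory), and the total must fit under yarn.scheduler.maximum-allocation-mb (6 GB here). A sketch of a submit sized for that ceiling (all sizes are illustrative assumptions, not the poster's actual settings):

# 3g executor + max(384 MB, 10% of 3g) ≈ 3.4g per container; YARN rounds
# this up in 2g (minimum-allocation) steps to 4g, safely under the 6g cap.
spark-submit --master yarn-client \
  --driver-memory 3g --executor-memory 3g --num-executors 2 \
  --class com.snapfish.spark.etl.MongoETL \
  original-etl-0.0.1-SNAPSHOT.jar orders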

Contributor

@Vijay Kumar J

Before copying the Spark assembly jar into the sharelib, did you clear the other contents of the sharelib? If not, back up the current sharelib and then do the following:

  1. Back up the current sharelib: hdfs dfs -mv /user/oozie/share/lib/lib_<timestamp>/spark /user/oozie/share/lib/lib_<timestamp>/spark_old
  2. Create the spark folder: hdfs dfs -mkdir /user/oozie/share/lib/lib_<timestamp>/spark
  3. Copy all the required libraries and files to the spark folder, then refresh the sharelib (see the example below).

Example:

  • hdfs dfs -put /hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar /user/oozie/share/lib/lib_<timestamp>/spark
  • hdfs dfs -put /hdp/2.4.0.0-169/oozie/libserver/oozie-sharelib-spark-4.2.0.2.4.0.0-169.jar /user/oozie/share/lib/lib_<timestamp>/spark
  • hdfs dfs -put /etc/hadoop/conf/hdfs-site.xml /user/oozie/share/lib/lib_<timestamp>/spark
  • hdfs dfs -put /etc/hadoop/conf/core-site.xml /user/oozie/share/lib/lib_<timestamp>/spark
  • hdfs dfs -put /etc/hadoop/conf/mapred-site.xml /user/oozie/share/lib/lib_<timestamp>/spark
  • hdfs dfs -put /etc/hadoop/conf/yarn-site.xml /user/oozie/share/lib/lib_<timestamp>/spark

Then refresh the sharelib:

oozie admin -oozie http://<oozie_server>:11000/oozie -sharelibupdate
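To confirm that Oozie now sees the new jars, the sharelib contents can be listed (the server host is a placeholder):

oozie admin -oozie http://<oozie_server>:11000/oozie -shareliblist spark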

Rising Star

Hi Trupti,

Sorry for the delay, and thanks for the response.

I have tried the above-mentioned process, but I am still facing the same error.

Master Mentor

The Spark action is not supported in HDP 2.4.0; it is available in HDP 2.4.2. So before you can investigate this issue with it, consider upgrading HDP.