Created 06-22-2016 09:46 AM
Hi Team,
We are using HDP 2.4.0.0-169 which is installed on Ubuntu 14.04.
There is a Spark-Mongo application which extracts data from MongoDB into HDFS. This application is executed using three Spark modes: local[*], yarn-client, and yarn-cluster. All three spark-submit commands work from the command prompt.
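For reference, the working command-line submission (the class and application jar are taken from the workflow below; the resource flags are illustrative assumptions, not values from this thread) looked roughly like:

```shell
# Illustrative spark-submit in yarn-client mode. The class and jar path come
# from the workflow in this thread; executor count and memory are assumed.
spark-submit \
  --master yarn-client \
  --class com.snapfish.spark.etl.MongoETL \
  --num-executors 2 \
  --executor-memory 2g \
  /user/hadoop/Sparkmongo/lib/original-etl-0.0.1-SNAPSHOT.jar orders
```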
We have written an Oozie workflow to execute the job on an hourly basis. While executing the Oozie job, the job stays in RUNNING state and then fails with an SLA error.
I have followed the steps mentioned below.
1. Changed the YARN memory settings:
yarn.nodemanager.resource.memory-mb = 67436288, increased to 200999168
yarn.scheduler.minimum-allocation-mb = 1024, increased to 2048
yarn.scheduler.maximum-allocation-mb = 8192, decreased to 6144
2. I have also added the below changes in custom spark-defaults:
spark.authenticate = false
spark.driver.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.dynamicAllocation.executorIdleTimeout = 60
spark.dynamicAllocation.schedulerBacklogTimeout = 1
spark.executor.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.yarn.am.extraLibraryPath = /usr/hdp/2.4.0.0-169/hadoop/lib/native
spark.yarn.config.gatewayPath = /usr/hdp
spark.yarn.config.replacementPath = {{HADOOP_COMMON_HOME}}/../../..
spark.yarn.jar = local:/usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
3. Added below property in advanced hadoop-env:
export SPARK_HOME=/usr/hdp/2.4.0.0-169/spark
4. I have added the Spark jars to the Oozie lib folder with the required permissions. Tried both 777 and 755 permissions for all jars, but no luck:
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
spark-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169-yarn-shuffle.jar
5. Also added the above-mentioned jars and the below-mentioned folder's jars to the Oozie share folder with the required permissions, but no luck:
/user/oozie/share/lib/lib_20160420150601/oozie
Code is defined as follows:
workflow.xml:
<?xml version="1.0"?>
<workflow-app name="sparkmongo" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-mongo-ETL"/>
    <action name="spark-mongo-ETL">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <mode>client</mode>
            <name>SparkMongoLoading</name>
            <class>com.snapfish.spark.etl.MongoETL</class>
            <jar>/user/hadoop/Sparkmongo/lib/original-etl-0.0.1-SNAPSHOT.jar</jar>
            <spark-opts>
                spark.driver.extraClassPath=hdfs://namenode1:8020/user/oozie/share/lib/lib_20150711021244/spark/*
                spark.yarn.historyServer.address=http://yarnNM:19888/
                spark.eventLog.dir=hdfs://namenode1:8020/spark-history
                spark.eventLog.enabled=true
            </spark-opts>
            <arg>orders</arg>
        </spark>
        <ok to="End"/>
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="End"/>
</workflow-app>
Tried removing the below properties, but no luck:
spark.driver.extraClassPath=hdfs://namenode1:8020/user/oozie/share/lib/lib_20150711021244/spark/*
spark.yarn.historyServer.address=http://yarnNM:19888/
spark.eventLog.dir=hdfs://namenode1:8020/spark-history
spark.eventLog.enabled=true
Jars in lib folder :
mongo-hadoop-spark-1.5.2.jar
mongo-java-driver-3.2.2.jar
mongodb-driver-3.2.2.jar
original-etl-0.0.1-SNAPSHOT.jar
spark-1.6.0.2.4.0.0-169-yarn-shuffle.jar
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
spark-core_2.10-1.6.0.jar
spark-mongodb_2.10-0.11.2.jar
Tried removing all jars except the below two, but no luck:
original-etl-0.0.1-SNAPSHOT.jar
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar
job.properties :
nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.libpath=/user/oozie/share/lib/
user.name=hadoop
mapreduce.job.username=yarn
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/
Also tried the below property combinations, but no luck:
nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
user.name=hadoop
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/
nameNode=hdfs://namenode1:8020
jobTracker=yarnNM:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.libpath=/user/oozie/share/lib/
user.name=hadoop
oozie.wf.application.path=${nameNode}/user/hadoop/Sparkmongo/
After trying all these combinations I am still facing an issue; screenshots are attached (capture1.jpg, capture2.jpg).
Created 07-08-2016 11:31 AM
The Spark action is not supported in HDP 2.4.0; it is available in HDP 2.4.2. Before investigating this issue further, consider upgrading HDP.
Created 06-22-2016 09:48 AM
Also note that there is no security implemented on the cluster.
Please help me. Thanks in advance!
Created 06-23-2016 05:59 PM
I think YARN is low on memory; it needs extra overhead for the 3 Spark jobs.
Have you tried NiFi?
Created 06-24-2016 03:30 AM
@Timithy I haven't tried Nifi. Please find the below details.
yarn.nodemanager.resource.memory-mb=200GB
yarn.scheduler.minimum-allocation-mb=2GB
yarn.scheduler.maximum-allocation-mb=6GB
yarn.scheduler.maximum-allocation-vcores=8
yarn.scheduler.minimum-allocation-vcores=1
yarn.nodemanager.resource.cpu-vcores=16
yarn.nodemanager.resource.percentage-physical-cpu-limit=80%
Created 06-24-2016 04:41 PM
Before copying the Spark assembly jar into the sharelib, have you cleared the other contents of the sharelib? If not, back up the current sharelib and then do the following.
Example:
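The original example commands did not survive in this post; a minimal sketch of the suggested steps, assuming the HDP 2.4 paths and sharelib timestamp mentioned in this thread and an Oozie server on localhost, might be:

```shell
# Back up the current sharelib, copy the Spark assembly jar into the spark
# sharelib directory, then tell Oozie to pick up the change. All paths here
# are assumptions based on the HDP 2.4.0.0-169 layout quoted in this thread.
hdfs dfs -cp /user/oozie/share/lib /user/oozie/share/lib_backup
hdfs dfs -put /usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar \
    /user/oozie/share/lib/lib_20160420150601/spark/
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
```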
Created 07-05-2016 07:04 PM
Hi Trupti,
Sorry for delay & thanks for the response.
I have tried the above-mentioned process, but I am still facing the same error.