
Spark 2.2 on YARN: application master container fails on one node but works on all the others


Hi, I am struggling with a curious error.

I have a YARN cluster with 3 worker nodes. When my Spark 2.2 on YARN job attempts to launch the application master container on one particular node (the second), it fails. On the other 2 nodes the application master starts fine and the jobs finish successfully.

 

Here is the application log:

Log Type: stderr

Log Upload Time: Sat Apr 28 17:29:37 +0300 2018

Log Length: 1197

18/04/28 17:29:35 INFO util.SignalUtils: Registered signal handler for TERM
18/04/28 17:29:35 INFO util.SignalUtils: Registered signal handler for HUP
18/04/28 17:29:35 INFO util.SignalUtils: Registered signal handler for INT
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.SparkConf.get(Lorg/apache/spark/internal/config/ConfigEntry;)Ljava/lang/Object;
	at org.apache.spark.deploy.yarn.ApplicationMaster.<init>(ApplicationMaster.scala:71)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:773)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:772)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

Log Type: stdout

Log Upload Time: Sat Apr 28 17:29:37 +0300 2018

Log Length: 0

And here is the ResourceManager log. Please assist.

 

USER=hive2	OPERATION=Application Finished - Failed	TARGET=RMAppManager	RESULT=FAILURE	DESCRIPTION=App failed with state: FAILED	PERMISSIONS=Application application_1524925490307_0003 failed 1 times due to AM Container for appattempt_1524925490307_0003_000001 exited with  exitCode: 1
For more detailed output, check application tracking page:http://bigdata-01.vm-p.rdtex.ru:8088/proxy/application_1524925490307_0003/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1524925490307_0003_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
	at org.apache.hadoop.util.Shell.run(Shell.java:504)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.	APPID=application_1524925490307_0003

 

These are Hive 2.3.3 jobs running on the Spark 2.2 engine.

All 3 worker nodes are identical twins (Cloudera Express bundle 5.10).
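A NoSuchMethodError generally means a class was compiled against a different version of a library than the one loaded at runtime, so it may be worth verifying that the nodes really are identical at the jar level. Below is a minimal sketch of such a check, not a confirmed diagnosis. The function names `jar_checksums` and `diff_listings` are hypothetical helpers, and the directory to scan on each node is an assumption that depends on where the Spark and Hive jars are installed.

```python
import hashlib
from pathlib import Path


def jar_checksums(directory):
    """Map each *.jar path (relative to `directory`) to its SHA-256 hex digest."""
    root = Path(directory)
    sums = {}
    for jar in sorted(root.rglob("*.jar")):
        # Key by relative path so same-named jars in different subfolders stay distinct.
        sums[jar.relative_to(root).as_posix()] = hashlib.sha256(jar.read_bytes()).hexdigest()
    return sums


def diff_listings(reference, candidate):
    """Compare two checksum listings: a good node (reference) and the failing node (candidate)."""
    missing = sorted(set(reference) - set(candidate))  # on the good node only
    extra = sorted(set(candidate) - set(reference))    # on the failing node only
    changed = sorted(
        name
        for name in set(reference) & set(candidate)
        if reference[name] != candidate[name]          # same path, different content
    )
    return {"missing": missing, "extra": extra, "changed": changed}
```

The idea would be to run `jar_checksums` on a working node and on the failing node (for example, saving each result as JSON), then pass both listings to `diff_listings`. Any entry under `extra` or `changed` on the failing node would be a candidate for a conflicting Spark version on the classpath.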

 

Any ideas please?