Error when running a Spark job as a different zeppelin user with livy

Rising Star

On HDP 2.6, I am trying to run the following paragraph as user2/user2 from a Zeppelin notebook (this is running in yarn-cluster mode):

%livy2.spark 
sc.version

It hangs for a while, times out, and gives me the following Java stack trace:

org.apache.zeppelin.livy.LivyException: Session 60 is finished, appId: null, log: [java.lang.Exception: No YARN application is found with tag livy-session-60-zahglq2y in 60 seconds. Please check your cluster status, it is may be very busy., com.cloudera.livy.utils.SparkYarnApp.com$cloudera$livy$utils$SparkYarnApp$$getAppIdFromTag(SparkYarnApp.scala:182) com.cloudera.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:248) com.cloudera.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:245) scala.Option.getOrElse(Option.scala:120) com.cloudera.livy.utils.SparkYarnApp$$anonfun$1.apply$mcV$sp(SparkYarnApp.scala:245) com.cloudera.livy.Utils$$anon$1.run(Utils.scala:95)]
	at org.apache.zeppelin.livy.BaseLivyInterprereter.createSession(BaseLivyInterprereter.java:209)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.initLivySession(BaseLivyInterprereter.java:98)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.open(BaseLivyInterprereter.java:80)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:482)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

From YARN logs, I just see this logged and nothing else from these unsuccessful attempts:

2017-05-30 17:22:44,115 INFO  resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(291)) - Allocated new applicationId: 32
2017-05-30 17:28:55,804 INFO  resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(291)) - Allocated new applicationId: 33

The same notebook works perfectly fine as user 'admin'; the issue only appears when switching users. There are plenty of resources available on YARN. Any suggestions on what is wrong?

1 ACCEPTED SOLUTION

7 REPLIES

Guru

@zhoussen, as per the Livy logs, the Spark application was not started correctly. To find the root cause, please check the Spark application logs.

Steps to follow:

1) Check the status of the YARN cluster (list the running applications).

2) Run the Livy paragraph as user2.

3) Check whether a new application is launched in YARN. If so, check its status and application log for further debugging.
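
The steps above could be sketched on the command line roughly as follows. This is only a sketch: it assumes the yarn CLI is available on a cluster node and that the ResourceManager REST API listens on its default port 8088. The helper names extract_livy_tag and apps_by_tag_url are hypothetical, and <applicationId> is a placeholder for whatever ID YARN reports.

```shell
#!/bin/sh
# Hypothetical helper: pull the Livy session tag out of an error message
# such as the one posted in the question.
extract_livy_tag() {
  printf '%s\n' "$1" | grep -o 'livy-session-[0-9]*-[a-z0-9]*' | head -n 1
}

# Hypothetical helper: build a ResourceManager REST URL that filters
# applications by tag (assumes the default RM web port 8088).
apps_by_tag_url() {
  printf 'http://%s:8088/ws/v1/cluster/apps?applicationTags=%s\n' "$1" "$2"
}

# Only talk to YARN when the CLI is actually installed.
if command -v yarn >/dev/null 2>&1; then
  # Step 1: list applications in every state, not only the running ones.
  yarn application -list -appStates ALL
  # Step 3: once an application ID shows up after running the paragraph
  # as user2, fetch its logs:
  #   yarn logs -applicationId <applicationId> -appOwner user2
fi
```

If no application ever appears for the session tag, the submission probably failed before YARN accepted it, which points at the Livy server log rather than the YARN application log.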

Rising Star
@yvora

That doesn't help. The YARN cluster is healthy and doesn't even show this application in a failed state. The application log doesn't contain any more helpful messages.

Guru

@zhoussen, if the application with the tag "livy-session-60-zahglq2y" is alive and running fine, you need to raise the Livy app lookup timeout above 60 seconds. It seems Livy believes the YARN application did not start within 60 seconds.

Set livy.server.yarn.app-lookup-timeout to something like 300 seconds.
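
As a rough illustration, the setting would look something like this in livy.conf. This assumes the value takes a time string with a unit suffix, as other Livy timeouts do. On HDP the file is typically managed through Ambari rather than edited by hand, and the Livy server needs a restart to pick up the change.

```
# Assumed format: a duration with a unit suffix. 300s is only a suggestion.
livy.server.yarn.app-lookup-timeout = 300s
```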

Rising Star
@yvora

Thanks for your answer. It made me look back at the entire flow.

Super Collaborator

It looks like the owner of /user/user1 is hdfs, but it should be user1. I'm not sure how the folder /user/user1 was created. If you are an admin, please change the owner; otherwise, ask your admin to do that.
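
One possible way to check and correct the ownership from the command line is sketched below. It assumes the hdfs CLI is available and that the fix is run by the HDFS superuser (commonly the hdfs account). The helper chown_cmd is hypothetical and only prints the command, so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the chown command for a user's
# HDFS home directory, so it can be reviewed before execution.
chown_cmd() {
  printf 'hdfs dfs -chown %s /user/%s\n' "$1" "$1"
}

# Only query HDFS when the CLI is actually installed.
if command -v hdfs >/dev/null 2>&1; then
  # Show the current owner of each home directory under /user.
  hdfs dfs -ls /user
fi

# Print the fix for user1; run the printed command as the HDFS superuser,
# for example via sudo -u hdfs.
chown_cmd user1
```

Printing the command instead of running it keeps the sketch safe to try on a machine without superuser rights.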

Rising Star

@bkv

Check the YARN logs. The job could be starved of YARN containers, so you may need to adjust some YARN container settings. Also, please post your problem as a separate new issue rather than as an answer to this one.