Created on 01-29-2016 11:07 AM - edited 08-19-2019 03:48 AM
Hi,
I have set up Spark 1.5 through Ambari on HDP 2.3.4. Ambari reports that all the services on HDP are running fine, including Spark. I have installed Zeppelin and am trying to use the notebook as documented in the Zeppelin tech preview.
However, when I run sc.version, I see the following error:
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:119)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Let me know what could be causing this. Could it be a configuration issue? Attached is a screenshot of the interpreter configuration.
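In case it helps with the diagnosis, the YARN side usually records why an application master ended. A minimal check with the standard YARN CLI (assuming log aggregation is enabled; the application ID placeholder is whatever ID appears in the Zeppelin interpreter log) would be:

# List recently failed/killed YARN applications:
yarn application -list -appStates FAILED,KILLED

# Fetch the aggregated logs for the suspect application:
yarn logs -applicationId <application_id>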
Created 02-02-2016 08:02 AM
@Artem Ervits This is still an outstanding issue with 1.5.2. No workaround has been found. However, I have now upgraded to Spark 1.6 integrated with Zeppelin, which works fine.
Created 01-29-2016 01:33 PM
Spark has been installed from Ambari. However, Zeppelin was installed from the RPM package, Tech Preview version 0.6.0.
Created 01-29-2016 01:37 PM
@vbhoomireddy You can follow the above link that I shared for the port change. Change the port to 9995.
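For reference, a minimal sketch of the change, assuming a default Zeppelin layout under $ZEPPELIN_HOME:

# In $ZEPPELIN_HOME/conf/zeppelin-env.sh:
export ZEPPELIN_PORT=9995

# Restart Zeppelin so the new port takes effect:
$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart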
Created 01-29-2016 01:40 PM
zeppelin-env.sh already contains export ZEPPELIN_PORT=9995. Container creation is failing
Created 01-29-2016 01:45 PM
@vbhoomireddy Just to verify, what's the output of:
netstat -anp | grep 9995
Created on 01-29-2016 01:54 PM - edited 08-19-2019 03:48 AM
tcp 0 0 0.0.0.0:9995 0.0.0.0:* LISTEN 26340/java
tcp 0 0 10.87.139.168:9995 10.25.35.165:55024 ESTABLISHED 26340/java
Please see the attempt log below. Could it be because of this line?
Unknown/unsupported param List(--num-executors, 2)
Created 01-29-2016 01:57 PM
@vbhoomireddy New issue .. https://issues.apache.org/jira/browse/SPARK-5005
Thread is getting long 🙂
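If it is that issue, one possible workaround (an untested sketch, assuming the Spark interpreter picks up extra properties from SPARK_SUBMIT_OPTIONS in zeppelin-env.sh) is to drop the flag the launcher rejects and pass the equivalent Spark property instead:

# In zeppelin-env.sh, replace the rejected --num-executors flag
# with the equivalent Spark property:
export SPARK_SUBMIT_OPTIONS="--conf spark.executor.instances=2"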
Created 01-29-2016 02:26 PM
@Neeraj Sabharwal I was thinking about the same 😉
It works fine when I run SparkPi directly from the CLI, without going through Zeppelin.
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
Also, the issue has been closed with the status "Cannot Reproduce", so I am not sure what options I have now to get Zeppelin going. Any ideas?
Created 01-29-2016 02:40 PM
@vbhoomireddy What do you mean by "Cannot Reproduce"? Did someone close this thread?
The only suggestion I have is to recheck your settings, then delete the interpreter and recreate it by following the blog that I shared.
Created 01-29-2016 12:46 PM
Looking further at the Zeppelin logs below, my understanding is that when Zeppelin tried to create a Spark context, Spark tried to use port 4040 for the Spark context's web UI. However, as that port was already in use on the machine, Spark couldn't launch the context.
WARN [2016-01-29 12:34:27,375] ({pool-2-thread-3} AbstractLifeCycle.java[setFailed]:204) - FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.spark-project.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
    at org.spark-project.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
    at org.spark-project.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
    at org.spark-project.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
    at org.spark-project.jetty.server.Server.doStart(Server.java:293)
    at org.spark-project.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
    at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:228)
    at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:238)
    at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:238)
    at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
    at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:238)
    at org.apache.spark.ui.WebUI.bind(WebUI.scala:117)
    at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:448)
    at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:448)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:448)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
Any idea how to make Zeppelin use another port instead of 4040? I believe Spark has a mechanism to try 4041, 4042, etc. when two shells are running on the same machine and compete for the same port. However, does Zeppelin do the same?
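If not, I assume the port could be pinned explicitly. As a rough sketch, assuming the Zeppelin Spark interpreter passes extra properties through SPARK_SUBMIT_OPTIONS in zeppelin-env.sh (4050 below is just an example of a free port):

# Pin the Spark UI to a known-free port instead of the default 4040:
export SPARK_SUBMIT_OPTIONS="--conf spark.ui.port=4050"

Presumably the same spark.ui.port property could also be set directly in the interpreter settings in the Zeppelin UI.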
Created 02-02-2016 02:25 AM
@vbhoomireddy can you accept the best answer to close this thread or provide your own solution?