Support Questions

Find answers, ask questions, and share your expertise

cannot start spark session

avatar
Explorer

Hello everyone:

When I am trying to start a scala session it gets stuck on 'Scala session (Base Image v6) starting...'

But I can reach the terminal and /tmp/spark-driver.log says

WARN ui.JettyUtils: GET /jobs/ failed: java.util.NoSuchElementException
java.util.NoSuchElementException

 

Additionly, when I try to run pyspark program, it get stucked  and spark ui tells the same mistake.

Do anyone know what happened?

Thanks a lot!

5 REPLIES 5

avatar
Expert Contributor

Hello @ursash

 

Thanks for posting your query.

 

Could you please share the complete exception trace from the logs which you see

Thanks,
Satz

avatar
Explorer

Hi, I didn't find how to upload an attachment. So  I copied the error message here.

I can run spark2 demo on other nodes in the cluster. 
But when I run spark2 demo inside the docker terminal, it get stucked. And I can read and write hdfs file in the docker terminal.

 

18/12/17 09:57:13 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm138 after 1 fail over attempts. Trying to fail over immediately.
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM; Host Details : local host is: "cmzr7xobj64n5n12/100.66.0.11"; destination host is: "master02.cdh.cdhtest.com":8032;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy21.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy22.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:483)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:159)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:159)
at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:61)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:158)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at org.apache.toree.kernel.api.Kernel.createSparkContext(Kernel.scala:354)
at org.apache.toree.kernel.api.Kernel.createSparkContext(Kernel.scala:373)
at org.apache.toree.boot.layer.StandardComponentInitialization$class.initializeSparkContext(ComponentInitialization.scala:103)
at org.apache.toree.Main$$anon$1.initializeSparkContext(Main.scala:35)
at org.apache.toree.boot.layer.StandardComponentInitialization$class.initializeComponents(ComponentInitialization.scala:86)
at org.apache.toree.Main$$anon$1.initializeComponents(Main.scala:35)
at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:100)
at org.apache.toree.Main$.delayedEndpoint$org$apache$toree$Main$1(Main.scala:40)
at org.apache.toree.Main$delayedInit$body.apply(Main.scala:24)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at org.apache.toree.Main$.main(Main.scala:24)
at org.apache.toree.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:681)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:769)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 53 more
Caused by: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:335)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:231)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:159)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:594)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:396)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:761)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
... 56 more
18/12/17 09:57:13 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm88
18/12/17 09:57:13 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm88 after 2 fail over attempts. Trying to fail over after sleeping for 549ms.
java.net.ConnectException: Call From cmzr7xobj64n5n12/100.66.0.11 to master01.cdh.cdhtest.com:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy21.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy22.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:483)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:159)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:159)
at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:61)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:158)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at org.apache.toree.kernel.api.Kernel.createSparkContext(Kernel.scala:354)
at org.apache.toree.kernel.api.Kernel.createSparkContext(Kernel.scala:373)
at org.apache.toree.boot.layer.StandardComponentInitialization$class.initializeSparkContext(ComponentInitialization.scala:103)
at org.apache.toree.Main$$anon$1.initializeSparkContext(Main.scala:35)
at org.apache.toree.boot.layer.StandardComponentInitialization$class.initializeComponents(ComponentInitialization.scala:86)
at org.apache.toree.Main$$anon$1.initializeComponents(Main.scala:35)
at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:100)
at org.apache.toree.Main$.delayedEndpoint$org$apache$toree$Main$1(Main.scala:40)
at org.apache.toree.Main$delayedInit$body.apply(Main.scala:24)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at org.apache.toree.Main$.main(Main.scala:24)
at org.apache.toree.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 53 more
18/12/17 09:57:13 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm138
18/12/17 09:57:13 WARN ipc.Client: Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM
18/12/17 09:57:13 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs@CDHTEST.COM (auth:KERBEROS) cause:java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM
18/12/17 09:57:13 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm138 after 3 fail over attempts. Trying to fail over immediately.
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/master02.cdh.cdhtest.com@CDHTEST.COM, expecting: yarn/10.1.1.3@CDHTEST.COM; Host Details : local host is: "cmzr7xobj64n5n12/100.66.0.11"; destination host is: "master02.cdh.cdhtest.com":8032;

avatar
Explorer

hi, satz

I found the reason just now.

It's because we use openstack as our platform  and the dns server ip starts with 100, like 100.x.x.x

I think it may conflict with cdsw k8s.

I changed the dns to local nodes using dnsmasq and everthing is fine now. 

avatar
Expert Contributor

Hello @ursash

 

Glad to know your issue got relsoved !

 

Yes, the API calls "GET /jobs/ failed" might be suffered because of your conflict

 

Thanks,
Satz

avatar
New Contributor

Hi,

 

How did you change the dns to local nodes using dnsmasq?

 

Could you please provide me the step by step. I have the same issue on my machine.

 

Thanks in advance!