[CDH 5.5 VirtualBox] unable to connect to Spark Master/Worker

New Contributor

Hi *,

I'm using the newest CDH 5.5 VirtualBox image; so far I have only installed Scala, sbt, NetBeans, and Dropbox.

I encounter a problem when running a Spark application (see the error below). In Cloudera Manager every service, including Spark, shows green, but when I click on the Spark bookmarks I am unable to connect to the Spark Master/Worker; the History Server works, as does every other bookmark.

The Spark application I run was built against a previous CDH 5.x release, and I only changed the master to local (I do not yet have access to the cluster at my university). I would be happy about some help.
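For reference, a local-master launch of this application would look roughly like the following. This is a sketch rather than the exact command used: the main class and jar path are taken from the log below, everything else is assumed.

# Sketch: local-master submission (class and jar path from the log below; other flags assumed)
spark-submit \
  --class runDriver \
  --master "local[*]" \
  /home/cloudera/workspace/S2RDF_DataSetCreator/target/scala-2.10/datasetcreator_2.10-1.0.jar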

 

Best Regards,

Tobias

 

15/11/27 09:30:43 INFO SparkContext: Running Spark version 1.5.0-cdh5.5.0
15/11/27 09:30:43 WARN Utils: Your hostname, quickstart.cloudera resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface eth1)
15/11/27 09:30:43 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/11/27 09:30:44 INFO SecurityManager: Changing view acls to: cloudera
15/11/27 09:30:44 INFO SecurityManager: Changing modify acls to: cloudera
15/11/27 09:30:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cloudera); users with modify permissions: Set(cloudera)
15/11/27 09:30:45 INFO Slf4jLogger: Slf4jLogger started
15/11/27 09:30:45 INFO Remoting: Starting remoting
15/11/27 09:30:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.2.15:51812]
15/11/27 09:30:45 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@10.0.2.15:51812]
15/11/27 09:30:45 INFO Utils: Successfully started service 'sparkDriver' on port 51812.
15/11/27 09:30:45 INFO SparkEnv: Registering MapOutputTracker
15/11/27 09:30:45 INFO SparkEnv: Registering BlockManagerMaster
15/11/27 09:30:45 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e2a8a00b-9bac-4564-a75d-cf2bd864e0be
15/11/27 09:30:45 INFO MemoryStore: MemoryStore started with capacity 883.6 MB
15/11/27 09:30:45 INFO HttpFileServer: HTTP File server directory is /tmp/spark-3e2ba8c0-103d-416a-9963-427825993dfc/httpd-66ccf5bb-e5e7-46ad-9541-5f6438a753cc
15/11/27 09:30:45 INFO HttpServer: Starting HTTP Server
15/11/27 09:30:45 INFO Server: jetty-8.y.z-SNAPSHOT
15/11/27 09:30:46 INFO AbstractConnector: Started SocketConnector@0.0.0.0:36447
15/11/27 09:30:46 INFO Utils: Successfully started service 'HTTP file server' on port 36447.
15/11/27 09:30:46 INFO SparkEnv: Registering OutputCommitCoordinator
15/11/27 09:30:46 INFO Server: jetty-8.y.z-SNAPSHOT
15/11/27 09:30:46 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/11/27 09:30:46 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/11/27 09:30:46 INFO SparkUI: Started SparkUI at http://10.0.2.15:4040
15/11/27 09:30:46 INFO SparkContext: Added JAR file:/home/cloudera/workspace/./S2RDF_DataSetCreator/target/scala-2.10/datasetcreator_2.10-1.0.jar at http://10.0.2.15:36447/jars/datasetcreator_2.10-1.0.jar with timestamp 1448613046392
15/11/27 09:30:46 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/11/27 09:30:46 INFO Executor: Starting executor ID driver on host localhost
15/11/27 09:30:46 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34264.
15/11/27 09:30:46 INFO NettyBlockTransferService: Server created on 34264
15/11/27 09:30:46 INFO BlockManager: external shuffle service port = 7337
15/11/27 09:30:46 INFO BlockManagerMaster: Trying to register BlockManager
15/11/27 09:30:46 INFO BlockManagerMasterEndpoint: Registering block manager localhost:34264 with 883.6 MB RAM, BlockManagerId(driver, localhost, 34264)
15/11/27 09:30:46 INFO BlockManagerMaster: Registered BlockManager
15/11/27 09:30:47 ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:145)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6609)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6591)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6543)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2756)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2674)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2559)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:592)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1872)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1737)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1662)
	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:404)
	at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
	at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:121)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
	at dataCreator.Settings$.loadSparkContext(Settings.scala:69)
	at dataCreator.Settings$.<init>(Settings.scala:17)
	at dataCreator.Settings$.<clinit>(Settings.scala)
	at runDriver$.main(runDriver.scala:12)
	at runDriver.main(runDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
	at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:145)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6609)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6591)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6543)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2756)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2674)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2559)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:592)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1403)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy18.create(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy19.create(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1867)
	... 27 more
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
15/11/27 09:30:47 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
15/11/27 09:30:48 INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
15/11/27 09:30:48 INFO DAGScheduler: Stopping DAGScheduler
15/11/27 09:30:48 WARN QueuedThreadPool: 2 threads could not be stopped
15/11/27 09:30:48 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/11/27 09:30:48 INFO MemoryStore: MemoryStore cleared
15/11/27 09:30:48 INFO BlockManager: BlockManager stopped
15/11/27 09:30:48 INFO BlockManagerMaster: BlockManagerMaster stopped
15/11/27 09:30:48 INFO SparkContext: Successfully stopped SparkContext

 

2 ACCEPTED SOLUTIONS

Guru

The bookmark needs to be removed, I'm afraid: as of CDH 5.5, only Spark-on-YARN is supported, so there is no Spark Master to run. When running the spark-shell, for instance, one specifies 'yarn-client' as the master.
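For example, launching the shell against YARN looks roughly like this (a minimal sketch; it assumes spark-shell is on the PATH inside the VM):

# Run the shell on YARN instead of a standalone Spark Master
spark-shell --master yarn-client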


New Contributor

You will have problems starting spark-shell with any master, because there are permission issues on the HDFS directories that Spark uses.

 

To fix it, do the following:

 

# Open up the Spark HDFS directories so the local 'cloudera' user can write the application history
sudo -u hdfs hadoop fs -chmod 777 /user/spark
sudo -u spark hadoop fs -chmod 777 /user/spark/applicationHistory
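As an optional sanity check, listing the directory afterwards should show the relaxed permissions on applicationHistory:

# Optional: verify the new permissions (should show drwxrwxrwx on applicationHistory)
sudo -u hdfs hadoop fs -ls /user/spark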

 

That should do it 🙂


3 REPLIES


Contributor

For me it was solved after logging in as the hdfs user. You can then typically run the spark-shell command without errors.
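For example, one way to do this on the QuickStart VM might be (a sketch, not a verified recipe):

# Run the shell as the hdfs user on YARN (assumes passwordless sudo on the VM)
sudo -u hdfs spark-shell --master yarn-client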