Created on 03-23-2016 11:26 AM - edited 09-16-2022 03:10 AM
Hey guys,
I have the following scenario:
I have a machine running CDH 5.5, and everything is up and running. When I start spark-shell on that machine, everything works perfectly.
On my local machine, I downloaded Spark 1.5.0 pre-built for Hadoop 2.4 and later. What I did was:
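Roughly the following (a sketch, not the exact commands: yarn-client mode and the Spark install path are taken from the log output below, while the HADOOP_CONF_DIR value is an assumption):

# Sketch -- yarn-client mode and /opt/icarotech/spark150 are inferred from
# the log below; the HADOOP_CONF_DIR path is assumed, not known.
export HADOOP_CONF_DIR=/etc/hadoop/conf   # client copies of core-/hdfs-/yarn-site.xml
cd /opt/icarotech/spark150
./bin/pyspark --master yarn-client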
Then I get this error:
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/23 15:21:43 INFO SparkContext: Running Spark version 1.5.0
16/03/23 15:21:44 INFO SecurityManager: Changing view acls to: cloudera
16/03/23 15:21:44 INFO SecurityManager: Changing modify acls to: cloudera
16/03/23 15:21:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(cloudera); users with modify permissions: Set(cloudera)
16/03/23 15:21:44 INFO Slf4jLogger: Slf4jLogger started
16/03/23 15:21:45 INFO Remoting: Starting remoting
16/03/23 15:21:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.1.1.36:51463]
16/03/23 15:21:45 INFO Utils: Successfully started service 'sparkDriver' on port 51463.
16/03/23 15:21:45 INFO SparkEnv: Registering MapOutputTracker
16/03/23 15:21:45 INFO SparkEnv: Registering BlockManagerMaster
16/03/23 15:21:45 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e80c02a3-8c96-4a9c-8ca5-df47894e74db
16/03/23 15:21:45 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
16/03/23 15:21:45 INFO HttpFileServer: HTTP File server directory is /tmp/spark-17ec3a7c-5e8a-47b6-9b81-8a9804348adf/httpd-49f244bd-b0de-4530-8c46-9375c267b6eb
16/03/23 15:21:45 INFO HttpServer: Starting HTTP Server
16/03/23 15:21:45 INFO Utils: Successfully started service 'HTTP file server' on port 57142.
16/03/23 15:21:45 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/23 15:21:45 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/23 15:21:45 INFO SparkUI: Started SparkUI at http://10.1.1.36:4040
16/03/23 15:21:45 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/23 15:21:45 INFO RMProxy: Connecting to ResourceManager at /10.1.8.109:8032
16/03/23 15:21:46 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/03/23 15:21:46 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2816 MB per container)
16/03/23 15:21:46 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/03/23 15:21:46 INFO Client: Setting up container launch context for our AM
16/03/23 15:21:46 INFO Client: Setting up the launch environment for our AM container
16/03/23 15:21:46 INFO Client: Preparing resources for our AM container
16/03/23 15:21:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/23 15:21:46 INFO Client: Uploading resource file:/opt/icarotech/spark150/lib/spark-assembly-1.5.0-hadoop2.4.0.jar -> hdfs://10.1.8.109:8020/user/cloudera/.sparkStaging/application_1458754318119_0005/spark-assembly-1.5.0-hadoop2.4.0.jar
16/03/23 15:21:46 INFO DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1516)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1272)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
16/03/23 15:21:46 INFO DFSClient: Abandoning BP-2143973830-127.0.0.1-1458655339361:blk_1073743845_3021
16/03/23 15:21:46 INFO DFSClient: Excluding datanode 127.0.0.1:50010
16/03/23 15:21:46 WARN DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/.sparkStaging/application_1458754318119_0005/spark-assembly-1.5.0-hadoop2.4.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1557)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:676)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at org.apache.hadoop.ipc.Client.call(Client.java:1410)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
16/03/23 15:21:46 INFO Client: Deleting staging directory .sparkStaging/application_1458754318119_0005
16/03/23 15:21:46 ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/.sparkStaging/application_1458754318119_0005/spark-assembly-1.5.0-hadoop2.4.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1557)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:676)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at org.apache.hadoop.ipc.Client.call(Client.java:1410)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
16/03/23 15:21:46 INFO SparkUI: Stopped Spark web UI at http://10.1.1.36:4040
16/03/23 15:21:46 INFO DAGScheduler: Stopping DAGScheduler
16/03/23 15:21:46 INFO YarnClientSchedulerBackend: Stopped
16/03/23 15:21:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/23 15:21:46 ERROR Utils: Uncaught exception in thread Thread-2
java.lang.NullPointerException
    at org.apache.spark.network.netty.NettyBlockTransferService.close(NettyBlockTransferService.scala:152)
    at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1228)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:100)
    at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1740)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1185)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1739)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
16/03/23 15:21:46 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "/opt/icarotech/spark150/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(pyFiles=add_files)
  File "/opt/icarotech/spark150/python/pyspark/context.py", line 113, in __init__
    conf, jsc, profiler_cls)
  File "/opt/icarotech/spark150/python/pyspark/context.py", line 170, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/opt/icarotech/spark150/python/pyspark/context.py", line 224, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/opt/icarotech/spark150/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
  File "/opt/icarotech/spark150/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/.sparkStaging/application_1458754318119_0005/spark-assembly-1.5.0-hadoop2.4.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1557)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:676)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at org.apache.hadoop.ipc.Client.call(Client.java:1410)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
16/03/23 15:21:48 INFO DiskBlockManager: Shutdown hook called
16/03/23 15:21:48 INFO ShutdownHookManager: Shutdown hook called
16/03/23 15:21:48 INFO ShutdownHookManager: Deleting directory /tmp/spark-17ec3a7c-5e8a-47b6-9b81-8a9804348adf
The same error happens if I try to use spark-shell.
I thought it could be a network problem, because of this line:
16/03/23 15:21:46 INFO DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused
However, I don't have firewalls on either machine, and they can see each other on the network.
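For what it's worth, two quick checks of whether the datanode is reachable at the address the namenode advertises (the host here is an assumption; 50010 is the default datanode transfer port in Hadoop 2.x):

# From the client machine: is the datanode reachable where the namenode
# says it lives? (10.1.8.109 assumed; the log shows 127.0.0.1 being excluded.)
nc -vz 10.1.8.109 50010
# On the cluster: what address did each datanode actually register with?
hdfs dfsadmin -report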
Any idea what could be going wrong?
Thanks in advance 🙂
Created on 03-24-2016 11:08 PM - edited 03-24-2016 11:12 PM
Hi,
Did you put the cluster's hdfs-site.xml in your HADOOP_CONF_DIR?
Your log says the datanode is being reported at localhost:
16/03/23 15:21:46 INFO DFSClient: Excluding datanode 127.0.0.1:50010
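If the client configuration is in place but the datanode registered itself as 127.0.0.1, one common workaround is to make the HDFS client connect to datanodes by hostname instead of by the IP the namenode hands back. A minimal client-side hdfs-site.xml sketch (the property exists in Hadoop 2.x; whether it fixes this particular cluster is an assumption):

<!-- Client-side hdfs-site.xml, placed under HADOOP_CONF_DIR.            -->
<!-- dfs.client.use.datanode.hostname makes the DFSClient dial the       -->
<!-- datanode's hostname rather than the (here, loopback) IP address     -->
<!-- that the namenode reports.                                          -->
<configuration>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>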
Separately, CDH 5.5's Hadoop roughly corresponds to Hadoop 2.6, so building Spark with the Hadoop 2.6 option might be better than the Hadoop 2.4 build.
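For reference, the Maven invocation from the Spark 1.5 "Building Spark" docs for a Hadoop 2.6 build looks roughly like this (verify the profile names against your checkout):

# Build Spark against Hadoop 2.6 with YARN support;
# -DskipTests just speeds up the build.
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package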
http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_vd_cdh_package_tarball.html
Created 04-08-2016 07:14 AM
It turned out I wasn't getting all of the configuration files I needed. I was able to download them from Cloudera Manager (Hive > Actions > Download Client Configuration) and unzip them into HADOOP_CONF_DIR.
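In shell terms the fix looks roughly like this (the zip and directory names are assumptions; Cloudera Manager names the archive after the service):

# Hypothetical paths: unzip the client configuration downloaded from
# Cloudera Manager (Hive > Actions > Download Client Configuration)
# and point HADOOP_CONF_DIR at the directory holding the *-site.xml files.
mkdir -p ~/cluster-conf
unzip hive-clientconfig.zip -d ~/cluster-conf
export HADOOP_CONF_DIR=~/cluster-conf/hive-conf
/opt/icarotech/spark150/bin/pyspark --master yarn-client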