
access sandbox from spark - HDFS File access Error - Failed to connect to /


So I'm using the sandbox and trying to connect to HDFS from my local machine.

val shakespear = spark.sparkContext.textFile("hdfs://").map(println)

The error suggests, I think, that I can't reach the (Docker) DataNode from my local machine:

18/01/22 15:13:13 WARN BlockReaderFactory: I/O error constructing remote block reader. 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/]
	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(
	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(
	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(

I already added port 50010 to my Docker port mappings (following the instructions here; thanks @Roger Young, great walkthrough).
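Forwarding 50010 on its own may not be enough: the NameNode hands the client the DataNode's internal Docker address, and the block read then times out exactly as in the trace above. A common workaround (a sketch, assuming a client-side hdfs-site.xml; the property name is standard HDFS) is to make the client connect to DataNodes by hostname rather than by the IP the NameNode reports:

```xml
<!-- Client-side hdfs-site.xml (or set the same key programmatically). -->
<!-- With this set, the HDFS client dials the DataNode's hostname, which -->
<!-- you can point at the Docker-published ports, instead of the internal -->
<!-- container IP returned by the NameNode. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

The same setting can be applied from Spark before reading, e.g. `spark.sparkContext.hadoopConfiguration.set("dfs.client.use.datanode.hostname", "true")`.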

What else do I need to do to be able to access the data? I'm getting the feeling that I need to add network routing to address the Docker instance directly... In Ambari it does report that its address is

Should I be setting the FQDN of the Docker instance?
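On the FQDN point: whatever hostname the sandbox reports has to resolve from the host machine for hostname-based access to work. A minimal sketch, assuming the FQDN is `sandbox-hdp.hortonworks.com` (a guess; check what Ambari actually reports), demonstrated against a scratch copy of /etc/hosts rather than the real file:

```shell
# Hypothetical FQDN; replace with the hostname Ambari actually reports.
SANDBOX_FQDN="sandbox-hdp.hortonworks.com"

# Work on a scratch copy first; apply to the real /etc/hosts with sudo
# once you have verified the entry.
cp /etc/hosts ./hosts.tmp
printf '127.0.0.1\t%s\n' "$SANDBOX_FQDN" >> ./hosts.tmp

# The sandbox name now resolves to localhost, where the Docker-published
# ports (50010 etc.) are listening.
grep "$SANDBOX_FQDN" ./hosts.tmp
```

This pairs with hostname-based DataNode access: the client looks up the sandbox name, lands on localhost, and reaches the forwarded port instead of the unreachable container IP.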


What version of the Sandbox are you using? Are you following a particular tutorial?
