
Cannot read from HDFS in HDP Azure Sandbox

New Contributor

I've created a new VM for the HDP 2.5 Sandbox in Azure. Now I'm trying to read files from HDFS with a Java program running on my local PC, using the class org.apache.hadoop.hdfs.DFSClient. I can check that the file exists and create an InputStream, but as soon as I try to read from it, this exception occurs:

16:37:55.965 [pool-4-thread-1] WARN org.apache.hadoop.hdfs.BlockReaderFactory - I/O error constructing remote block reader. java.net.ConnectException: Connection timed out: no further information ...

16:37:55.975 [pool-4-thread-1] WARN org.apache.hadoop.hdfs.DFSClient - Failed to connect to /172.17.0.2:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information ...
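For reference, the read path I described can be sketched with the standard FileSystem API (a minimal sketch, not my exact program; the host name, port, and file path are placeholders taken from this thread, and dfs.client.use.datanode.hostname is set on the client's Configuration as an experiment):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // NameNode of the sandbox; 8020 is the default HDP RPC port.
        conf.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020");
        // Ask the client to contact datanodes by hostname instead of the
        // Docker-internal IP (172.17.0.2) that the NameNode reports.
        conf.set("dfs.client.use.datanode.hostname", "true");

        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/user/hdfs/example/hdfs/hello.csv"); // placeholder path
            if (fs.exists(file)) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        System.out.println(line);
                    }
                }
            }
        }
    }
}
```

The timeout happens on fs.open(file).read(), not on fs.exists(file): existence checks only talk to the NameNode, while reads have to open a socket to the datanode itself.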

In Azure I've created inbound/outbound rules for the VM on all ports. I've also added sandbox.hortonworks.com to etc\hosts on my local machine, as described. I've already checked out this similar thread: https://community.hortonworks.com/questions/49728/errors-in-gethdfsputhdfs-using-hdf-running-on-loca...

In particular, I've tried adding the following property to hdfs-site.xml:

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>

and also replacing all the 0.0.0.0:* entries for the datanodes in the same file with sandbox.hortonworks.com:*. But without success.
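For clarity, the datanode address entries I mean look like this after the replacement (ports shown are the HDP 2.5 defaults for the datanode data-transfer and HTTP endpoints; treat the exact values as an assumption for your own sandbox):

<property>
  <name>dfs.datanode.address</name>
  <value>sandbox.hortonworks.com:50010</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>sandbox.hortonworks.com:50075</value>
</property>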

Any idea?

Kind regards, Andreas

1 REPLY

New Contributor

The problem can also be reproduced with the Java program from here:

https://github.com/saagie/example-java-read-and-write-from-hdfs/blob/master/src/main/java/io/saagie/...

The hello.csv file is created, but it's empty. The exception I get is:

Jun 01, 2017 1:53:41 PM Test main
INFORMATION: Begin Write file into hdfs
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs/example/hdfs/hello.csv could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1641)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3198)

...
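The write path of that example can be sketched like this (a minimal sketch, assuming the same sandbox hostname as above; note that dfs.client.use.datanode.hostname is a client-side setting, so putting it only in the sandbox's hdfs-site.xml does not help a remote client unless that file is on the client's classpath):

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020");
        // Without this, the client tries the Docker-internal datanode address
        // (172.17.0.2:50010), times out, marks the node dead, and the write
        // fails with "could only be replicated to 0 nodes ... 1 node(s) are
        // excluded in this operation".
        conf.set("dfs.client.use.datanode.hostname", "true");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/user/hdfs/example/hdfs/hello.csv"))) {
            out.write("hello;world\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

That would also explain why hello.csv is created but empty: the create call only talks to the NameNode, while the actual bytes go to the datanode, which the client cannot reach.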

Has anyone else had this issue?
