
Replicated to 0 nodes instead of minReplication (HDFS)

Rising Star

Hi,

I am facing an issue while uploading files from the local file system to HDFS using the Java API method fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest")). After around 1000 files were uploaded, I got the following exception:

Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-1106394772-192.168.11.45-1476099033336:blk_1073743844_3020
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode 192.168.11.45:50010
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /50GBTest/50GBTtest/disk10544.doc could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1640)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3161)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3085)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:830)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
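
For reference, the upload code is essentially the following (a simplified sketch; only the copyFromLocalFile call comes from the description above, the class name and the Configuration/FileSystem setup shown here are assumed):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsUpload {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // assumes core-site.xml / hdfs-site.xml on the classpath
            FileSystem fs = FileSystem.get(conf);       // handle to the HDFS cluster
            // Copy the local directory (recursively) into HDFS
            fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest"));
            fs.close();
        }
    }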

I have checked that one datanode is running and has plenty of free space. I have intentionally kept the other datanode down as part of a scenario test.

Does anybody have any ideas, or has anyone faced this issue in the past?

8 REPLIES

Rising Star

Adding my datanode health status as a comment:

[hdfs@hadoop3ind1 root]$ hdfs dfsadmin -report
Configured Capacity: 315522809856 (293.85 GB)
Present Capacity: 288754439992 (268.92 GB)
DFS Remaining: 285925896192 (266.29 GB)
DFS Used: 2828543800 (2.63 GB)
DFS Used%: 0.98%
Under replicated blocks: 1475
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.11.47:50010 (hadoop3ind1.india)
Hostname: hadoop3ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1563369472 (1.46 GB)
Non DFS Used: 11485380608 (10.70 GB)
DFS Remaining: 144712654848 (134.77 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:06 IST 2016
Name: 192.168.11.45:50010 (hadoop1ind1.india)
Hostname: hadoop1ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1265174328 (1.18 GB)
Non DFS Used: 15282989256 (14.23 GB)
DFS Remaining: 141213241344 (131.52 GB)
DFS Used%: 0.80%
DFS Remaining%: 89.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:05 IST 2016

Rising Star

@David Streever: Can you please have a look at this? I have also attached my datanode details in the comment above.

I am stuck here. No more files are being uploaded to either datanode. I have also observed that if both datanodes are up and running, the process doesn't throw an exception, but if one of them goes down, this exception is thrown after some time. How can I resolve this?

Is it the case that if one datanode is not communicating with the other for a few minutes, the whole upload process stops after that time?

Super Guru
@Viraj Vekaria

Can you share a screenshot of the NameNode UI?

Also, can you check whether there is a value in this file, or in the "dfs.hosts.exclude" property in Ambari:

cat /etc/hadoop/conf/dfs.hosts.exclude

Rising Star

@Sagar Shimpi: Please have a look at the attached screenshot. On the second point, the value of dfs.hosts.exclude is as follows:

 <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/dfs.exclude</value>
    </property>

8422-screencapture-192-168-11-45-50070-dfshealth-html-1.png
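
If it helps, the effective value can also be checked from the same Java client (a small sketch, assuming the cluster's hdfs-site.xml is on the client classpath and the exclude file is readable locally; the class name is hypothetical):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.HdfsConfiguration;

    public class CheckExcludes {
        public static void main(String[] args) throws Exception {
            Configuration conf = new HdfsConfiguration();   // picks up hdfs-site.xml from the classpath
            String excludePath = conf.get("dfs.hosts.exclude", "");
            System.out.println("dfs.hosts.exclude = " + excludePath);
            if (!excludePath.isEmpty() && Files.exists(Paths.get(excludePath))) {
                // any hostname listed in this file is excluded from block placement
                Files.readAllLines(Paths.get(excludePath)).forEach(System.out::println);
            }
        }
    }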

Super Guru

Can you run:

$ cat /etc/hadoop/conf/dfs.exclude

Super Guru

Also, what is the size of the data you are uploading to HDFS? I see the free HDFS space is approximately 450 MB.

Rising Star

Point 1: I can't do cat /etc/hadoop/conf/dfs.exclude as that file doesn't exist; I am using Ambari to manage the HDFS cluster.

Point 2: The size is low now, but the issue was happening when HDFS usage was around 20 GB, and I am uploading files of at most 1 MB each.

Super Guru

You might need to enable Hadoop debug logging to get more visibility into the issue:

export HADOOP_ROOT_LOGGER=DEBUG,console

and run the job from the CLI to test.
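
Since the uploads here go through the Java API rather than the hadoop CLI, a programmatic alternative is to raise the HDFS client loggers to DEBUG before starting the copy (a sketch, assuming Hadoop's bundled log4j 1.x is on the client classpath; the logger names chosen are assumptions):

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;

    // Place at the start of main(), before the upload loop, so the
    // DataStreamer's block-placement decisions show up in the client log.
    Logger.getLogger("org.apache.hadoop.hdfs").setLevel(Level.DEBUG);
    Logger.getLogger("org.apache.hadoop.ipc").setLevel(Level.DEBUG);
    // ... then run the copyFromLocalFile call as before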