Created 10-10-2016 01:58 PM
Hi,
I am facing an issue while uploading files from the local file system to HDFS using the Java API call fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest")). After around 1000 files had been uploaded, I got the following exception:
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-1106394772-192.168.11.45-1476099033336:blk_1073743844_3020
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode 192.168.11.45:50010
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /50GBTest/50GBTtest/disk10544.doc could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1640)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3161)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3085)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:830)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
I have checked that one datanode is running and has plenty of free space. I have deliberately kept the other datanode down as part of a test scenario.
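For reference, my upload code is roughly like the sketch below (simplified; the NameNode URI is a placeholder for my cluster, and the paths are just the ones from this test):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUploadTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; normally picked up from core-site.xml on the classpath
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");

        FileSystem fs = FileSystem.get(conf);
        // Copies the local directory of small files into HDFS
        fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest"));
        fs.close();
    }
}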
Does anybody have any ideas, or has anyone faced this issue before?
Created 10-10-2016 02:34 PM
Adding my datanode health status as a comment:
[hdfs@hadoop3ind1 root]$ hdfs dfsadmin -report
Configured Capacity: 315522809856 (293.85 GB)
Present Capacity: 288754439992 (268.92 GB)
DFS Remaining: 285925896192 (266.29 GB)
DFS Used: 2828543800 (2.63 GB)
DFS Used%: 0.98%
Under replicated blocks: 1475
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.11.47:50010 (hadoop3ind1.india)
Hostname: hadoop3ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1563369472 (1.46 GB)
Non DFS Used: 11485380608 (10.70 GB)
DFS Remaining: 144712654848 (134.77 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:06 IST 2016

Name: 192.168.11.45:50010 (hadoop1ind1.india)
Hostname: hadoop1ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1265174328 (1.18 GB)
Non DFS Used: 15282989256 (14.23 GB)
DFS Remaining: 141213241344 (131.52 GB)
DFS Used%: 0.80%
DFS Remaining%: 89.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:05 IST 2016
Created 10-11-2016 06:10 AM
@David Streever: Can you please have a look at this? I have also attached my datanode details in the comment above.
I am stuck here. No more files are being uploaded to either datanode. I have also observed that if both datanodes are up and running, the process doesn't throw an exception, but if one of them goes down, then after some time this exception is thrown. How do I resolve this?
Is it the case that if one datanode is not communicating with the other for a few minutes, the whole upload process stops after that time?
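To clarify what I am asking: my understanding (which may be wrong) is that the DFS client remembers a datanode that failed a write in an exclusion cache for a while, so with one node down both datanodes can end up excluded. The sketch below shows the client-side properties I am experimenting with; the values are only guesses, not a confirmed fix:

import org.apache.hadoop.conf.Configuration;

public class ClientWriteTuning {
    public static Configuration buildConf() {
        Configuration conf = new Configuration();
        // With one of the two datanodes deliberately down, a replication factor
        // of 1 matches the number of nodes that can actually take a new block.
        conf.set("dfs.replication", "1");
        // A datanode that fails during a write stays in the client's exclusion
        // cache for this interval (the default is 10 minutes). Shortening it
        // should let the client retry that node sooner. 60000 ms is a guess.
        conf.setLong("dfs.client.write.exclude.nodes.cache.expiry.interval.millis", 60000L);
        return conf;
    }
}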
Created 10-11-2016 10:20 AM
Can you pass a screenshot of the NameNode UI?
Also, can you check whether there is a value in this file, or in the "dfs.hosts.exclude" property in Ambari:
cat /etc/hadoop/conf/dfs.hosts.exclude
Created on 10-11-2016 11:20 AM - edited 08-19-2019 03:21 AM
@Sagar Shimpi: Please have a look at the attached screenshot. On the 2nd point, the value of dfs.hosts.exclude is as follows:
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
Created 10-11-2016 12:32 PM
Can you run:
$ cat /etc/hadoop/conf/dfs.exclude
Created 10-11-2016 12:35 PM
Also, what is the size of the data you are uploading to HDFS? I see the free HDFS space is approximately 450 MB.
Created 10-11-2016 01:19 PM
Point 1: I can't run cat /etc/hadoop/conf/dfs.exclude as it's not a file; I am using Ambari to manage the HDFS cluster.
Point 2: The size is low now, but the issue was happening when the HDFS data size was around 20 GB and I was uploading files of at most 1 MB each.
Created 10-11-2016 04:19 PM
You might need to enable Hadoop debug mode to get more visibility into the issue:
export HADOOP_ROOT_LOGGER=DEBUG,console
and then run the job from the CLI and test.
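Since the upload is done through the Java API rather than the CLI, you could also raise the log level programmatically before starting the copy. A rough sketch, assuming the client is using log4j 1.x (the logging backend bundled with Hadoop 2.x):

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class DebugLogging {
    public static void enableHdfsDebug() {
        // Turn on DEBUG output for the HDFS client and RPC layers only,
        // to avoid flooding the log with unrelated messages.
        Logger.getLogger("org.apache.hadoop.hdfs").setLevel(Level.DEBUG);
        Logger.getLogger("org.apache.hadoop.ipc").setLevel(Level.DEBUG);
    }
}

Call this at the start of main(), before fs.copyFromLocalFile(...), and the DataStreamer retries and datanode exclusions should show up in much more detail in the client log.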