Support Questions

viruvekariya · ‎10-10-2016

Hi,

I am facing an issue while uploading files from the local file system to HDFS using the Java API call fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest")); After around 1000 files had been uploaded, I got the following exception:

Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-1106394772-192.168.11.45-1476099033336:blk_1073743844_3020
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode 192.168.11.45:50010
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /50GBTest/50GBTtest/disk10544.doc could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1640)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3161)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3085)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:830)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)

I have checked that one datanode is running and has plenty of free space. I have kept the other datanode down for a test scenario.

Does anybody have any ideas, or has anyone faced this issue in the past?

viruvekariya · ‎10-10-2016

Adding my datanode health status as a comment:

[hdfs@hadoop3ind1 root]$ hdfs dfsadmin -report
Configured Capacity: 315522809856 (293.85 GB)
Present Capacity: 288754439992 (268.92 GB)
DFS Remaining: 285925896192 (266.29 GB)
DFS Used: 2828543800 (2.63 GB)
DFS Used%: 0.98%
Under replicated blocks: 1475
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.11.47:50010 (hadoop3ind1.india)
Hostname: hadoop3ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1563369472 (1.46 GB)
Non DFS Used: 11485380608 (10.70 GB)
DFS Remaining: 144712654848 (134.77 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:06 IST 2016
Name: 192.168.11.45:50010 (hadoop1ind1.india)
Hostname: hadoop1ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1265174328 (1.18 GB)
Non DFS Used: 15282989256 (14.23 GB)
DFS Remaining: 141213241344 (131.52 GB)
DFS Used%: 0.80%
DFS Remaining%: 89.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:05 IST 2016
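
When comparing reports taken at different times, it can help to pull out only the per-datanode fields that matter for block placement. The following is a minimal sketch, assuming the report text has the same layout as the output above. DfsReportSummary and its Node class are hypothetical names used for illustration and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: summarises the text printed by
// `hdfs dfsadmin -report`, assuming the layout shown in this thread.
class DfsReportSummary {

    // One entry per "Name:" section of the report.
    static final class Node {
        final String name;
        long dfsRemainingBytes = -1; // -1 means the field was not found
        int xceivers = -1;

        Node(String name) {
            this.name = name;
        }
    }

    private static final Pattern NAME = Pattern.compile("^Name:\\s*(\\S+)");
    // Matches "DFS Remaining: <bytes>" but not "DFS Remaining%: ...".
    private static final Pattern REMAINING = Pattern.compile("^DFS Remaining:\\s*(\\d+)");
    private static final Pattern XCEIVERS = Pattern.compile("^Xceivers:\\s*(\\d+)");

    static List<Node> parse(String report) {
        List<Node> nodes = new ArrayList<>();
        Node current = null;
        for (String raw : report.split("\\R")) {
            String line = raw.trim();
            Matcher m = NAME.matcher(line);
            if (m.find()) {
                current = new Node(m.group(1));
                nodes.add(current);
                continue;
            }
            if (current == null) {
                continue; // cluster-wide summary lines before the first datanode
            }
            m = REMAINING.matcher(line);
            if (m.find()) {
                current.dfsRemainingBytes = Long.parseLong(m.group(1));
                continue;
            }
            m = XCEIVERS.matcher(line);
            if (m.find()) {
                current.xceivers = Integer.parseInt(m.group(1));
            }
        }
        return nodes;
    }
}
```

Calling DfsReportSummary.parse on the report above should give two entries, one for 192.168.11.47:50010 and one for 192.168.11.45:50010, each with its remaining bytes and Xceivers count.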

viruvekariya · ‎10-11-2016

@David Streever: Can you please have a look at this? I have also attached my datanode details in the comment above.

I am stuck here. No more files are being uploaded to either datanode. I have also observed that if both datanodes are up and running, the process doesn't throw an exception, but if one of them goes down, this exception is thrown after some time. How can I resolve this?

Is it the case that if one datanode stops communicating with the other for a few minutes, the whole upload process stops after that time?

sshimpi · ‎10-11-2016

@Viraj Vekaria

Can you share a screenshot of the NameNode UI?

Also, can you check in Ambari whether the property "dfs.hosts.exclude" has a value, and whether this file has any content?

cat /etc/hadoop/conf/dfs.hosts.exclude

viruvekariya · ‎10-11-2016

@Sagar Shimpi: Please have a look at the attached screenshot. As for the second point, the value of dfs.hosts.exclude is as follows:

<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>

8422-screencapture-192-168-11-45-50070-dfshealth-html-1.png

sshimpi · ‎10-11-2016

Can you run the following?

$cat /etc/hadoop/conf/dfs.exclude
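
The same check can be done from Java, which also separates a missing file from an empty one. This is a minimal sketch: ExcludeFileCheck is a hypothetical helper, not part of Hadoop, and it assumes one host per line. The default path is the value of dfs.hosts.exclude quoted in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports whether the exclude file
// exists and which hosts it lists (assumption: one host per line).
class ExcludeFileCheck {

    // Returns the non-blank lines of the file, or an empty list if the
    // path does not point to a regular file.
    static List<String> excludedHosts(Path file) {
        List<String> hosts = new ArrayList<>();
        if (!Files.isRegularFile(file)) {
            return hosts;
        }
        try {
            for (String line : Files.readAllLines(file)) {
                String host = line.trim();
                if (!host.isEmpty()) {
                    hosts.add(host);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Path taken from the dfs.hosts.exclude value in this thread.
        Path file = Paths.get(args.length > 0 ? args[0] : "/etc/hadoop/conf/dfs.exclude");
        System.out.println(file + (Files.isRegularFile(file) ? " exists" : " is missing"));
        System.out.println("Excluded hosts: " + excludedHosts(file));
    }
}
```

An empty list with an existing file would mean that no datanode is excluded through this mechanism. A missing file means the property points at a path that was never created on that host.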

sshimpi · ‎10-11-2016

Also, what is the size of the data you are uploading to HDFS? I see the free HDFS space is approximately 450 MB.

viruvekariya · ‎10-11-2016

Point 1: I can't run cat /etc/hadoop/conf/dfs.exclude because it is not a file; I am using Ambari to manage the HDFS cluster.

Point 2: The size is low now, but the issue was also happening when the HDFS size was around 20 GB, and I am uploading files of at most 1 MB each.

sshimpi · ‎10-11-2016

You might need to enable Hadoop debug logging to get more visibility into the issue:

export HADOOP_ROOT_LOGGER=DEBUG,console

Then run the job from the CLI and test.
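
The setting above targets commands launched from the CLI. For the Java API client in the original question, the timestamp format of the posted log ("Oct 10, 2016 6:45:34 PM ... INFO:") suggests that the client is logging through java.util.logging. This is an assumption and not something the thread confirms. Under that assumption, a minimal sketch for raising the HDFS client log level before calling copyFromLocalFile could look like this. HdfsClientDebugLogging is a hypothetical class name.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: only has an effect if the Hadoop client really logs
// through java.util.logging (an assumption based on the posted log format).
class HdfsClientDebugLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly, so an
    // unreferenced logger could be garbage collected and lose its settings.
    private static final Logger HDFS_LOGGER = Logger.getLogger("org.apache.hadoop.hdfs");

    static Logger enable() {
        // FINE is the java.util.logging level usually used for debug output.
        HDFS_LOGGER.setLevel(Level.FINE);

        // The default console handler only prints INFO and above, so attach
        // one that lets FINE records through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        HDFS_LOGGER.addHandler(handler);

        // Avoid printing each INFO record twice via the root handler.
        HDFS_LOGGER.setUseParentHandlers(false);
        return HDFS_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enable();
        logger.fine("HDFS client debug logging enabled");
    }
}
```

If the client is configured with log4j instead, this sketch has no effect and the level would need to be changed in the log4j configuration.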
