... could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) run

Hi, I set up an HA CDH cluster on-premises with Cloudera Manager. However, when I try to test the cluster by running the MapReduce examples, I get:

 

[root@hadoop-masternode01 ~]# sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
Number of Maps  = 10
Samples per Map = 100
17/02/15 13:13:30 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs/QuasiMonteCarlo_1487160808806_1797399645/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1409)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1812)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1608)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:772)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs/QuasiMonteCarlo_1487160808806_1797399645/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1409)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1812)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1608)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:772)

 

 

In the datanode log files I see this:

 

2017-02-15 12:54:06,674 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2017-02-15 12:54:06,674 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2017-02-15 12:54:07,783 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode01.example.net/10.10.10.206:8022. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:07,783 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode02.example.net/10.10.10.207:8022. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:08,784 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode02.example.net/10.10.10.207:8022. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:08,784 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode01.example.net/10.10.10.206:8022. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:09,786 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode01.example.net/10.10.10.206:8022. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:09,786 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-masternode02.example.net/10.10.10.207:8022. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-02-15 12:54:10,305 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 6 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=6, dataDirs=6)
2017-02-15 12:54:10,305 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 6 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=6, dataDirs=6)
2017-02-15 12:54:10,358 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/01/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,401 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/02/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,436 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/03/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,471 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/04/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,506 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/05/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,540 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/06/dfs/dn/in_use.lock acquired by nodename 26139@hadoop-datanode01.example.net
2017-02-15 12:54:10,608 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-143250023-10.10.10.206-1487064960727
2017-02-15 12:54:10,609 INFO org.apache.hadoop.hdfs.server.common.Storage: Locking is disabled for /data/01/dfs/dn/current/BP-143250023-10.10.10.206-1487064960727
2017-02-15 12:54:10,661 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-143250023-10.10.10.206-1487064960727
2017-02-15 12:54:10,662 INFO org.apache.hadoop.hdfs.server.common.Storage: Locking is disabled for /data/02/dfs/dn/current/BP-143250023-10.10.10.206-1487064960727

However, I can run "curl -v 10.10.10.207:8022" and "curl -v hadoop-masternode02.example.net:8022" from the datanode, so the port appears to be open.
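
The same connectivity can be checked a bit more directly like this (this assumes nc is available on the datanode; hdfs getconf ships with the stock HDFS client, and 8022 is the service RPC port from the log above):

# list the NameNodes the client configuration actually resolves
hdfs getconf -namenodes

# plain TCP check from the datanode to each NameNode's service RPC port
nc -zv hadoop-masternode01.example.net 8022
nc -zv hadoop-masternode02.example.net 8022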

 

 

The hdfs dfsadmin -report output shows that things look fine:

 

[root@hadoop-masternode01 ~]# sudo -u hdfs hdfs dfsadmin -report
Configured Capacity: 34356749524992 (31.25 TB)
Present Capacity: 34356749524992 (31.25 TB)
DFS Remaining: 34356749303808 (31.25 TB)
DFS Used: 221184 (216 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 10.10.10.214:50010 (hadoop-datanode03.example.net)
Hostname: hadoop-datanode03.example.net
Rack: /default
Decommission Status : Normal
Configured Capacity: 11452249841664 (10.42 TB)
DFS Used: 73728 (72 KB)
Non DFS Used: 0 (0 B)
DFS Remaining: 11452249767936 (10.42 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Wed Feb 15 13:14:37 CET 2017


Name: 10.10.10.212:50010 (hadoop-datanode01.example.net)
Hostname: hadoop-datanode01.example.net
Rack: /default
Decommission Status : Normal
Configured Capacity: 11452249841664 (10.42 TB)
DFS Used: 73728 (72 KB)
Non DFS Used: 0 (0 B)
DFS Remaining: 11452249767936 (10.42 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Wed Feb 15 13:14:37 CET 2017


Name: 10.10.10.213:50010 (hadoop-datanode02.example.net)
Hostname: hadoop-datanode02.example.net
Rack: /default
Decommission Status : Normal
Configured Capacity: 11452249841664 (10.42 TB)
DFS Used: 73728 (72 KB)
Non DFS Used: 0 (0 B)
DFS Remaining: 11452249767936 (10.42 TB)
DFS Used%: 0.00%
DFS Remaining%: 100.00%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Wed Feb 15 13:14:34 CET 2017

 

I can browse HDFS and create/delete files. 
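
To be precise about what "create" covers here: making directories or empty files only talks to the NameNode, while writing actual data has to go through the datanode block pipeline, which is the path that fails above. A minimal write test would be something like this (the target path is arbitrary):

sudo -u hdfs hdfs dfs -put /etc/hosts /tmp/writetest
sudo -u hdfs hdfs dfs -cat /tmp/writetest
sudo -u hdfs hdfs dfs -rm /tmp/writetest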

I also see that the full DNS name is specified in core-site.xml.
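
For reference, the address the clients actually resolve can be printed directly; in an HA setup fs.defaultFS normally points at the logical nameservice rather than at a single NameNode host:

hdfs getconf -confKey fs.defaultFS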

There are also FQDN entries in /etc/hosts for all nodes on each host.
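
A quick way to confirm that name resolution is consistent on every node (getent consults /etc/hosts first with the usual nsswitch setup):

hostname -f
getent hosts hadoop-masternode01.example.net
getent hosts hadoop-masternode02.example.net
getent hosts hadoop-datanode01.example.net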

SELinux is in permissive mode and the firewall is disabled.
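
For the record, that can be verified with the following (firewalld assumes a systemd-based OS such as CentOS/RHEL 7; older releases use the iptables service instead):

getenforce
systemctl is-active firewalld
iptables -L -n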

I tried formatting the NameNode, but it didn't help.
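
One thing worth noting about that reformat attempt: formatting a NameNode generates a new clusterID, and datanodes whose storage directories still carry the old ID will not register their block pools with it. Comparing the VERSION files shows whether that happened (the datanode path below is taken from the log above; the NameNode path depends on dfs.namenode.name.dir and is just a placeholder here):

# on a datanode
grep clusterID /data/01/dfs/dn/current/VERSION

# on the active NameNode (adjust to the configured dfs.namenode.name.dir)
grep clusterID /dfs/nn/current/VERSION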

 

I'm out of ideas on what could be wrong.
