ReplicaNotFoundException

Expert Contributor
DatanodeRegistration(172.31.4.192, datanodeUuid=5d7a5533-df53-454e-bfb3-2dfcdbfb7b1b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster2;nsid=725965767;c=0):Got exception while serving BP-1423177047-172.31.4.192-1492091038346:blk_1118810958_45073263 to /172.31.10.74:44406
org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1423177047-172.31.4.192-1492091038346:blk_1118810958_45073263
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:466)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:241)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:537)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
	at java.lang.Thread.run(Thread.java:745)

 

Although my datanodes are working fine, I see this error in Diagnostics. Could this be a problem?

1 ACCEPTED SOLUTION

Champion

@sim6

 

I hope you have more than 3 data nodes.

Generally, two types of "data missing" issues are possible, and each can have many causes:
a. ReplicaNotFoundException
b. BlockMissingException

If your issue were a BlockMissingException, you would be fine as long as you have a backup of the data in your DR environment; otherwise it could be a problem. For ReplicaNotFoundException, make sure all your datanodes are healthy and in the commissioned state. The namenode is supposed to handle this automatically whenever that data is read. If it does not, an HDFS rebalance or an NN restart may fix the issue, but you don't need to try either unless a user reports a problem with that particular data. In your case nobody has reported anything and you found it yourself, so you can ignore it for now.
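
If you want to keep an eye on whether the same block keeps coming up, a small script can pull the block references out of the datanode log. This is only a rough sketch in Python: the function name `replica_not_found_blocks` and the regular expression are my own assumptions based on the single log line posted above, not anything shipped with Hadoop, so adjust them to your actual log format.

```python
import re
from collections import Counter

# Matches block references of the form
# BP-<pool>:blk_<blockId>_<generationStamp>, as in the log line posted above.
# The pattern is an assumption derived from that one sample.
BLOCK_RE = re.compile(r"(BP-[\w.\-]+):blk_(\d+)_(\d+)")


def replica_not_found_blocks(log_lines):
    """Count ReplicaNotFoundException lines per (pool, block ID, generation stamp)."""
    counts = Counter()
    for line in log_lines:
        if "ReplicaNotFoundException" not in line:
            continue
        match = BLOCK_RE.search(line)
        if match:
            pool, block_id, gen_stamp = match.groups()
            counts[(pool, int(block_id), int(gen_stamp))] += 1
    return counts


# Sample input: the exception line from the question.
sample = [
    "org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: "
    "Replica not found for "
    "BP-1423177047-172.31.4.192-1492091038346:blk_1118810958_45073263",
]

for (pool, block_id, gen_stamp), n in replica_not_found_blocks(sample).items():
    print(f"blk_{block_id} (gen {gen_stamp}) in {pool}: {n} occurrence(s)")
```

If one block ID shows up repeatedly, you can search for it in the output of `hdfs fsck / -files -blocks -locations` to see which file it belongs to and where its replicas are reported. Keep in mind that running fsck on the root of a large cluster can take a while.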
