We are testing various failure modes to ensure no essential data is lost. We have a 3-node cluster with a replication factor of 2. To simulate a crash, we power off Node C and leave it off for at least 10 minutes, then restart it. After Node C comes back up, the files that had blocks on it show as under-replicated. I then issue the command
sudo -u hdfs hadoop fs -setrep -w 2 file
to try to fix the under-replicated blocks. However, the command never returns. In the datanode log on Node C, I can see that Hadoop is trying to write a block of this file to Node C to achieve a replication factor of 2, but that block already exists on Node C. I get the following error:
2020-05-04 13:57:58,169 INFO datanode.DataNode (DataXceiver.java:run(305)) - vault-svr3.vicads230.local:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.1.31.232:39928 dst: /10.1.31.233:50010; org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-165901586-10.1.31.231-1588389285182:blk_1073741928_7575 already exists in state FINALIZED and thus cannot be created.
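For reference, this is roughly how I am checking the cluster state after restarting Node C (a sketch of the standard HDFS admin commands; the `sudo -u hdfs` prefix matches my setup and may differ on other installs):

```shell
# Report live/dead datanodes and per-node usage, to confirm Node C rejoined
sudo -u hdfs hdfs dfsadmin -report

# Run a filesystem check; the summary reports the under-replicated block count
sudo -u hdfs hdfs fsck /

# Show, per file, each block and which datanodes currently hold its replicas
sudo -u hdfs hdfs fsck / -files -blocks -locations
```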
My question is: what is the best way to bring a datanode back up after a crash, assuming the crash did not corrupt the data on its disks?