As we all know, HDFS keeps three replicas of each block to prevent data loss; usually two copies are on one rack and one on another rack. However, when I want to decommission a large number of DataNodes, it is possible to remove the nodes holding all three copies at the same time. I have read the help documentation, which describes how to delete hosts: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_delete_hosts.html . However, it does not mention bulk removal of DataNodes. I also modified hdfs-site.xml and then ran hdfs dfsadmin -refreshNodes, but it had no effect:

    <property>
      <name>dfs.hosts.exclude</name>
      <value>dfshosts.exclude</value>
    </property>

So I would like to ask the technical experts: how can I decommission DataNodes in batches from Cloudera Manager while ensuring that no data is lost? Or can I go directly to the cluster and achieve this with a configuration file such as hdfs-site.xml or core-site.xml?
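For context, here is a minimal sketch of the exclude-file approach described above. The hostnames and the /etc/hadoop/conf path are placeholders, and note that on a CM-managed cluster, hand edits to hdfs-site.xml are typically overwritten by Cloudera Manager, so this property would normally be set through CM's configuration (or decommissioning done from the CM UI) instead:

```shell
# Hypothetical exclude file listing the DataNodes to decommission,
# one fully-qualified hostname per line (names are placeholders).
cat > /etc/hadoop/conf/dfshosts.exclude <<'EOF'
datanode01.example.com
datanode02.example.com
EOF

# The NameNode's hdfs-site.xml must point dfs.hosts.exclude at that
# file; an absolute path is safer than a bare filename, since a
# relative value is resolved against the NameNode process's working
# directory (a common reason -refreshNodes appears to do nothing):
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/etc/hadoop/conf/dfshosts.exclude</value>
#   </property>

# Tell the NameNode to re-read its include/exclude lists. The listed
# nodes move to "Decommission In Progress" while their blocks are
# re-replicated to the remaining DataNodes.
hdfs dfsadmin -refreshNodes
```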
When I was in a similar situation, we did the decommission in batches (3 to 4 nodes at a time). CM/Hadoop will internally take care of re-replicating the blocks from the decommissioned nodes, so there is no need to worry about the replication part.
So below are my recommendations:
1. Decommission the nodes in batches
2. CM -> HDFS -> Web UI -> DataNodes -> Decommissioning -> Make sure there are no under-replicated blocks
3. CM -> All Hosts -> Commission State (left side menu) -> Decommissioned -> Make sure your hosts are listed
4. If you have time, wait for a day or two (or more), then delete the hosts from CM
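The under-replication check in step 2 can also be done from the command line before deleting the hosts; a sketch of the two usual commands (run as a user with HDFS access):

```shell
# Cluster-wide summary from the NameNode; the header includes an
# "Under replicated blocks" count that should drop to 0 once
# decommissioning has finished re-replicating everything.
hdfs dfsadmin -report | grep -i "under replicated"

# Block-level filesystem check; the summary at the end also reports
# under-replicated and missing blocks.
hdfs fsck / | grep -i "under-replicated"
```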