How to remove a large number of nodes from the cluster?

New Contributor

As we all know, HDFS stores three copies of each block by default to prevent data loss. Usually two copies sit on one rack and the third on another rack. However, when I remove a large number of DataNodes at once, I may remove the nodes holding all three copies of a block at the same time.
I have read the help documentation, which describes two ways to delete hosts: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_delete_hosts.html . However, it does not mention bulk deletion of DataNodes.
I also modified hdfs-site.xml as shown here and then ran hdfs dfsadmin -refreshNodes, but it had no effect.
  <property>
           <name>dfs.hosts.exclude</name>
           <value>dfshosts.exclude</value>
  </property>
Therefore, I would like to ask the experts: how can I delete nodes in batches on Cloudera while ensuring that no data is lost? Or should I go directly to the cluster and use a configuration file such as hdfs-site.xml or core-site.xml to achieve this?

1 ACCEPTED SOLUTION

New Contributor
From my research, you should remove at most two cluster nodes at a time; otherwise the data is at risk of being lost. If the number of replicas is insufficient, the system will not complete the removal, and you end up having to restore the roles that were assigned to the node.
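
The batching rule above can be sketched in a few lines of Python. This is only an illustration, assuming the default replication factor of 3; the function name `decommission_batches` and the host names are hypothetical, and the batch size of "replication minus one" simply mirrors the "at most two at a time" observation rather than any limit documented by Cloudera.

```python
from typing import List


def decommission_batches(hosts: List[str], replication: int = 3) -> List[List[str]]:
    """Split hosts into batches of at most (replication - 1) nodes.

    Assumption: if fewer nodes than the replication factor leave at once,
    every block keeps at least one replica on a node that stays in service.
    """
    if replication < 2:
        raise ValueError("replication must be at least 2 to remove nodes safely")
    batch_size = replication - 1
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    nodes = ["dn01", "dn02", "dn03", "dn04", "dn05"]
    for batch in decommission_batches(nodes):
        print(batch)
```

Each batch would be decommissioned and confirmed complete before the next one is started.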


2 REPLIES

Super Collaborator
I think what you need first is to decommission the hosts: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_decomm_host.html If the decommission completes successfully (all blocks are available on the remaining DataNodes according to the replication factor), you can then delete the nodes.
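
For a cluster that is not managed through Cloudera Manager, the same decommission-first approach is usually driven by the exclude file named in dfs.hosts.exclude. The sketch below is a rough illustration under a few assumptions: the exclude file path and host names are hypothetical, and the helper `exclude_file_content` is made up for this example. Note that the Hadoop documentation asks for the full pathname of the file in dfs.hosts.exclude, and that on a Cloudera Manager cluster manual edits to hdfs-site.xml are likely to be overwritten, so the linked decommission procedure is the safer route there.

```python
import subprocess
from pathlib import Path
from typing import Iterable


def exclude_file_content(hosts: Iterable[str]) -> str:
    """Return exclude-file text: one host name per line."""
    return "".join(f"{host}\n" for host in hosts)


def refresh_nodes() -> None:
    """Ask the NameNode to re-read its include/exclude files."""
    subprocess.run(["hdfs", "dfsadmin", "-refreshNodes"], check=True)


if __name__ == "__main__":
    # Hypothetical path; it must match the full path set in dfs.hosts.exclude.
    exclude_path = Path("/etc/hadoop/conf/dfs.hosts.exclude")
    # Hypothetical host names for one small batch.
    exclude_path.write_text(exclude_file_content(["dn01", "dn02"]))
    refresh_nodes()
```

After the refresh, the nodes should be left running until the NameNode reports them as decommissioned; only then is it safe to delete them.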
