New Contributor
Posts: 3
Registered: 08-22-2018
Accepted Solution

How to remove a large number of nodes from the cluster?

As we all know, HDFS keeps three replicas of every block by default to prevent data loss. Usually two replicas sit on one rack and the third on a different rack. However, if I remove a large number of DataNodes at once, I might remove all of the nodes that hold the three replicas of some block at the same time.
I have read the help documentation, which describes two ways to delete hosts: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_delete_hosts.html . However, it does not mention bulk deletion of DataNodes.
I also added the following property to hdfs-site.xml and then ran hdfs dfsadmin -refreshNodes, but it had no effect:
<property>
  <name>dfs.hosts.exclude</name>
  <value>dfshosts.exclude</value>
</property>
So my question for the experts is: how can I remove nodes in batches through Cloudera Manager while ensuring that no data is lost? Or should I work on the cluster directly and use a configuration file such as hdfs-site.xml or core-site.xml to achieve this?

Expert Contributor
Posts: 133
Registered: 01-08-2018

Re: How to remove a large number of nodes from the cluster?

I think what you need first is this: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_decomm_host.html . If decommissioning completes successfully (all blocks are available on the remaining DataNodes according to the replication factor), you can then delete the nodes.
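To make the "decommission first, delete afterwards" order concrete, here is a minimal Python sketch that checks whether a set of hosts has finished decommissioning before they are deleted. It assumes you have captured the text output of hdfs dfsadmin -report and that each DataNode entry contains a "Hostname:" line followed by a "Decommission Status :" line. The function names, the hostnames, and the sample report are hypothetical and only for illustration; in a Cloudera Manager deployment the same status is shown in the UI.

```python
def decommission_status(report_text):
    """Map each DataNode hostname to its 'Decommission Status' value.

    Assumes each DataNode entry lists 'Hostname:' before
    'Decommission Status :' (an assumption about the report layout).
    """
    statuses = {}
    hostname = None
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("Decommission Status") and hostname:
            statuses[hostname] = line.split(":", 1)[1].strip()
            hostname = None
    return statuses


def safe_to_delete(report_text, hosts):
    """True only if every listed host reports 'Decommissioned'."""
    statuses = decommission_status(report_text)
    return all(statuses.get(h) == "Decommissioned" for h in hosts)


# Hypothetical sample, shortened to the fields this sketch reads.
SAMPLE_REPORT = """\
Name: 10.0.0.11:50010 (dn1.example.com)
Hostname: dn1.example.com
Decommission Status : Decommissioned

Name: 10.0.0.12:50010 (dn2.example.com)
Hostname: dn2.example.com
Decommission Status : Decommission in progress
"""

if __name__ == "__main__":
    print(decommission_status(SAMPLE_REPORT))
    print(safe_to_delete(SAMPLE_REPORT, ["dn1.example.com"]))
    print(safe_to_delete(SAMPLE_REPORT, ["dn1.example.com", "dn2.example.com"]))
```

A host that is missing from the report is treated as not safe to delete, which errs on the side of caution.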
New Contributor
Posts: 3
Registered: 08-22-2018

Re: How to remove a large number of nodes from the cluster?

From my research, nodes should be removed at most two at a time; otherwise data is at risk of being lost. If too few replicas would remain, the system will not complete the removal, and you end up having to restore the node's assigned roles.
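The reasoning behind "at most two at a time" can be illustrated with a small Python sketch. With a replication factor of 3, removing fewer than 3 nodes at once always leaves at least one replica of every fully replicated block. The block-to-node mapping, the node names, and both function names below are made up for illustration; on a real cluster the locations would come from HDFS itself.

```python
def blocks_at_risk(block_locations, nodes_to_remove):
    """Return blocks whose replicas ALL live on nodes being removed."""
    removing = set(nodes_to_remove)
    return sorted(
        block
        for block, holders in block_locations.items()
        if set(holders) <= removing
    )


def batches(nodes, replication_factor=3):
    """Split nodes into batches smaller than the replication factor."""
    size = max(1, replication_factor - 1)
    nodes = list(nodes)
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


# Hypothetical block placement for illustration only.
BLOCK_LOCATIONS = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn4", "dn5"},
}

if __name__ == "__main__":
    # Removing three nodes at once can take out every replica of a block.
    print(blocks_at_risk(BLOCK_LOCATIONS, ["dn1", "dn2", "dn3"]))
    # Removing two at a time cannot, while replication factor is 3.
    print(blocks_at_risk(BLOCK_LOCATIONS, ["dn1", "dn2"]))
    print(batches(["dn1", "dn2", "dn3", "dn4", "dn5"]))
```

This guarantee only holds per batch: the blocks have to be re-replicated to the remaining DataNodes before the next batch is removed, otherwise successive batches can still take out every replica of a block.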