Is there a way to take a DataNode down intentionally and add it back later without the NameNode re-replicating or rebalancing its data while the node is out?

Expert Contributor

Can maintenance mode be the answer? If yes, what happens when a node is put in maintenance mode? How does replication work for the data held on a node in maintenance mode? What happens when I decommission a DataNode? And what happens when I delete a DataNode?

1 ACCEPTED SOLUTION

THEORETICALLY... you could move the underlying block files from one DataNode to another, but you'd have to make sure the DataNode processes are not running while you do it. When the other DataNode starts up with the files you moved to it, it will send a block report that includes the blocks you copied. If that were done in pretty tight synchronization with taking down the original DataNode, it ~might~, again THEORETICALLY, work, but...

DON'T DO THAT!

Seriously, that is a bunny trail that you would not really want to explore outside of just a learning exercise in a non-production environment to help you understand how the NN and DN processes interoperate.

Maintenance mode is not your answer here, as its whole premise is to keep Ambari monitoring from raising false alarms about services you intentionally want unavailable.

Generally speaking, decommissioning a DataNode is the way to go, as it gives the NameNode time to stop placing new blocks on the DataNode being decommissioned while it re-replicates that node's existing blocks elsewhere, so you are never under-replicated. If you just delete the node, you'll be under-replicated until the NameNode can resolve that for you.
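
For what it's worth, a manual decommission outside Ambari usually comes down to two steps: list the host in the exclude file that the `dfs.hosts.exclude` property points to, then ask the NameNode to re-read it with `hdfs dfsadmin -refreshNodes`. Below is a minimal Python sketch of that. The helper name `request_decommission` is hypothetical, and the hostname and exclude-file path in the usage note are placeholders, not values from this thread. On an Ambari-managed cluster you would normally use the Decommission action in the UI instead.

```python
import subprocess
from pathlib import Path


def request_decommission(hostname, exclude_file, run=False):
    """Add hostname to the HDFS exclude file and optionally refresh the NameNode.

    exclude_file must be the file your dfs.hosts.exclude property points to.
    The path is cluster-specific, so it is passed in rather than assumed.
    """
    path = Path(exclude_file)
    hosts = path.read_text().split() if path.exists() else []
    if hostname not in hosts:
        hosts.append(hostname)
        path.write_text("\n".join(hosts) + "\n")

    # Asks the NameNode to re-read its include/exclude host files.
    cmd = ["hdfs", "dfsadmin", "-refreshNodes"]
    if run:  # off by default so the sketch is safe to try anywhere
        subprocess.run(cmd, check=True)
    return cmd
```

You would call it as something like `request_decommission("dn3.example.com", "/path/to/dfs.exclude", run=True)`, then check `hdfs dfsadmin -report` and wait until the node is reported as decommissioned before stopping it.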

2 REPLIES

Explorer

1) Maintenance mode is turned ON at the service or node level. It is turned ON to perform activities such as the following (not an exhaustive list):

  • OS maintenance
  • configuration changes
  • Decommission a node

Generally speaking, when maintenance mode is switched ON, alerts are suppressed and the node is excluded from bulk operations. However, the node is still listed in the NN's DN list.

2) Decommissioning a DN is highly recommended when maintenance mode is turned ON for it (to avoid data loss). When the DN is put into the decommissioning state, the NN starts copying its blocks to other DNs. The DN is marked as decommissioned only once the NN completes the copy process. This is done to maintain the replication factor.

3) A DN can be deleted after it has been successfully decommissioned. At that point, the DN is completely removed from the cluster and from the NN's list.

4) The 'Rebalancer' is a manually run activity that rebalances data between under-utilized and over-utilized DNs.
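
To make point 4 a bit more concrete: as far as I understand it, the balancer (started with `hdfs balancer -threshold <n>`) treats a DN as balanced when its utilization is within a threshold, given in percent and 10 by default, of the overall cluster utilization. Here is a rough Python sketch of that classification. It is a simplification for illustration, not the balancer's actual code, and the node names and numbers are made up.

```python
def classify_datanodes(usage, threshold=10.0):
    """Split DNs into over-utilized, under-utilized and balanced.

    usage maps a DN name to (used_bytes, capacity_bytes). A DN counts as
    balanced when its utilization (in percent) is within `threshold`
    percentage points of the cluster-wide utilization.
    """
    total_used = sum(used for used, _ in usage.values())
    total_capacity = sum(cap for _, cap in usage.values())
    cluster_pct = 100.0 * total_used / total_capacity

    result = {"over": [], "under": [], "balanced": []}
    for name, (used, cap) in sorted(usage.items()):
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold:
            result["over"].append(name)
        elif node_pct < cluster_pct - threshold:
            result["under"].append(name)
        else:
            result["balanced"].append(name)
    return result


# Hypothetical cluster: three DNs with 100 units of capacity each.
example = {"dn1": (90, 100), "dn2": (50, 100), "dn3": (10, 100)}
print(classify_datanodes(example))
# {'over': ['dn1'], 'under': ['dn3'], 'balanced': ['dn2']}
```

With cluster utilization at 50% and the default threshold, anything above 60% counts as over-utilized and anything below 40% as under-utilized, so the balancer would move blocks from dn1 toward dn3.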