Created 10-10-2016 05:00 PM
The data center team is taking down one of my data nodes for a battery replacement. The maintenance is going to take 90 minutes. Through Ambari, I am going to put the node in maintenance mode and stop all of its services.
Is this sufficient?
Created 10-11-2016 01:17 AM
Assuming HDFS replication factor > 1 (default is 3), put the node under maintenance and stop services running on the node. Once the server comes back up, start services and take the node out of maintenance, in that order. Putting the node under maintenance before stopping the services will eliminate the risk of alerts. Starting the services before taking the node out of maintenance will prevent the alerts as well.
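For anyone who prefers to script this rather than click through the UI, here is a rough sketch of the same four steps expressed as Ambari REST calls. The server URL, cluster name and host name are made-up placeholders, authentication is left out, and the requests are only built, not sent, so treat it as an outline to adapt rather than a tested procedure.

```python
import json
import urllib.request

# Placeholder values for illustration only; none of them come from this thread.
AMBARI = "http://ambari.example.com:8080"
CLUSTER = "mycluster"
HOST = "datanode1.example.com"


def build_request(path, body):
    """Build (but do not send) a PUT request for one host in Ambari."""
    return urllib.request.Request(
        url=f"{AMBARI}/api/v1/clusters/{CLUSTER}/hosts/{HOST}{path}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        # Ambari expects this header on calls that modify state.
        headers={"X-Requested-By": "ambari"},
    )


def set_maintenance(state):
    """state is "ON" or "OFF"."""
    body = {
        "RequestInfo": {"context": f"Maintenance mode {state}"},
        "Body": {"Hosts": {"maintenance_state": state}},
    }
    return build_request("", body)


def set_components_state(state):
    """state is "INSTALLED" (stopped) or "STARTED"."""
    body = {
        "RequestInfo": {"context": f"Set all components to {state}"},
        "Body": {"HostRoles": {"state": state}},
    }
    return build_request("/host_components", body)


def maintenance_plan():
    """Return the four requests in the order described in the answer."""
    return [
        ("before", set_maintenance("ON")),              # 1. silence alerts
        ("before", set_components_state("INSTALLED")),  # 2. stop services
        ("after", set_components_state("STARTED")),     # 3. start services
        ("after", set_maintenance("OFF")),              # 4. re-enable alerts
    ]


if __name__ == "__main__":
    for phase, req in maintenance_plan():
        print(phase, req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To actually run the steps you would add credentials (Ambari uses HTTP basic authentication) and pass each request to `urllib.request.urlopen`, waiting for the stop and start operations to finish before moving on.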
It is unlikely that your data node will fall far behind the rest of the cluster in 90 minutes, but afterwards you may consider running the HDFS balancer with your threshold (default is 10%).
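To make the threshold concrete: the balancer treats a data node as balanced when its disk utilization is within the threshold (in percentage points) of the overall cluster utilization. The small sketch below just does that comparison; the utilization figures are invented for illustration, and the function name is mine, not part of HDFS.

```python
def is_balanced(node_used_pct, cluster_used_pct, threshold_pct=10.0):
    """True if the node's utilization is within threshold_pct
    percentage points of the cluster-wide utilization."""
    return abs(node_used_pct - cluster_used_pct) <= threshold_pct


# Hypothetical numbers: the cluster is 60% full overall.
cluster = 60.0
print(is_balanced(57.0, cluster))  # True: 3 points below average
print(is_balanced(45.0, cluster))  # False: 15 points below average
```

If the node does end up outside the threshold, the balancer itself is started with `hdfs balancer -threshold 10`.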
+++++
If any of the responses helped, please vote and accept best answer.
Created 10-10-2016 08:04 PM
@Kumar Veerappan
If you are taking only one data node down, then you don't need to schedule downtime, because it is not going to harm your data. The blocks on that node fall under the under-replicated category, and the framework will take care of replicating them. When the node comes back, it will start receiving new data again. As an end user, you shouldn't see any issue.
If you take downtime for the whole cluster, then the server will report missing-data errors until the data nodes re-register.
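As a rough guide to when that re-replication starts: the NameNode marks a data node dead once it has missed heartbeats for 2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval. The sketch below works that out with the stock defaults (300000 ms and 3 s); if your cluster overrides either property, substitute your own values.

```python
def dead_node_timeout_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Seconds of missed heartbeats before the NameNode declares a
    data node dead, using the HDFS default property values."""
    return 2 * recheck_interval_ms / 1000 + 10 * heartbeat_interval_s


timeout = dead_node_timeout_seconds()
outage = 90 * 60  # the 90-minute maintenance window, in seconds

print(timeout)            # 630.0 seconds, i.e. 10 minutes 30 seconds
print(outage > timeout)   # True: the node will be declared dead during the outage
```

So with default settings a 90-minute outage is long enough for the NameNode to start re-replicating that node's blocks to other nodes.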
Created on 04-05-2017 03:12 PM - edited 08-19-2019 03:21 AM
I thought the proper way to do the maintenance on the data node is to decommission it, so it can do the following tasks:
In an urgent situation, I could agree with your suggestion.
However, please advise me on the right approach in a scenario where you have the luxury of choosing the maintenance window.
Created 11-02-2016 08:57 AM
please vote and accept best answer.