How to restart Namenode and standby Namenode?

Expert Contributor

I deployed NameNode HA on HDP 2.5.

Is there a method or set of steps to restart the NameNode that avoids safemode entirely, or at least keeps the time spent in safemode as short as possible?

1 ACCEPTED SOLUTION

Expert Contributor

@Huahua Wei

I am going to presume that the intent of your question is to reduce NameNode downtime for clients while you are running HA. I will answer that first and then follow up with a specific answer to the safemode question.

The whole point of HA is to minimize NameNode downtime. Here is what you should do: fail over to the standby NameNode. This should generally be very fast. Clients need at most a couple of retries, after which they can continue working against the cluster without any detectable loss of availability. That is, most of your YARN jobs should be fine. If this is not what you see, some of your ZooKeeper or HDFS configs are not optimal.

If you are specifically asking about safemode: safemode exists to make sure that enough DataNodes, and enough blocks from those DataNodes, have reported in. You can lower the threshold for this; the NameNode exits safemode once the threshold is met. There is a slider for this threshold in the HDFS settings of the Ambari UI.
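To see which values are currently in effect before moving the slider, something like the following sketch may help. It assumes the hdfs CLI is on the PATH and reads the same hdfs-site.xml as the NameNode. The function name show_safemode_settings is only illustrative. The three property names are the standard HDFS safemode keys; the values on an Ambari-managed cluster may differ from the stock Apache defaults.

```shell
# Print the safemode-related settings that the local HDFS client config resolves to.
# Assumption: the hdfs CLI is on PATH and uses the same hdfs-site.xml as the NameNode.
show_safemode_settings() {
  for key in dfs.namenode.safemode.threshold-pct \
             dfs.namenode.safemode.min.datanodes \
             dfs.namenode.safemode.extension; do
    # "hdfs getconf -confKey" prints the effective value of a single property
    printf '%s=%s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

Calling show_safemode_settings prints one key=value line per property. threshold-pct is the fraction of blocks that must be reported, min.datanodes is the number of live DataNodes required, and extension is the extra time (in milliseconds) the NameNode stays in safemode after the threshold is reached.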

If you really want the NameNode to skip waiting for DataNodes and block reports, you are free to do so: there is a command-line option that forces it out of safemode (hdfs dfsadmin -safemode leave). Neither of these is really a good thing for your cluster. Spending a bit of time in safemode while the cluster is starting up is actually a good idea, because it tells you that your data and nodes are in good shape. But yes, you can reduce the safemode waiting time via configuration or via the command line.
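As a rough sketch of the command-line route, the helper below checks the safemode state first and only forces an exit when the NameNode is actually in safemode. The function names are made up for illustration. It assumes the shell runs as the HDFS superuser (for example the hdfs user) and that the status line contains the text "Safe mode is ON", which is what the releases I have seen print; verify this on your version.

```shell
# Returns success (0) if the NameNode reports that safemode is ON.
# Assumption: "hdfs dfsadmin -safemode get" prints a line containing "Safe mode is ON".
in_safemode() {
  hdfs dfsadmin -safemode get | grep -q 'Safe mode is ON'
}

# Force the NameNode out of safemode, but only if it is currently in it.
leave_safemode_if_needed() {
  if in_safemode; then
    echo "NameNode is in safemode; forcing exit"
    hdfs dfsadmin -safemode leave
  else
    echo "NameNode is not in safemode"
  fi
}
```

If the goal is only to hold off a job until the NameNode is ready, hdfs dfsadmin -safemode wait blocks until safemode ends on its own and is the gentler choice.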


4 REPLIES 4

Expert Contributor

@Huahua Wei I don't think so. The transition to the failover node does not take much time. Whichever NameNode is started first will become active. You may choose to start the cluster in a specific order so that your preferred node starts first.

What problem exactly are you facing? Is it the failover, or is starting up the NameNode taking a lot of time?

Master Guru

@Huahua Wei

@aengineer has given a very good explanation of why the NameNode needs safemode and why you should not force it out of safemode unless it is necessary.

Suppose you have NameNodes A and B, A is currently active, and you need to restart A for some reason. You can always fail over to B before the restart; once A has become standby, you can restart it without any downtime.

The command to fail over is:

sudo -u hdfs hdfs haadmin -failover nn1 nn2

Note: this fails over from nn1 to nn2 (nn2 becomes active).
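A slightly more defensive version of the same step, as a sketch: check which NameNode is active before failing over, then print both states afterwards. The function name failover_if_active is made up for illustration, and nn1 and nn2 stand for whatever NameNode IDs your nameservice defines. It assumes the shell runs as the hdfs user.

```shell
# Fail over from NameNode $1 to NameNode $2, but only if $1 is currently active.
# Assumption: $1 and $2 are the NameNode IDs configured for your nameservice.
failover_if_active() {
  from="$1"
  to="$2"
  # -getServiceState prints "active" or "standby" for the given NameNode ID
  state="$(hdfs haadmin -getServiceState "$from")"
  if [ "$state" = "active" ]; then
    hdfs haadmin -failover "$from" "$to"
  else
    echo "$from is $state; no failover needed"
  fi
  # Report the state of both NameNodes after the attempt
  echo "$from: $(hdfs haadmin -getServiceState "$from")"
  echo "$to: $(hdfs haadmin -getServiceState "$to")"
}
```

For example, failover_if_active nn1 nn2 leaves nn1 as standby so that it can be restarted, and running it again with the arguments swapped moves the active role back afterwards if you prefer the original layout.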

Hope this information helps.

Expert Contributor

@aengineer @Kuldeep Kulkarni Thanks for your response. One more question: will the standby NameNode go into safemode whenever it is restarted?