While the upgrade I recently did on an 8-node cluster seems to have gone OK (Ambari 184.108.40.206, HDP 3.0.1) I have found that the auto-start of NAMENODE does not seem to be working properly anymore. A reboot of the namenode host leaves HDFS in a permanent "safemode on" state, and the startup does not work anymore like it used to.
The only workarounds I know of are:
(A) stop HDFS/namenode before rebooting, or
(B) after an unexpected reboot, turn safemode off manually; otherwise the namenode will not start.
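For anyone who wants to script workaround (B), here is a minimal sketch. It assumes the `hdfs` CLI is on the PATH and that the script runs as a user allowed to issue dfsadmin commands (usually the HDFS superuser). The function names are made up for illustration; the only real pieces are the `hdfs dfsadmin -safemode get` and `hdfs dfsadmin -safemode leave` commands.

```python
import subprocess


def safemode_is_on(dfsadmin_output: str) -> bool:
    """Return True if `hdfs dfsadmin -safemode get` output reports safemode ON."""
    # The command prints a line such as "Safe mode is ON" or "Safe mode is OFF".
    return "safe mode is on" in dfsadmin_output.lower()


def get_safemode_output() -> str:
    """Run `hdfs dfsadmin -safemode get` and return its stdout."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def leave_safemode_if_on() -> bool:
    """Force the namenode out of safemode if it is on. Returns True if it acted."""
    if not safemode_is_on(get_safemode_output()):
        return False
    subprocess.run(["hdfs", "dfsadmin", "-safemode", "leave"], check=True)
    return True
```

Calling `leave_safemode_if_on()` after an unexpected reboot does the same thing as typing the two commands by hand. Forcing safemode off skips the namenode's own block check, so it is probably worth confirming there are no missing blocks first.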
**EDIT** I am also apparently having issues with AUTOSTART in general, even after a *clean* "shut down all services" first before rebooting a cluster. Things just don't come up properly.
I have given up on this for now and have completely disabled the autostart mechanism. I don't believe it's ready for prime time with the latest Ambari and HDP.
Would love to see if anyone else is having issues too with this...?
The issue isn't that the namenode process doesn't exist or isn't trying to start. It IS. The problem is that it does not FINISH starting because safemode keeps it locked. So the deeper issue here is that HDFS does not properly clear safemode when the namenode process auto-starts.
Ok. Do you have any missing blocks? The namenode stays in safemode while blocks are missing, and waits until the fraction of reported blocks reaches the threshold value. You can try decreasing the threshold (just for testing, not recommended) and see if the namenode comes up.
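To make the threshold idea concrete, here is a rough sketch of the arithmetic. It assumes the threshold in question is the `dfs.namenode.safemode.threshold-pct` property in hdfs-site.xml (default 0.999), and it is a simplified model rather than the namenode's actual code, which also considers things like the minimum number of live datanodes and an extension period. The function names are hypothetical.

```python
def blocks_needed(total_blocks: int, threshold_pct: float = 0.999) -> int:
    """Number of blocks that must be reported before safemode can end."""
    # Simplified model: truncate total * threshold to a whole block count.
    return int(total_blocks * threshold_pct)


def threshold_reached(
    reported_blocks: int, total_blocks: int, threshold_pct: float = 0.999
) -> bool:
    """True once enough blocks have been reported by the datanodes."""
    return reported_blocks >= blocks_needed(total_blocks, threshold_pct)
```

With 1000 blocks and only 900 reported, `threshold_reached(900, 1000)` is false at the default setting, so the namenode would keep waiting. Lowering the threshold makes the same block count pass, which is why it works as a test but hides missing blocks.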
So as a limited test, I tried your plan with JUST the namenode, and it worked this time. It auto-started just fine.
But in a larger sense, when I use auto-start for entire nodes or the whole cluster, nope: the startup of all services never finishes. Some of the nodes are fine, but I know that if I auto-start everything on the namenode host, it gets locked up.
I assume that over time other people will start running into this and then we'll have more info on what's failing.