Created 12-10-2015 04:09 AM
Created 12-10-2015 06:17 PM
If you are running an HA NameNode using Quorum Journal Manager, then running the SecondaryNameNode is not required. Actually, it would be incorrect to deploy a SecondaryNameNode alongside an HA NameNode pair.
Before implementation of HA with Quorum Journal Manager, the function of the SecondaryNameNode was to create a periodic checkpoint (a new fsimage file) of the NameNode metadata and upload it back to the NameNode. Without checkpointing, the NameNode's edit log would grow continuously. A very large edit log is problematic, because it slows down NameNode restarts. Replaying a large edit log is much more time-consuming than loading a recent metadata checkpoint and applying a small edit log on top of it.
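To make the checkpoint trigger concrete, here is a minimal sketch of the decision, assuming the stock defaults for `dfs.namenode.checkpoint.period` (3600 seconds) and `dfs.namenode.checkpoint.txns` (1,000,000 transactions). The class and method names are hypothetical and only illustrate the idea; this is not the actual Hadoop code.

```python
from dataclasses import dataclass

# Assumed defaults of dfs.namenode.checkpoint.period and
# dfs.namenode.checkpoint.txns; check hdfs-site.xml on your cluster.
DEFAULT_PERIOD_SECONDS = 3600
DEFAULT_TXN_THRESHOLD = 1_000_000


@dataclass
class CheckpointPolicy:
    """Hypothetical helper: decides when a new fsimage should be written."""

    period_seconds: int = DEFAULT_PERIOD_SECONDS
    txn_threshold: int = DEFAULT_TXN_THRESHOLD

    def should_checkpoint(self, seconds_since_last: float,
                          uncheckpointed_txns: int) -> bool:
        # A checkpoint is due when either limit is reached: enough time has
        # passed, or the edit log has accumulated enough transactions.
        return (seconds_since_last >= self.period_seconds
                or uncheckpointed_txns >= self.txn_threshold)


if __name__ == "__main__":
    policy = CheckpointPolicy()
    print(policy.should_checkpoint(600, 50_000))
    print(policy.should_checkpoint(600, 2_000_000))
```

Lowering either value makes checkpoints more frequent, which keeps the edit log that must be replayed on restart shorter.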
With an HA deployment, the standby NameNode in the pair takes over the responsibility of periodic checkpointing previously performed by the SecondaryNameNode. Therefore, it is unnecessary (and invalid) to run a SecondaryNameNode. If you choose not to deploy with HA for some reason, then the SecondaryNameNode is recommended so that you get periodic checkpoints.
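If you are not sure which case applies to your cluster, a rough check is to look at `hdfs-site.xml`: an HA setup declares a nameservice in `dfs.nameservices` and lists at least two NameNode IDs under `dfs.ha.namenodes.<nameservice>`. The sketch below assumes that convention. The helper names and the sample values (`mycluster`, `nn1`, `nn2`) are illustrative only, not taken from any real cluster.

```python
import xml.etree.ElementTree as ET

# Illustrative hdfs-site.xml fragment; the nameservice and NameNode IDs
# are placeholder values.
SAMPLE_HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
</configuration>
"""


def load_properties(hdfs_site_xml: str) -> dict:
    """Parse <property><name>/<value> pairs into a plain dict."""
    root = ET.fromstring(hdfs_site_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def ha_enabled(props: dict) -> bool:
    """True if any nameservice lists two or more NameNode IDs."""
    nameservices = [ns.strip()
                    for ns in props.get("dfs.nameservices", "").split(",")
                    if ns.strip()]
    for ns in nameservices:
        ids = [i.strip()
               for i in props.get(f"dfs.ha.namenodes.{ns}", "").split(",")
               if i.strip()]
        if len(ids) >= 2:
            return True
    return False


def secondary_namenode_advice(props: dict) -> str:
    """Summarize the recommendation from this answer for a given config."""
    if ha_enabled(props):
        return "HA pair detected: do not deploy a SecondaryNameNode."
    return "No HA pair: run a SecondaryNameNode for periodic checkpoints."


if __name__ == "__main__":
    print(secondary_namenode_advice(load_properties(SAMPLE_HDFS_SITE)))
```

The check looks at `dfs.ha.namenodes.<nameservice>` rather than `dfs.nameservices` alone, because a federated cluster without HA also defines nameservices.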
There is more discussion of this in the Apache documentation on NameNode HA using Quorum Journal Manager, particularly the section on hardware selection.
Created 12-10-2015 01:20 PM
An HA NameNode is not mandatory, but it is highly encouraged. You cannot perform rolling upgrades without an HA NameNode configured.
Created 12-10-2015 05:47 PM
A Secondary NameNode is mandatory in every Hadoop installation. An HA NameNode is highly recommended, but it is something different from the Secondary NameNode. Since the Hadoop v2 release you can easily deploy an HA configuration of the NameNode with an Active and a Standby NameNode (requires ZooKeeper). If you have a really small lab cluster, you can skip the HA NameNode configuration.
Created 12-11-2015 01:59 AM
For really small test environments, you can disable the SecondaryNameNode; our sandbox does not run one.
Created 12-12-2015 12:51 PM
Thank you all for your answers.