Secondary NameNode, CheckpointNode or BackupNode?

Hi,

I am just getting into Hadoop and HDFS, and I am confused about how HDFS protects against data loss if the NameNode fails. In the documentation I found three mechanisms: Secondary NameNode, CheckpointNode and BackupNode. I understand the differences between them, but I am not sure whether CheckpointNode and BackupNode are deprecated, since I can't find them in my Hortonworks distribution.

I also understand that none of them is necessary if you deploy Hadoop HA. Is there any guideline on which of these nodes should be used in production?

Thank you for your answers.

1 ACCEPTED SOLUTION

Re: Secondary NameNode, CheckpointNode or BackupNode?

I have never seen the Checkpoint or Backup node used in practice, and they should be considered deprecated. I recommend using NameNode HA, which eliminates the single point of failure. More details are in the post below:

https://community.hortonworks.com/questions/85184/checkpoint-node-vs-secondary-namenode.html

Re: Secondary NameNode, CheckpointNode or BackupNode?

In a typical HA cluster, two separate machines are configured as NameNodes. At any point in time, exactly one of the NameNodes is in an Active state, and the other is in a Standby state. The Active NameNode is responsible for all client operations in the cluster, while the Standby is simply acting as a slave, maintaining enough state to provide a fast failover if necessary.

In order for the Standby node to keep its state synchronized with the Active node, both nodes communicate with a group of separate daemons called “JournalNodes” (JNs). When any namespace modification is performed by the Active node, it durably logs a record of the modification to a majority of these JNs. The Standby node is capable of reading the edits from the JNs, and is constantly watching them for changes to the edit log. As the Standby node sees the edits, it applies them to its own namespace. In the event of a failover, the Standby will ensure that it has read all of the edits from the JournalNodes before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.
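
To make the "majority of JNs" idea concrete, here is a minimal toy model in Python. It is only a sketch, not how Hadoop implements it: the names JournalNode, log_edit, catch_up and apply_op are made up for illustration, the namespace is just a set of paths, and recovery of edits that reached only a minority of JNs is left out.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical stand-in for a JournalNode; `up` simulates availability."""
    up: bool = True
    edits: list = field(default_factory=list)

    def append(self, txid, op):
        # A JournalNode that is down cannot acknowledge the write.
        if not self.up:
            return False
        self.edits.append((txid, op))
        return True


def log_edit(journal_nodes, txid, op):
    """Active side: the edit counts as durable only with a strict majority of acks."""
    acks = sum(1 for jn in journal_nodes if jn.append(txid, op))
    return acks > len(journal_nodes) // 2


def apply_op(namespace, op):
    """Apply one toy edit ("mkdir <path>" or "delete <path>") to the namespace."""
    action, path = op.split(" ", 1)
    if action == "mkdir":
        namespace.add(path)
    elif action == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown edit: {op}")


def catch_up(journal_nodes, namespace, last_txid):
    """Standby side: replay unseen edits in transaction-id order."""
    seen = {}
    for jn in journal_nodes:
        if jn.up:
            for txid, op in jn.edits:
                seen[txid] = op
    for txid in sorted(seen):
        if txid > last_txid:
            apply_op(namespace, seen[txid])
            last_txid = txid
    return last_txid


# Three JournalNodes, one of them down: two acks out of three is still a majority.
jns = [JournalNode(), JournalNode(), JournalNode(up=False)]
committed = log_edit(jns, 1, "mkdir /user")
standby_namespace = set()
last_applied = catch_up(jns, standby_namespace, 0)
```

With three JNs the Active can keep logging while one is down, and the Standby ends up with the same namespace once it has replayed every transaction id it had not yet applied.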

In order to provide a fast failover, it is also necessary that the Standby node have up-to-date information regarding the location of blocks in the cluster. In order to achieve this, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.

It is vital for the correct operation of an HA cluster that only one of the NameNodes be Active at a time. Otherwise, the namespace state would quickly diverge between the two, risking data loss or other incorrect results. In order to ensure this property and prevent the so-called “split-brain scenario,” the JournalNodes will only ever allow a single NameNode to be a writer at a time. During a failover, the NameNode which is to become active will simply take over the role of writing to the JournalNodes, which will effectively prevent the other NameNode from continuing in the Active state, allowing the new Active to safely proceed with failover.
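
The single-writer guarantee can be pictured with an increasing writer number (often called an epoch). The sketch below is a simplified illustration under that assumption, not Hadoop's actual code; FencedJournal, StaleWriterError, new_epoch and write are hypothetical names.

```python
class StaleWriterError(Exception):
    """Raised when a NameNode with an outdated epoch tries to write."""


class FencedJournal:
    """Hypothetical journal that accepts writes only from the newest writer."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch):
        # A NameNode taking over must present a strictly higher epoch.
        if epoch <= self.promised_epoch:
            raise StaleWriterError(f"epoch {epoch} is not newer than {self.promised_epoch}")
        self.promised_epoch = epoch

    def write(self, epoch, op):
        # Writes from any older epoch are rejected, which fences the old Active.
        if epoch < self.promised_epoch:
            raise StaleWriterError(f"writer epoch {epoch} was superseded by {self.promised_epoch}")
        self.edits.append(op)


journal = FencedJournal()
journal.new_epoch(1)              # first NameNode becomes the writer
journal.write(1, "mkdir /a")
journal.new_epoch(2)              # failover: second NameNode takes over
try:
    journal.write(1, "mkdir /b")  # the old Active is now rejected
    old_writer_rejected = False
except StaleWriterError:
    old_writer_rejected = True
journal.write(2, "mkdir /c")
```

Once the new writer has claimed a higher epoch, the old Active can no longer append edits, so the two namespaces cannot diverge through the journal.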

Reference:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.ht...

Re: Secondary NameNode, CheckpointNode or BackupNode?

The Secondary NameNode is a dedicated node in an HDFS cluster whose main function is to take checkpoints of the file system metadata held on the NameNode. It is not a backup NameNode: it only checkpoints the NameNode's file system namespace. It is a helper to the primary NameNode, not a replacement for it, so, without HA, the NameNode remains the single point of failure in HDFS.

Ref: http://hadooptutorial.info/tag/secondary-namenode-functions/
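
As a rough illustration of what "taking a checkpoint" means, the Python sketch below merges a namespace snapshot (the fsimage) with the edits logged since that snapshot. It is a toy model under simplifying assumptions: the namespace is a set of paths, edits are (action, path) pairs, and checkpoint and apply_edit are hypothetical names, not Hadoop APIs.

```python
def apply_edit(namespace, edit):
    """Apply one toy edit, ("create", path) or ("delete", path), in place."""
    action, path = edit
    if action == "create":
        namespace.add(path)
    elif action == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown edit action: {action}")


def checkpoint(fsimage, edit_log):
    """Replay the edit log on top of the old image and return the merged image."""
    namespace = set(fsimage)
    for edit in edit_log:
        apply_edit(namespace, edit)
    return frozenset(namespace)


old_fsimage = frozenset({"/data"})
edits = [("create", "/data/a.txt"), ("create", "/tmp"), ("delete", "/tmp")]
new_fsimage = checkpoint(old_fsimage, edits)
```

After the merge, the NameNode can start from the new image and a short edit log instead of replaying every edit since the cluster was created. The merged image lives on a different machine, but that machine cannot serve clients, which is why this is a helper and not a standby.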
