HDFS has a master/slave architecture. A cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients, and a number of DataNodes, which are the slave processes of HDFS and store the actual data blocks.
That single NameNode was the SPOF (Single Point of Failure) in Hadoop 1.0. HDFS HA, the recommended implementation for production environments and the standard production setup since Hadoop 2.0, addresses this problem by providing the option of running two NameNodes in the same cluster in an active/passive configuration. These are referred to as the active NameNode and the standby NameNode. Unlike the Secondary NameNode, the standby NameNode is a hot standby, allowing a fast automatic failover to a new NameNode if a host crashes, or a graceful administrator-initiated failover for planned maintenance.
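To make the active/standby idea concrete, here is a minimal sketch in Python. It is only a toy model: the `NameNode` class and the `failover` function are hypothetical names invented for illustration and are not part of Hadoop, and real HDFS adds fencing and shared edit logs that are left out here.

```python
from dataclasses import dataclass


@dataclass
class NameNode:
    # Hypothetical stand-in for a NameNode host; not a Hadoop class.
    name: str
    state: str  # "active" or "standby"
    alive: bool = True


def failover(current_active: NameNode, standby: NameNode) -> NameNode:
    """Promote the standby and return the new active node."""
    if not standby.alive:
        raise RuntimeError("no healthy standby to fail over to")
    # Demote the old active first so two nodes are never active at once.
    current_active.state = "standby"
    standby.state = "active"
    return standby


nn1 = NameNode("nn1", "active")
nn2 = NameNode("nn2", "standby")

# Simulate a host crash on nn1, then promote nn2.
nn1.alive = False
active = failover(nn1, nn2)
print(active.name, active.state)
```

On a real cluster you would not write this yourself. Automatic failover is handled by the HA components, and a graceful manual failover is normally triggered by the administrator with the `hdfs haadmin -failover` command.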
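More recently, Hadoop 3.0 added support for more than two NameNodes, i.e. one active NameNode plus several standbys. As with Active Directory Federation Services, failover is transparent to the end user when the active AD/NameNode component becomes unavailable: clients do not address an individual NameNode host but a logical nameservice, which behaves somewhat like a load balancer. All requests are sent to the nameservice and resolved to the active NameNode. Note that NameNode federation is a separate feature, already available in Hadoop 2.x, in which several independent NameNodes each manage their own part of the namespace.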
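The resolution step described above, where clients send requests to one logical name that is resolved to the active NameNode, can be sketched roughly as follows. The property names follow the usual HA settings in hdfs-site.xml, but the nameservice `mycluster`, the host names, the `ha_state` dictionary and both helper functions are assumptions made up for this example. The real HDFS client does this internally through its failover proxy provider.

```python
# Property names mirror the hdfs-site.xml HA settings; the values
# ("mycluster", hosts, ports) are invented for this example.
conf = {
    "dfs.nameservices": "mycluster",
    "dfs.ha.namenodes.mycluster": "nn1,nn2",
    "dfs.namenode.rpc-address.mycluster.nn1": "host1.example.com:8020",
    "dfs.namenode.rpc-address.mycluster.nn2": "host2.example.com:8020",
}

# Hypothetical view of which NameNode is currently active.
ha_state = {"nn1": "standby", "nn2": "active"}


def candidate_addresses(conf, nameservice):
    """Return the RPC address of every NameNode behind a nameservice."""
    ids = conf[f"dfs.ha.namenodes.{nameservice}"].split(",")
    return {nn: conf[f"dfs.namenode.rpc-address.{nameservice}.{nn}"] for nn in ids}


def resolve_active(conf, nameservice, ha_state):
    """Check each NameNode in turn and return the address of the active one."""
    for nn, address in candidate_addresses(conf, nameservice).items():
        if ha_state.get(nn) == "active":
            return address
    raise RuntimeError(f"no active NameNode found for {nameservice}")


print(resolve_active(conf, "mycluster", ha_state))
```

The point is that the client only ever refers to `mycluster` (for example in a path such as `hdfs://mycluster/...`), so a failover from one NameNode to the other does not require any change on the client side.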
See attached screenshot for illustration
Can you update this thread? It's always nice to know whether my response helped and, if it didn't, what the new status is.
If it helped, please take the time to "Accept" the response to close the thread and enable other members to reference it.