
[HDFS] Can JournalNodes handle more than 50% DataNode failure?

Rising Star

Hi Folks,

I have 2 NameNodes (HA), 16 DataNodes, and 5 JournalNodes. When I shut down 8 DataNodes, both NameNodes went down and would not run.

Please give me a solution for my case: how many JournalNodes are needed to tolerate at least 8 DataNodes being down, i.e. a failure rate of 50% or more?

Please advise.

4 REPLIES

Master Mentor

@rizalt 
Can you share the layout of your 18 hosts so we can better understand where the issue is coming from?

The issue you are experiencing, where shutting down 8 DataNodes causes both NameNodes in your high availability (HA) configuration to go down, likely points to Quorum loss in the JournalNodes or insufficient replicas for critical metadata blocks.

The NameNodes in HA mode rely on JournalNodes for shared edits. For the HA setup to function correctly, the JournalNodes need a quorum (more than half) to be available.
With 5 JournalNodes, at least 3 must be operational. If shutting down 8 DataNodes impacted the connectivity or availability of more than 2 JournalNodes, the quorum would be lost, causing both NameNodes to stop functioning.
If shutting down 8 DataNodes reduces the number of replicas below the replication factor (typically 3), the metadata might not be available, causing the NameNodes to fail.
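For reference, the shared-edits setting in hdfs-site.xml looks roughly like the sketch below; the JournalNode hostnames (jn1..jn5) and the nameservice ID (mycluster) are placeholders for your own values. With 5 JournalNodes listed, any 3 of them must acknowledge every edit, which is why losing 3 or more JNs stops both NameNodes.

    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <!-- all 5 JournalNodes; a majority (3 of 5) must acknowledge every edit write -->
      <value>qjournal://jn1:8485;jn2:8485;jn3:8485;jn4:8485;jn5:8485/mycluster</value>
    </property>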

Please revert

Rising Star

Thanks for the reply @Shelton 
This is my layout: 2 data centers, each with 1 NameNode and 8 DataNodes. In my case, when 1 DC dies, HA should continue to run.

[Attached image: rizalt_0-1734418029269.png, diagram of the two-data-center layout]

Rising Star

Please help me @Shelton, what is the maximum DataNode failure percentage? I tried installing 11 JNs and 11 ZKs, but it didn't work: out of 16 nodes, only 7 DataNodes can be failed or dead. I need HA to keep running with 8 DataNodes dead.

Expert Contributor

@rizalt FYI

➤ It appears your cluster is experiencing a Quorum Failure, where the critical metadata services (NameNode, JournalNodes, and ZooKeeper) lose the ability to maintain a majority when one Data Center (DC) goes offline.

➤ Analyzing Your Failure
In a High Availability (HA) setup, the NameNodes rely on a "Quorum" of JournalNodes (JN) and ZooKeeper (ZK) nodes to stay alive. If you have 5 JNs, you must have at least 3 running for either NameNode to function.

Based on your diagram:

The Problem: If you split your 5 JournalNodes evenly (e.g., 2 in DC1 and 3 in DC2), and the DC with 3 JNs goes down, the remaining 2 JNs cannot form a quorum. This causes both NameNodes to shut down immediately to prevent data corruption ("Split Brain").

DataNodes vs. NameNodes: In HDFS, the number of DataNodes (DN) that fail has no direct impact on whether the NameNode stays "Up" or "Down." You could lose 15 out of 16 DataNodes, and the NameNode should still stay active. The fact that your NameNodes are crashing when 8 DNs (one full server/DC) go down proves that your Quorum nodes (JN/ZK) are failing, not the DataNodes.
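A quick way to confirm this the next time you run the test is to check the NameNode HA state, the DataNode counts, and the NameNode log separately. The service IDs nn1/nn2 and the log path below are assumptions; adjust them to your cluster.

    # HA state of each NameNode (nn1/nn2 are placeholder service IDs)
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Live vs. dead DataNodes - losing DataNodes alone should not take a NameNode down
    hdfs dfsadmin -report | grep -E 'Live datanodes|Dead datanodes'

    # Look for quorum/journal errors in the NameNode log (log path and name vary by distribution)
    grep -i quorum /var/log/hadoop-hdfs/*NAMENODE*.log* | tail -n 20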

➤ The Solution: Quorum Placement
To survive a full Data Center failure (50% of your physical infrastructure), you cannot rely on an even split of nodes. You need a third location (a "Witness" site) or an asymmetric distribution.

1. The 3-Site Strategy (Recommended)
To handle a 1-DC failure with 5 JournalNodes and 5 ZooKeeper nodes, place them as follows:

- DC 1: 2 JN, 2 ZK
- DC 2: 2 JN, 2 ZK
- Site 3 (Witness): 1 JN, 1 ZK (this can be a very small virtual machine or cloud instance).

Why this works: If DC1 or DC2 fails, the remaining site + the Witness site equals 3 nodes, which satisfies the quorum ($3 > 5/2$).
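As an illustration (hostnames are placeholders), the ZooKeeper ensemble in zoo.cfg would look like the sketch below; the 5 JournalNodes follow the same 2 + 2 + 1 placement.

    # zoo.cfg - 5-node ensemble split across three sites
    # DC 1
    server.1=zk-dc1-a:2888:3888
    server.2=zk-dc1-b:2888:3888
    # DC 2
    server.3=zk-dc2-a:2888:3888
    server.4=zk-dc2-b:2888:3888
    # Witness site (a small VM or cloud instance is enough)
    server.5=zk-witness:2888:3888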

2. Maximum DataNode Failure

Theoretically: You can lose all but 1 DataNode ($N-1$), and the NameNode will stay "Active."

Practically: If you have a replication factor of 3, and you lose 50% of your nodes, many blocks will become "Under-replicated," and some may become "Missing" if all three copies were in the DC that died.

Solution: Ensure your Rack Awareness is configured so HDFS knows which nodes belong to which DC. This forces HDFS to keep at least one copy of data in each DC.
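A minimal sketch of script-based rack awareness, assuming hypothetical hostnames dn01..dn08 in DC 1 and dn09..dn16 in DC 2 (in Cloudera Manager you would normally assign racks per host in the UI instead, but the effect is the same). In core-site.xml, point net.topology.script.file.name at a script that prints a rack path for each host:

    <property>
      <name>net.topology.script.file.name</name>
      <value>/etc/hadoop/conf/topology.sh</value>
    </property>

    #!/bin/bash
    # topology.sh - prints one rack path per host/IP argument passed in by the NameNode.
    # Real scripts usually match IP addresses or read a lookup table; hostnames here are placeholders.
    for host in "$@"; do
      case "$host" in
        dn0[1-8]*)       echo "/dc1/rack1" ;;   # DataNodes 01-08 -> DC 1
        dn09*|dn1[0-6]*) echo "/dc2/rack1" ;;   # DataNodes 09-16 -> DC 2
        *)               echo "/default/rack" ;;
      esac
    done

With the default block placement policy and replication factor 3, HDFS then spreads each block's replicas across at least two racks, so no block has all of its copies in a single DC.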

➤ Why 11 JN and 11 ZK didn't work
Increasing the number of nodes to 11 actually makes the cluster more fragile if they are only placed in two locations.

With 11 nodes, you need 6 to be alive to form a quorum.

If you have 5 in DC1 and 6 in DC2, and DC2 fails, the 5 remaining nodes in DC1 cannot reach the 6-node requirement.
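As a rule of thumb, the quorum size is $\lfloor N/2 \rfloor + 1$: for 5 nodes that is 3, for 11 nodes it is 6. With only two sites, the side holding 6 of the 11 nodes can never be lost without losing quorum, so adding nodes without adding a third site makes the cluster more fragile, not less.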

➤ Checklist for Survival
Reduce to 5 JNs and 5 ZKs: Too many nodes increase network latency and management overhead.


Add a 3rd Location: even a single low-power node in a different building or cloud region can act as the "tie-breaker."

Check dfs.namenode.shared.edits.dir: ensure the NameNodes are configured to point to all JournalNodes by URI.

ZooKeeper FC: ensure the DFS ZooKeeper Failover Controller (ZKFC) is running on both NameNode hosts.
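A quick way to verify those last two items on each NameNode host (assuming the HDFS client configuration is deployed locally):

    # Confirm the NameNode config lists all JournalNodes and ZooKeeper servers
    hdfs getconf -confKey dfs.namenode.shared.edits.dir
    hdfs getconf -confKey ha.zookeeper.quorum

    # The ZKFC process should appear on both NameNode hosts
    jps | grep DFSZKFailoverController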