All three namenode directories are pointing to same metadata.

All three NameNode directories are pointing to the same metadata.


/hadoop/hdfs/namenode

/var/hadoop/hdfs/namenode

/var/log/hadoop/hdfs/namenode

What is the use of having three directories that all point to the same data?

1 ACCEPTED SOLUTION

Re: All three namenode directories are pointing to same metadata.

I'm doubtful that these three particular directories are the very best answer to this problem, but the old "three directories for the NN metadata" practice came about long before a solid HA solution was available and, as https://twitter.com/LesterMartinATL/status/527340416002453504 points out, it was (and actually still is) all about disaster recovery. The old adage was to configure the NN to write to three different disks (via the directories): two local and one off the box, such as a remote mount point. Why?

Well... as you know, that darn metadata is the key to the whole file system, and if it ever gets lost then ALL of your data is unrecoverable!!

I personally think this is still valuable even with HA: the JournalNodes are focused on the edits files and do a great job of keeping that information on multiple machines, but the checkpoint image files only exist on the two NN nodes in an HA configuration and, well... I just like to sleep better at night.
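If you want to check whether the three directories in the question actually give you the "different disks" protection, here is a rough Python sketch. It assumes the directories come from the comma-separated `dfs.namenode.name.dir` property, and it uses the filesystem device ID (`st_dev`) as a stand-in for "disk", which is only an approximation: two partitions on the same physical disk will still look like different devices. The function names (`split_name_dirs`, `group_by_device`, `shared_devices`) are made up for this example and are not part of Hadoop.

```python
import os
from collections import defaultdict
from typing import Callable, Dict, Hashable, List
from urllib.parse import urlparse


def split_name_dirs(value: str) -> List[str]:
    """Split a comma-separated dfs.namenode.name.dir value into local paths."""
    dirs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Entries may be plain paths or file:// URIs; keep only the path part.
        dirs.append(urlparse(entry).path)
    return dirs


def _device_id(path: str) -> int:
    """ID of the filesystem that holds `path` (not the physical disk)."""
    return os.stat(path).st_dev


def group_by_device(
    dirs: List[str],
    device_of: Callable[[str], Hashable] = _device_id,
) -> Dict[Hashable, List[str]]:
    """Group directories by the device they live on."""
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for d in dirs:
        groups[device_of(d)].append(d)
    return dict(groups)


def shared_devices(
    dirs: List[str],
    device_of: Callable[[str], Hashable] = _device_id,
) -> List[List[str]]:
    """Return the groups of directories that share a device (no real redundancy)."""
    return [paths for paths in group_by_device(dirs, device_of).values() if len(paths) > 1]
```

Run on the NameNode host, something like `shared_devices(split_name_dirs("/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/var/log/hadoop/hdfs/namenode"))` returns an empty list only if every directory sits on its own filesystem. Any group it does return is a set of copies that would be lost together if that one filesystem failed.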

Good luck and happy Hadooping!
