
All three namenode directories are pointing to same metadata.


All three of my NameNode directories contain the same metadata:


/hadoop/hdfs/namenode

/var/hadoop/hdfs/namenode

/var/log/hadoop/hdfs/namenode

What is the use of having three directories that hold the same data?

1 ACCEPTED SOLUTION


I doubt that these three particular directories are the very best answer to this problem, but the old "three directories for the NN metadata" practice came about long before a solid HA solution was available and, as https://twitter.com/LesterMartinATL/status/527340416002453504 points out, it was (and actually still is) all about disaster recovery. The old adage was to configure the NN to write to three different disks (via the directories): two local and one off the box, such as a remote mount point. Why?

Well... as you know, that darn metadata holds the keys to the whole file system, and if it ever gets lost then ALL of your data is unrecoverable!!

I personally think this is still valuable even with HA. The JournalNodes are focused on the edits files and do a great job of keeping that information on multiple machines, but the checkpoint image files only exist on the two NN nodes in an HA configuration and, well... I just like to sleep better at night.
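For anyone who wants to check how their own cluster is set up, here is a minimal sketch of reading the directory list out of hdfs-site.xml. It assumes the directories are listed under the `dfs.namenode.name.dir` property as a comma-separated value (the NN keeps a full copy of the metadata in every directory on that list, which is why all three look identical). The XML fragment and the `namenode_dirs` helper are made up for illustration, using the three paths from the question; your real file will have many more properties.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml fragment built from the paths in the question.
HDFS_SITE = """<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/var/log/hadoop/hdfs/namenode</value>
  </property>
</configuration>"""


def namenode_dirs(xml_text):
    """Return the directories listed under dfs.namenode.name.dir (illustrative helper)."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "dfs.namenode.name.dir":
            value = prop.findtext("value") or ""
            # The value is a comma-separated list; drop stray whitespace and empty entries.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


dirs = namenode_dirs(HDFS_SITE)
for d in dirs:
    print(d)
```

If all of the entries turn out to live on the same physical disk, the extra copies protect you from much less than the old adage intended, so it is worth confirming where each path is actually mounted.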

Good luck and happy Hadooping!

