namenode metadata directories

Expert Contributor

Hi,

I am trying to understand how NameNode metadata is written to its directories. If the NameNode metadata directory list contains two directories, does that mean the metadata is written to both directories in parallel for redundancy, so that two copies of the metadata are present?

Thanks.

1 ACCEPTED SOLUTION

Super Guru

@PJ

If you are just looking for redundancy, it is achieved by writing the namenode metadata edits to journal nodes (typically three), with both the standby and the active namenode pointing to the same journal nodes. When the active namenode goes down, ZooKeeper simply needs to make the standby node active, and it is already pointing to the same data, which is replicated on the three journal nodes.

If you don't have journal nodes and you have only one namenode, then the namenode writes its metadata to every directory you list in dfs.namenode.name.dir. It is still recommended that you use a RAID 10 array, or put the directories on separate disks, so one disk failure is not going to result in data loss.

To answer your question whether two copies of metadata are present: yes, if two directories are listed, each one holds a full copy. If you are using RAID 10, your disk array is also making a copy of blocks, but that's not really a copy in the sense you are asking.

If High Availability is enabled and you are using journal nodes, then you also have three copies of the metadata edits available on three different nodes.
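To check how many local metadata directories a namenode is configured with, one option is to read dfs.namenode.name.dir from hdfs-site.xml and split the comma-separated value. The snippet below is only a sketch: the XML fragment and both paths in it are made-up examples, and name_dirs is a hypothetical helper, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml fragment; the two paths are made-up examples.
HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/1/hdfs/namenode,/data/2/hdfs/namenode</value>
  </property>
</configuration>
"""

def name_dirs(xml_text):
    """Return the directories listed in dfs.namenode.name.dir (empty list if unset)."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "dfs.namenode.name.dir":
            value = prop.findtext("value") or ""
            # The property is a comma-separated list of local directories.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []

dirs = name_dirs(HDFS_SITE)
print(len(dirs), "metadata directories:", dirs)
```

With two entries in the list, as in the question, the helper returns two paths, one for each copy of the metadata.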

Expert Contributor
@mqureshi

Thanks so much for the information. But I am still confused about the NameNode directories and dfs.journalnode.edits.dir.

Where do the actual edits and fsimage files exist?

Super Guru

@PJ

The edits directory (dfs.journalnode.edits.dir) exists on the journal nodes, if that's what you are using. The namenode directories, which hold the fsimage, are on whatever disk you specify in Ambari for the namenode when you do your install. I think you will find the following link helpful.

https://hortonworks.com/blog/hdfs-metadata-directories-explained/
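If it helps to see which files are which, here is a rough sketch that sorts the file names found in a metadata directory's current/ subdirectory into fsimage and edits files. It assumes the usual naming (fsimage_<txid>, edits_<start>-<end>, edits_inprogress_<txid>); classify and scan are hypothetical helpers and the sample names are made up.

```python
import re
from pathlib import Path

# Assumed naming: fsimage_<txid>, edits_<start>-<end>, edits_inprogress_<txid>.
FSIMAGE = re.compile(r"^fsimage_\d+$")
EDITS = re.compile(r"^edits_(\d+-\d+|inprogress_\d+)$")

def classify(names):
    """Group file names into fsimage checkpoints, edit log segments, and the rest."""
    groups = {"fsimage": [], "edits": [], "other": []}
    for name in sorted(names):
        if FSIMAGE.match(name):
            groups["fsimage"].append(name)
        elif EDITS.match(name):
            groups["edits"].append(name)
        else:
            groups["other"].append(name)
    return groups

def scan(storage_dir):
    """Classify the files under <storage_dir>/current on a real node."""
    current = Path(storage_dir) / "current"
    return classify(p.name for p in current.iterdir() if p.is_file())

# Made-up listing, only to show the grouping.
sample = [
    "VERSION",
    "seen_txid",
    "fsimage_0000000000000000042",
    "fsimage_0000000000000000042.md5",
    "edits_0000000000000000001-0000000000000000042",
    "edits_inprogress_0000000000000000043",
]
print(classify(sample))
```

On a real node you would call scan with the directory that contains current/, for example one of the paths configured for the namenode.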