HDFS Federation understanding

Champion Alumni

Hello,

 

I am just trying to better understand HDFS Federation.

 

If I get it right:

- we should use it to split, for example, the real-time space from the batch space.

- if we want to split the namespace into N namespaces, then we need N NameNodes

 

Thank you!

 

Alina

GHERMAN Alina
1 ACCEPTED SOLUTION

Mentor
Your understanding seems right, but note that none of the 'splitting' is automatic.

In its simplest form, federation is a way to have multiple distinct NameNodes backed by a common set of DataNodes.

Effectively, it's running and managing two or more *separate* namespaces on top of the same storage space.

If you deploy two federated NameNodes, say hdfs://host-nn1/ and hdfs://host-nn2/, they will have nothing in common except the live DataNode (DN) hosts they share. A 'hadoop fs -ls' run against each will return completely independent results.
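
To make the "separate namespaces, shared DataNodes" idea concrete, here is a small toy model in Python. It is only a sketch and does not use any Hadoop API: the class names, the block pool IDs "BP-1" and "BP-2", the DataNode hostnames, and the file paths are all made up for illustration, and replication is simplified to "every block goes to every DataNode". It relies on one fact about federation: each NameNode owns its own block pool, and every DataNode stores blocks for all block pools.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy DataNode: holds blocks for every namespace, grouped by block pool ID."""
    hostname: str
    block_pools: dict = field(default_factory=dict)  # pool ID -> {block ID: data}

    def store(self, pool_id, block_id, data):
        self.block_pools.setdefault(pool_id, {})[block_id] = data


@dataclass
class NameNode:
    """Toy NameNode: owns exactly one namespace and one block pool."""
    uri: str
    pool_id: str
    datanodes: list
    namespace: dict = field(default_factory=dict)  # path -> block ID

    def put(self, path, data):
        block_id = f"blk_{len(self.namespace)}"
        self.namespace[path] = block_id
        # Simplification: write the block to every shared DataNode.
        for dn in self.datanodes:
            dn.store(self.pool_id, block_id, data)

    def ls(self):
        return sorted(self.namespace)


# One common set of DataNodes (hypothetical hostnames).
shared_dns = [DataNode("dn1"), DataNode("dn2")]

# Two federated NameNodes on top of the same DataNodes.
nn1 = NameNode("hdfs://host-nn1/", "BP-1", shared_dns)
nn2 = NameNode("hdfs://host-nn2/", "BP-2", shared_dns)

# Nothing is split automatically: the client chooses which NameNode to write to.
nn1.put("/realtime/events.log", b"stream data")
nn2.put("/batch/report.csv", b"batch data")

print(nn1.ls())                           # ['/realtime/events.log']
print(nn2.ls())                           # ['/batch/report.csv']
print(sorted(shared_dns[0].block_pools))  # ['BP-1', 'BP-2']
```

The two listings have no paths in common, while the same DataNode holds blocks from both block pools. That is the situation described above: independent namespaces that share only the storage layer.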
