Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

HDFS Federation understanding

Champion Alumni

Hello,

I am just trying to better understand HDFS Federation.

If I get it right:

- we should use it to split, for example, the real-time space from the batch space.

- if we want to split the namespace into N namespaces, then we need N NameNodes

Thank you!

Alina

GHERMAN Alina
1 ACCEPTED SOLUTION

Mentor
Your understanding seems right, but note that none of the 'splitting' is automatic.

In its simplest form, federation is a way to have multiple distinct NameNodes powered by a common set of DataNodes.

Effectively, it is running and managing 2 or more *separate* namespaces on top of the same storage space.

If you deploy two federated NameNodes, say hdfs://host-nn1/ and hdfs://host-nn2/, then they will have nothing in common except the live DataNodes (DNs) they share. A 'hadoop fs -ls' run against each will return completely independent results.
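
To make the "separate namespaces, shared DataNodes" idea concrete, here is a small toy model in Python. It does not use any Hadoop API; the NameNode class, the DataNode names (dn1..dn3) and the paths are all made up for illustration. It only mimics the behaviour described above: each NameNode keeps its own file listing, while both point at the same pool of DataNodes.

```python
class NameNode:
    """Toy stand-in for one federated NameNode (not a Hadoop class)."""

    def __init__(self, uri, datanodes):
        self.uri = uri
        self.datanodes = datanodes  # shared set of DataNode hostnames
        self.namespace = {}         # this NameNode's own path -> DataNode map

    def create(self, path):
        # Pick a DataNode deterministically; real block placement is far
        # more involved, this is only a placeholder.
        hosts = sorted(self.datanodes)
        self.namespace[path] = hosts[len(self.namespace) % len(hosts)]

    def ls(self):
        # Rough analogue of 'hadoop fs -ls' against this NameNode only.
        return sorted(self.namespace)


# One common set of DataNodes (hypothetical hostnames).
shared_datanodes = {"dn1", "dn2", "dn3"}

# Two independent NameNodes on top of the same storage.
nn1 = NameNode("hdfs://host-nn1/", shared_datanodes)
nn2 = NameNode("hdfs://host-nn2/", shared_datanodes)

# The split is manual: you decide which namespace each path goes into.
nn1.create("/realtime/events")
nn2.create("/batch/reports")

print(nn1.uri, nn1.ls())
print(nn2.uri, nn2.ls())
```

Each listing shows only the paths created through that NameNode, even though both objects reference the same DataNode set. That mirrors the point that nothing is split automatically: a client chooses which namespace to write into.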
