
HDFS Federation


Hi Guys,

Can I configure HDFS Federation using Ambari? If not, how can I configure it on an existing cluster that was created with Ambari? I mean, if possible, through some properties or something on the command line.

6 REPLIES

Re: HDFS Federation

Mentor

As far as I know, it's not a supported feature in HDP.

Here's the best resource for federation: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html

Similar questions have been asked: https://community.hortonworks.com/questions/11010/hdfs-federation-and-viewfs-support-for-multiple-cl...


Re: HDFS Federation

Mentor

It is not supported. Here's the latest document, http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_HDP_RelNotes/content/community_features.h..., clearly stating that we do not support community-driven features like Federation.


Re: HDFS Federation

My head hurts thinking of running at least two NNs, 3+ JNs, and 2 ZKFCs (assuming you could probably use the same ZK instances for each federated NN) for N federated members. Better buy a few more master nodes!
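To give a feel for the moving parts, here's a minimal hdfs-site.xml sketch of two federated nameservices, each with its own HA NameNode pair. All nameservice and host names below are hypothetical placeholders, not a supported HDP configuration:

```xml
<!-- Sketch only: two federated nameservices (ns1, ns2), each HA. -->
<!-- Every nameservice and hostname here is a made-up placeholder. -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>

<!-- HA NameNode pair for ns1 -->
<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>nn1-host.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>nn2-host.example.com:8020</value>
</property>

<!-- ns2 repeats the same pattern with its own NNs, JNs, and ZKFCs -->
<property>
  <name>dfs.ha.namenodes.ns2</name>
  <value>nn3,nn4</value>
</property>
```

Each additional federated nameservice multiplies the NameNode, JournalNode, and ZKFC footprint, which is exactly the master-node cost described above.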

[Image: 2684-ambari-meetup-namenode-ha-6-638.jpg (NameNode HA architecture slide)]

On the flip side, I'd always ask the "what are you trying to solve" question. Many times people are imagining security/visibility benefits, but we can already do that with permissions (POSIX + ACLs, all administered with Ranger ideally). And space issues won't get resolved with Federation; we can still leverage quotas.
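As a rough sketch of those alternatives (the paths, user names, and sizes below are made up for illustration):

```shell
# Hypothetical examples; paths, users, and sizes are placeholders.
# POSIX-style permissions plus an HDFS ACL for finer-grained visibility:
hdfs dfs -chmod 750 /data/projectA
hdfs dfs -setfacl -m user:alice:r-x /data/projectA

# Quotas instead of Federation for managing space:
hdfs dfsadmin -setSpaceQuota 10t /data/projectA   # cap raw space consumed
hdfs dfsadmin -setQuota 1000000 /data/projectA    # cap file/directory count
```

Both of these work against a single namespace, with no extra NameNodes required.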

Obviously I don't know the use case or requirement that might be driving a real need for Federation, but the cost is going to be high should we ever pull support for it into HDP and administer it from Ambari, and I personally don't think it is worth that cost.


Re: HDFS Federation

@Lester Martin

Can you please tell me: if I am using a Java application to request file upload/download operations against HDFS, where does the request go first? Is it ZooKeeper or the NameNode? Because if it's the NameNode, then I would need to change the address (to the respective NameNode's address) in my request URL every time the active/standby status changes. So I'm just wondering how I could use this effectively and reliably.

I just want to know for the sake of my knowledge, as I am confused about the whole HA architecture.

It would be great if you could provide some workflow-type diagram or something.


Re: HDFS Federation

Yes, the "hdfs dfs -put" (or "hadoop fs -put") command runs a Java application that itself uses the Hadoop client libraries. Under the covers, this app communicates with the NN for each block it needs to write to HDFS, and the NN hands back the specific DN names where it would like the replica copies to be stored. Then (again, for each block) the client writes to those DN processes (in a pipelining fashion) to get the actual data for the block persisted to disk.
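A minimal sketch of that client-side flow, assuming hadoop-client is on the classpath; the cluster URI and file path are placeholders, and this won't run without a live cluster:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutSketch {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        // The client only talks to the NN for metadata (block allocation);
        // the block data itself is pipelined directly to the DNs it is told about.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf);
             FSDataOutputStream out = fs.create(new Path("/tmp/example.txt"))) {
            out.writeBytes("hello hdfs\n");
        }
    }
}
```

The NN-for-metadata / DN-for-data split is the same whether you use the CLI or call the FileSystem API directly.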

The diagram at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#NameNode_and_D... shows some of this interaction, and companies like Hortonworks offer solid training to help with concepts like this. Additionally, I'm betting that http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1491901632/ has a detailed walkthrough as well.
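Regarding the active/standby addressing worry above: with HDFS HA, clients address a logical nameservice rather than a specific NN host, and a failover proxy provider finds the active NN for them, so the request URL never changes. A hypothetical client-side hdfs-site.xml sketch (nameservice and host names are placeholders):

```xml
<!-- Sketch only: logical nameservice "mycluster" with two NNs. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

A Java client then uses hdfs://mycluster/... regardless of which NN is currently active.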
