Co-locate Namenode and JournalNode on the same hosts?

New Contributor

I have a cluster with over 300 nodes. We run HA NameNodes (NN) and three JournalNodes (JN). I've always kept the NNs and JNs on different hosts, but I recently got a suggestion to co-locate them, or at least to co-locate two JNs with the NNs and put the third JN on an unrelated host. I have co-located them in small clusters in the past, and they are co-located in our dev environment. What's the best practice?

Accepted Solution

Re: Co-locate Namenode and JournalNode on the same hosts?

The JournalNodes can be quite I/O intensive, while the NameNode is generally more memory and CPU intensive, so one could justify co-locating them. However, they can compete for resources during checkpointing. More importantly, any delay in JournalNode writes affects the NameNode directly and shows up as higher RPC queue times.

With a cluster that size, I would always run the NameNode by itself. It's far too important to compromise by co-locating it with another highly active service.

As for the JournalNode, don't store the journal directories on an LVM volume that shares physical disks with the OS. Again, the JournalNode is quite I/O intensive, and I've seen it push slowness back to the NameNode (as higher RPC queue times) when it competes with the OS for the same physical disks.
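As a rough first-pass check for the disk-sharing point above, something like the following sketch could be run on a JournalNode host. It only compares filesystem device IDs, so it catches the case where the journal directory lives on the same filesystem as the OS. Under LVM, two logical volumes report different device IDs even when they sit on the same physical disks, so a `False` result here does not prove the disks are separate. The helper name `same_filesystem` and the example path are hypothetical; the real journal directory is whatever `dfs.journalnode.edits.dir` points to in your `hdfs-site.xml`.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem.

    Compares the st_dev device ID reported by stat(). This does NOT see
    through LVM: separate logical volumes can still share physical disks.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical journal directory; substitute the value of
    # dfs.journalnode.edits.dir from your own configuration.
    journal_dir = "/hadoop/hdfs/journal"
    if os.path.isdir(journal_dir):
        if same_filesystem("/", journal_dir):
            print("Journal directory shares a filesystem with the OS root")
        else:
            print("Different filesystems; still verify the physical disks")
    else:
        print("Journal directory not found on this host")
```

If the two paths turn out to be on different filesystems, tools such as `lsblk` or `pvs` are still needed to confirm which physical disks back each volume.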

Re: Co-locate Namenode and JournalNode on the same hosts?

Rising Star

> or at least co-locate two JN with NN, and the other JN on an unrelated host.

That is definitely bad advice. We run three JNs so that edits stay highly available by being written to a majority of them. If two JNs sit on the same machine, you lose both at once when that machine fails, which leaves only one of three and no majority. That is exactly the failure we want to avoid in the first place.
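The majority argument can be made concrete with a small sketch. It is plain arithmetic rather than anything from the HDFS code base, and the function name `quorum_survives` and the host names are made up for illustration. It assumes writes need acknowledgement from a strict majority of the configured JNs, as described above.

```python
def quorum_survives(jn_hosts, failed_host):
    """Return True if a majority of JournalNodes is still up after one host fails.

    jn_hosts lists the host of each JournalNode, one entry per JN, so a host
    that runs two JNs appears twice.
    """
    total = len(jn_hosts)
    majority = total // 2 + 1  # 2 out of 3, 3 out of 5, ...
    alive = sum(1 for host in jn_hosts if host != failed_host)
    return alive >= majority


if __name__ == "__main__":
    # Hypothetical layouts for a three-JN ensemble.
    separate = ["host-a", "host-b", "host-c"]  # one JN per host
    stacked = ["host-a", "host-a", "host-b"]   # two JNs on host-a

    for layout in (separate, stacked):
        for host in sorted(set(layout)):
            print(layout, "lose", host, "->", quorum_survives(layout, host))
```

With one JN per host, any single host can fail and two JNs remain. With two JNs on one host, losing that host leaves a single JN, which is below the majority of two.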