Hadoop nodes with different characteristics

Contributor

Good afternoon. I would like to know how to set up a cluster with nodes that have different amounts of RAM, for example:

slave1: 12 GB RAM
slave2: 12 GB RAM
slave3: 32 GB RAM
slave4: 32 GB RAM

slave1 and slave2 would form one group, and slave3 and slave4 another. The problem comes when setting the YARN parameter "yarn.nodemanager.resource.memory-mb": I do not know whether to set it for the first group or the second.

9 REPLIES

Master Mentor (Accepted Solution)

@Angel Chiluisa You need to leverage Ambari config groups: create a configuration group for each hardware profile, and you can then assign group-specific properties to it. This is available on the Ambari configuration page; look for the config groups option. The official Ambari documentation covers this in detail.
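For illustration, here is a minimal sketch of the same idea outside the UI, assuming Ambari's v1 REST API config_groups resource; the cluster name, group name, host names, and credentials below are hypothetical, and the exact payload fields can vary between Ambari versions:

```python
import requests

# Hypothetical Ambari server, cluster name, and credentials.
AMBARI = "http://ambari-server:8080/api/v1/clusters/mycluster"
AUTH = ("admin", "admin")
HEADERS = {"X-Requested-By": "ambari"}  # header required by the Ambari REST API

# One config group per hardware profile, overriding yarn-site only for its hosts.
config_group = [{
    "ConfigGroup": {
        "cluster_name": "mycluster",
        "group_name": "yarn-32gb-workers",        # hypothetical group name
        "tag": "YARN",
        "description": "YARN overrides for the 32 GB worker nodes",
        "hosts": [
            {"host_name": "slave3.example.com"},  # hypothetical FQDNs
            {"host_name": "slave4.example.com"},
        ],
        "desired_configs": [{
            "type": "yarn-site",
            "tag": "yarn-32gb-v1",
            "properties": {
                # Roughly 80% of 32 GB, leaving headroom for the OS.
                "yarn.nodemanager.resource.memory-mb": "25600",
            },
        }],
    }
}]

resp = requests.post(f"{AMBARI}/config_groups",
                     json=config_group, auth=AUTH, headers=HEADERS)
resp.raise_for_status()
print("Created config group:", resp.status_code)
```

The usual pattern is to size the Default group for the smaller (12 GB) nodes and let a config group like this override yarn.nodemanager.resource.memory-mb only for the larger hosts.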

Master Mentor

@Angel Chiluisa At the top of the configuration page you can toggle which group each setting is assigned to: there is the Default group, and you can switch to the other group you created. You can also use the filter text box in the upper right-hand corner to search for a property and see all available configurations; it will show you the values for both groups.
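If you would rather verify the assignments from a script than by clicking through the UI, here is a small sketch against the same assumed config_groups resource (server URL, credentials, and the exact response shape are assumptions):

```python
import requests

# Hypothetical Ambari server, cluster name, and credentials.
AMBARI = "http://ambari-server:8080/api/v1/clusters/mycluster"
AUTH = ("admin", "admin")
HEADERS = {"X-Requested-By": "ambari"}

# List every config group along with the hosts assigned to it.
resp = requests.get(f"{AMBARI}/config_groups?fields=*", auth=AUTH, headers=HEADERS)
resp.raise_for_status()
for item in resp.json().get("items", []):
    group = item["ConfigGroup"]
    hosts = [h["host_name"] for h in group.get("hosts", [])]
    print(group["group_name"], "->", hosts)
```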

Contributor

Excellent, thank you very much. This is the configuration I was looking for.

Master Mentor

Glad to help, @Angel Chiluisa!

New Contributor

I have a similar question: if different workers have different storage capacities or even CPU speeds, what is the impact on the overall cluster? We have been advised to keep all our worker nodes at the same spec, but we are not really sure why. How are we going to cope when we have a hardware refresh?

Expert Contributor

The primary design choice to make here is whether we need CPU scheduling (DRF) or not.

In clusters with varying CPU capacities, throughput differences may require increasing the timeout settings, because network socket timeouts can occur in heterogeneous clusters.

Another aspect is to ensure that each node has enough memory headroom left after the YARN allocation to prevent CPU hangs on the less capable nodes (typically 80% for YARN and 20% for the OS and other processes). Since some of the nodes have only 12 GB of RAM, you may also want to closely monitor the memory usage of processes that YARN is not aware of, especially the Ambari agent and Ambari Metrics, and check whether it grows over time.
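As a rough worked example of that 80/20 headroom rule for the node sizes in this thread (the 1024 MB minimum-allocation rounding is an assumption, not something stated above):

```python
# Rough sizing sketch: reserve ~20% of RAM for the OS, Ambari agent, Ambari Metrics, etc.,
# give YARN the rest, and round down to a multiple of the minimum container allocation.

MIN_ALLOCATION_MB = 1024  # assumed yarn.scheduler.minimum-allocation-mb

def yarn_memory_mb(total_ram_gb: int, os_reserve_fraction: float = 0.20) -> int:
    """Suggested yarn.nodemanager.resource.memory-mb for a node of this size."""
    usable_mb = int(total_ram_gb * 1024 * (1 - os_reserve_fraction))
    return (usable_mb // MIN_ALLOCATION_MB) * MIN_ALLOCATION_MB

for name, ram_gb in [("slave1", 12), ("slave2", 12), ("slave3", 32), ("slave4", 32)]:
    print(f"{name}: yarn.nodemanager.resource.memory-mb = {yarn_memory_mb(ram_gb)}")
```

With these assumptions the 12 GB nodes get 9216 MB and the 32 GB nodes get 25600 MB, which is why the two sizes belong in separate config groups.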

Contributor

Hi,

we have the same scenario in a CDH 5.10 cluster, with worker nodes that have different CPU and memory.

Is there a similar solution for the CDH distribution?

Thanks in advance.

Community Manager

@Ivoz As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.


Regards,

Diana Torres,
Community Moderator



Contributor

@DianaTorres Thank you!