Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Unbalanced tasks between hadoop servers

Hi,

 

I have two types of servers in my Hadoop cluster, one with 64 GB of RAM and the other with 128 GB. I created two templates: for the 64 GB servers I set the container memory to 58 GB, and for the stronger servers to 120 GB.
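For reference, the amount of memory each host offers to YARN containers is controlled per NodeManager by `yarn.nodemanager.resource.memory-mb` in yarn-site.xml, so the two templates would differ roughly like this (the MB values are my own illustrative conversions of 58 GB and 120 GB):

```xml
<!-- Template for the 64 GB hosts: offer ~58 GB to YARN containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>59392</value>
</property>

<!-- Template for the 128 GB hosts: offer ~120 GB to YARN containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>122880</value>
</property>
```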

 

I faced an issue where the tasks are distributed across the nodes symmetrically, so the servers with 58 GB of container memory reached 99% usage and triggered alerts on my monitoring system, while the stronger servers are still at 50% usage.

 

Appreciate any help.

Posts: 630
Topics: 3
Kudos: 102
Solutions: 66
Registered: ‎08-16-2016

Re: Unbalanced tasks between hadoop servers

This is a two-fold issue:

First, by default YARN assigns and schedules based on resource availability. It does not consider how much capacity remains on a node, only whether the node currently has enough free resources to run the container.

Second, it will always try for data locality: first the Node, then the Rack, and then the Cluster. You can tweak how this behaves if you are using the Fair Scheduler: by giving Node locality less weight (a short window in which to achieve it), containers become more likely to be placed on another node or rack. This still does not guarantee that the larger nodes will get used more.
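A sketch of the Fair Scheduler knobs I mean, in yarn-site.xml (the threshold is the fraction of cluster nodes whose scheduling opportunities the scheduler will pass up while waiting for locality; the values here are illustrative, not recommendations):

```xml
<!-- yarn-site.xml: relax locality so containers spill to other nodes sooner -->
<property>
  <name>yarn.scheduler.fair.locality.threshold.node</name>
  <!-- give up on node-local placement after passing up 10% of nodes -->
  <value>0.1</value>
</property>
<property>
  <name>yarn.scheduler.fair.locality.threshold.rack</name>
  <!-- give up on rack-local placement after passing up 20% of nodes -->
  <value>0.2</value>
</property>
```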

I haven't messed with YARN labels, but you could try this as a way to force the usage of the larger nodes. At the base of it, you assign labels and then tell the application to use said label. So if you partition the cluster into Large and Small nodes, you could then manually assign where apps run to make it more likely that the larger nodes use up more than 50% before the Small nodes reach capacity. This feels like an operational pain.
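If you do want to experiment with node labels, the admin side looks roughly like this (hostnames are made up, and note that labels require the Capacity Scheduler):

```shell
# Create the labels in the ResourceManager
yarn rmadmin -addToClusterNodeLabels "large,small"

# Assign hosts to labels (these hostnames are hypothetical)
yarn rmadmin -replaceLabelsOnNode "bignode01=large bignode02=large smallnode01=small"

# Verify what the cluster knows about
yarn cluster --list-node-labels
```

Applications then have to request a label, e.g. through a Capacity Scheduler queue's `accessible-node-labels`, or for Spark via `spark.yarn.executor.nodeLabelExpression`.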

In short, I don't recommend using heterogeneous hardware. Even when it is supported (think HDFS storage policies), it isn't the easiest feature to implement and operate.
Posts: 630
Topics: 3
Kudos: 102
Solutions: 66
Registered: ‎08-16-2016

Re: Unbalanced tasks between hadoop servers

Let's talk about the risks and challenges of operating like this. I would imagine that monitoring would be a pain, since your alarm thresholds are the same across all nodes while the nodes themselves are not. There is also the risk of burning out the Small nodes, since they will be running at capacity nearly all of the time.
Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: Unbalanced tasks between hadoop servers

Weird behaviour for the Resource Manager ...

 

It's expected that I will need to grow my cluster over time, and over time stronger servers appear on the market at the same price as the old ones. It doesn't make sense to replace all the cluster servers just because I want stronger servers in the cluster. HDFS already handles this with its per-DataNode storage policy, where the larger node stores more data rather than distributing round-robin.
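For what it's worth, the HDFS behaviour I'm referring to is the available-space block placement policy; as far as I know it is enabled with something like this in hdfs-site.xml (property names are from Apache HDFS, and the fraction value shown is the upstream default, not a tuned recommendation):

```xml
<!-- hdfs-site.xml: prefer DataNodes with more free space when placing blocks -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy</value>
</property>
<property>
  <!-- probability bias toward the lower-utilization DataNode -->
  <name>dfs.namenode.available-space-block-placement-policy.balanced-space-preference-fraction</name>
  <value>0.6</value>
</property>
```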

 

On the other hand, it doesn't make sense to reduce the container memory to a lower level just to quiet the monitoring and lose unused memory.

Posts: 630
Topics: 3
Kudos: 102
Solutions: 66
Registered: ‎08-16-2016

Re: Unbalanced tasks between hadoop servers

This is the design of Hadoop (HDFS and YARN). It loves to go small and wide. YARN did add the benefit of going deeper on memory. This is part of the pattern for most clustered systems; not all but most. I have done my fair share of migrations to new hardware for clusters because of this design pattern.
Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: Unbalanced tasks between hadoop servers

What I'm thinking now is to increase the memory of the smaller servers, since they have the same number of cores as the large ones.

 

I hope our systems team can do that and that we have such an option.

Champion
Posts: 543
Registered: ‎05-16-2016

Re: Unbalanced tasks between hadoop servers

Curious to know the type of task that you are firing: Spark or MapReduce jobs?

Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: Unbalanced tasks between hadoop servers

Spark
Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: Unbalanced tasks between hadoop servers

For now, I reduced the container memory on the small servers from 58 GB to 50 GB and dropped the Impala role from these nodes.

 

@mbigelow I'm planning to align all the Hadoop servers in my clusters.

 

One of the issues I also ran into was that one small server had the template of the larger servers, which impacted that server's memory.

 

I'm looking forward to adding monitoring to catch such cases.
