YARN jobs not running on all nodes

Hi

My cluster has 2 nodes, as shown in the image below.

9416-yarn.png

When I build cubes in Kylin, several jobs run on YARN at the same time. But when I check each job ID in the YARN ResourceManager UI, I see all jobs running on the first node (HDP02). Jobs only run on the second node (HDP03) once the first node has used all of its available RAM (4 GB).

I want the load balanced across the 2 nodes. I mean that while jobs are running, they should run on both nodes and share resources: at the same time, some jobs run on the first node (HDP02) and some on the second node (HDP03).

Is there any configuration that balances jobs between the two nodes?

Thanks

3 REPLIES

Re: YARN jobs not running on all nodes

Super Collaborator

I can only guess, but possibly the data used by the job is located only on this DataNode (not yet replicated, or your replication factor is 1). So, to reduce network operations and use 'short-circuit' reads, YARN utilizes the resources of this node first.

Re: YARN jobs not running on all nodes

@ssoldatov

Maybe my question was not clear. I have just updated it.

Re: YARN jobs not running on all nodes

Expert Contributor

Unfortunately not. At this point in time, resource allocation does not consider the utilization of the individual NodeManagers (NMs) in the cluster and does not distribute the load evenly. These points are discussed under the umbrella JIRA below:

https://issues.apache.org/jira/browse/YARN-2877
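
To see how containers are actually spread across the NodeManagers while the Kylin jobs run, you could poll the ResourceManager REST API instead of clicking through each job in the UI. The following is only a rough sketch: the RM address in the usage note and the sample numbers are assumptions made for illustration, and `fetch_nodes` and `summarize` are hypothetical helper names, not part of any Hadoop client library. It relies on the `/ws/v1/cluster/nodes` endpoint and its `nodeHostName`, `numContainers`, `usedMemoryMB` and `availMemoryMB` fields.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_url):
    """Fetch the NodeManager list from the ResourceManager REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes") as resp:
        payload = json.load(resp)
    # The API wraps the list as {"nodes": {"node": [...]}}.
    # "nodes" can be null when no NodeManager is registered.
    return (payload.get("nodes") or {}).get("node", [])


def summarize(nodes):
    """Return (host, containers, used_mb, total_mb) per node, sorted by host."""
    rows = []
    for n in nodes:
        used = n.get("usedMemoryMB", 0)
        avail = n.get("availMemoryMB", 0)
        rows.append((n["nodeHostName"], n.get("numContainers", 0), used, used + avail))
    return sorted(rows)


# Made-up sample shaped like the API response, mirroring the situation
# in the question: HDP02 is full while HDP03 is idle.
sample_nodes = [
    {"nodeHostName": "HDP03", "numContainers": 0, "usedMemoryMB": 0, "availMemoryMB": 4096},
    {"nodeHostName": "HDP02", "numContainers": 4, "usedMemoryMB": 4096, "availMemoryMB": 0},
]

for host, containers, used, total in summarize(sample_nodes):
    print(f"{host}: {containers} containers, {used}/{total} MB used")
```

Against a live cluster you would call `summarize(fetch_nodes("http://<rm-host>:8088"))`, where the host and port are placeholders for your own ResourceManager web address. This only reports the distribution; it does not change how the scheduler places containers.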
