Non localization by Resource Manager in YARN



New Contributor

Hadoop: The Definitive Guide says:

"Sometimes the locality constraint cannot be met, in which case either no allocation is made or, optionally, the constraint can be loosened. For example, if a specific node was requested but it is not possible to start a container on it (because other containers are running on it), then YARN will try to start a container on a node in the same rack, or, if that’s not possible, on any node in the cluster."

I don't understand this. Suppose my ApplicationMaster requests a container on the specific node where my block resides, and the ResourceManager cannot allocate a container there because the node has no free capacity. According to the passage above, it may then allocate the container on a different node. How does that still serve the purpose of data locality?


Re: Non localization by Resource Manager in YARN

Master Collaborator
Being on the same rack matters because nodes in a rack typically share a
network switch. If YARN can put your container on the node that holds the
data, that's the best option, because it minimizes network usage. If that
node can't host the container, the next best option is a machine on the
same switch, so the data travels as few hops as possible and only one
portion of the network is used. That is still as local to the data as it
can be. If that isn't possible either, YARN ignores locality and runs the
container on any node that has capacity, so the job still gets done.
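
To make the fallback order concrete, here is a minimal Python sketch of the node, then rack, then anywhere preference described above. It is a toy model, not YARN's scheduler code: the `Node` class, the `place_container` function, and the `relax_locality` parameter are hypothetical names made up for illustration, and real schedulers apply further logic (such as waiting a while for the preferred node to free up) that is left out here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str
    free_slots: int  # how many more containers this node can host


def place_container(nodes, preferred_node, relax_locality=True):
    """Toy placement: return (node_name, locality_level), or None.

    Hypothetical helper for illustration only; not a YARN API.
    """
    target = {n.name: n for n in nodes}[preferred_node]

    if target.free_slots > 0:
        # Best case: run on the node that holds the data.
        chosen, level = target, "NODE_LOCAL"
    elif not relax_locality:
        # Strict request: make no allocation rather than loosen it.
        return None
    else:
        same_rack = [n for n in nodes
                     if n.rack == target.rack and n.free_slots > 0]
        anywhere = [n for n in nodes if n.free_slots > 0]
        if same_rack:
            # Next best: same rack, so the data crosses one switch.
            chosen, level = same_rack[0], "RACK_LOCAL"
        elif anywhere:
            # Last resort: any node in the cluster with capacity.
            chosen, level = anywhere[0], "OFF_SWITCH"
        else:
            return None

    chosen.free_slots -= 1
    return chosen.name, level


cluster = [
    Node("n1", "rackA", 0),  # holds the block, but is full
    Node("n2", "rackA", 1),
    Node("n3", "rackB", 1),
]
print(place_container(cluster, "n1"))  # ('n2', 'RACK_LOCAL')
print(place_container(cluster, "n1"))  # ('n3', 'OFF_SWITCH')
print(place_container(cluster, "n1"))  # None
```

Here the first request falls back to `n2` because it shares a rack with the full node `n1`, and only the second request, once `rackA` has no capacity left, goes off-rack. Passing `relax_locality=False` models the "no allocation is made" case from the quoted passage.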