Support Questions

xavwebmaster · ‎02-18-2018

Hi.

If I run a job with YARN that uses HDFS data, I understand that YARN will look for hardware resources to run it.

But how does YARN interact with the NameNode?

In other words, YARN has to communicate with the NameNode at some point to find out where the HDFS files that the job requires are located. When does it do that?

Can someone please clarify this for me?

Regards

balavignesh_nag · ‎02-19-2018

Hi @Lanic

When you submit a job, it is YARN that provides information about the available resources. The driver gets the HDFS data locations needed to execute the job from the NameNode. The job's tasks are then scheduled on the available resources that are closest to the data. It is the NameNode that supplies the HDFS data location information YARN works with. Once all the jobs have completed, their statuses are updated and the corresponding metastore is brought in sync.
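To make the "closest to the data" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Hadoop or YARN code: the function `pick_node`, the host names, and the rack map are all made up for illustration, and the locality labels are just descriptive strings. It only shows the usual preference order: a node that holds a replica first, then a node on the same rack as a replica, then any free node.

```python
from typing import Dict, List, Optional, Tuple


def pick_node(replica_hosts: List[str],
              free_nodes: List[str],
              rack_of: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Pick a node for one block; returns (node, locality) or None.

    Hypothetical helper for illustration only, not a YARN API.
    """
    # 1. Node-local: a free node that already stores a replica of the block.
    for node in free_nodes:
        if node in replica_hosts:
            return node, "NODE_LOCAL"

    # 2. Rack-local: a free node on the same rack as some replica.
    replica_racks = {rack_of[h] for h in replica_hosts if h in rack_of}
    for node in free_nodes:
        if rack_of.get(node) in replica_racks:
            return node, "RACK_LOCAL"

    # 3. Anywhere else: the data has to cross racks.
    if free_nodes:
        return free_nodes[0], "OFF_SWITCH"
    return None


# Made-up cluster layout: two racks with two nodes each.
rack_of = {"node1": "/rack1", "node2": "/rack1",
           "node3": "/rack2", "node4": "/rack2"}
replicas = ["node1", "node3"]  # hosts holding the block (assumed)

print(pick_node(replicas, ["node2", "node3"], rack_of))  # ('node3', 'NODE_LOCAL')
print(pick_node(replicas, ["node2"], rack_of))           # ('node2', 'RACK_LOCAL')
```

A real scheduler also weighs queue capacity, delay scheduling, and container sizes, none of which is modeled here.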

Hope it Helps!!

kgautam · ‎02-19-2018

Just to be more specific:

1. The driver talks to the NameNode to find the locations of the HDFS blocks.
2. That info is made available to the AM (ApplicationMaster).
3. The driver requests an AM, and the AM requests the required resources based on the block info.
4. YARN itself has no need to talk to the NameNode directly.
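
The four steps above can be sketched as a small simulation. This is a toy model in plain Python under stated assumptions, not the Hadoop client API: `ToyNameNode`, `ToyResourceManager`, `driver_compute_splits`, `am_build_requests`, the path `/data/input.txt`, and the host names are all hypothetical. The point it illustrates is that the resource manager only ever sees container requests carrying preferred hosts and never holds a reference to the NameNode.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    block_id: str
    hosts: List[str]  # DataNodes holding a replica of this block


@dataclass
class ContainerRequest:
    split_id: str
    preferred_hosts: List[str]  # locality hint derived from block info


class ToyNameNode:
    """Stand-in for the NameNode: maps a file path to its blocks."""

    def __init__(self, files: Dict[str, List[Block]]):
        self.files = files

    def get_block_locations(self, path: str) -> List[Block]:
        return self.files[path]


class ToyResourceManager:
    """Stand-in for YARN: sees only requests, never the NameNode."""

    def __init__(self, free_hosts: List[str]):
        self.free_hosts = list(free_hosts)

    def allocate(self, req: ContainerRequest) -> str:
        # Prefer a host named in the request, else fall back to any free host.
        # Capacity is not consumed in this toy model.
        for host in req.preferred_hosts:
            if host in self.free_hosts:
                return host
        return self.free_hosts[0]


def driver_compute_splits(namenode: ToyNameNode, path: str) -> List[Block]:
    # Step 1: the driver asks the NameNode where the blocks live.
    return namenode.get_block_locations(path)


def am_build_requests(splits: List[Block]) -> List[ContainerRequest]:
    # Steps 2-3: the AM turns block info into resource requests.
    return [ContainerRequest(b.block_id, list(b.hosts)) for b in splits]


namenode = ToyNameNode({
    "/data/input.txt": [Block("blk_1", ["node1", "node2"]),
                        Block("blk_2", ["node3", "node4"])],
})
rm = ToyResourceManager(["node2", "node4", "node5"])

splits = driver_compute_splits(namenode, "/data/input.txt")
requests = am_build_requests(splits)
# Step 4: the resource manager works from the requests alone.
placement = {r.split_id: rm.allocate(r) for r in requests}
print(placement)  # {'blk_1': 'node2', 'blk_2': 'node4'}
```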

Cloudera Community

When does YARN communicate with the NameNode when executing a job?