When does YARN communicate with the NameNode while executing a job?

Hi.

If I run a job with YARN that uses HDFS data, I understand that YARN will look for hardware resources to run it.

But how does YARN interact with the NameNode?

In other words, YARN has to communicate with the NameNode at some point to find out where the HDFS files the job requires are located. When does it do that?

Can someone please clarify the matter for me?

Regards

1 ACCEPTED SOLUTION

Re: When does YARN communicate with the NameNode while executing a job?

Hi @Lanic

When you submit a job, it's YARN that provides the information about available resources. The driver gets the HDFS data locations needed to run the job from the NameNode. Resources are then chosen from the available nodes closest to the data, and the tasks are executed there. The NameNode is the source of the HDFS data-location information that YARN's placement relies on. Once all the tasks have completed, their status is reported back and the corresponding metastore is brought in sync.
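To illustrate the "nearest available resource" idea, here is a rough sketch in plain Python. It is a toy model, not YARN's scheduler: `pick_node`, the node names, and the `rack_of` map are all made up for the example, and real schedulers apply additional policies on top of this.

```python
def pick_node(block_hosts, free_nodes, rack_of):
    """Toy locality choice for one task: node-local, then rack-local, then any.

    block_hosts: hosts holding replicas of the block (as reported by the NameNode)
    free_nodes:  set of nodes that currently have capacity
    rack_of:     hypothetical host -> rack mapping
    """
    # 1. Prefer a free node that already stores a replica of the block.
    for host in block_hosts:
        if host in free_nodes:
            return host, "node-local"

    # 2. Otherwise prefer a free node in the same rack as some replica.
    block_racks = {rack_of[h] for h in block_hosts if h in rack_of}
    for node in sorted(free_nodes):
        if rack_of.get(node) in block_racks:
            return node, "rack-local"

    # 3. Otherwise fall back to any free node (data is read over the network).
    if free_nodes:
        return sorted(free_nodes)[0], "off-rack"

    # Nothing is free right now.
    return None, "none"


# Hypothetical four-node, two-rack cluster used only for this example.
rack_of = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2"}
```

For example, `pick_node(["n1", "n3"], {"n3", "n4"}, rack_of)` picks `n3`, because it is free and holds a replica of the block.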

Hope it Helps!!

Re: When does YARN communicate with the NameNode while executing a job?

Just to be more specific:

1. The driver talks to the NameNode to find the locations of the HDFS blocks.
2. That information is made available to the ApplicationMaster (AM).
3. The driver requests an AM, and the AM requests the required resources based on the block information.
4. YARN itself has no need to talk to the NameNode directly.
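The steps above can be sketched as a small simulation. This is only a toy model in plain Python, not Hadoop code: the `NameNode` and `ResourceManager` classes, their methods, and the path `/data/input` are hypothetical stand-ins. The point is just to show who calls whom.

```python
class NameNode:
    """Toy NameNode: maps a file path to the hosts of each of its blocks."""

    def __init__(self, block_map):
        self.block_map = block_map
        self.callers = []  # records who asked for block locations

    def get_block_locations(self, caller, path):
        self.callers.append(caller)
        return self.block_map[path]


class ResourceManager:
    """Toy YARN side: only ever sees host preferences, never HDFS paths."""

    def __init__(self):
        self.requests = []

    def allocate(self, preferred_hosts):
        self.requests.append(list(preferred_hosts))
        # Grant the first preferred host; a real scheduler is far more involved.
        return preferred_hosts[0] if preferred_hosts else "any-node"


def run_job(path, namenode, rm):
    # Step 1: the driver asks the NameNode where the blocks live.
    locations = namenode.get_block_locations("driver", path)
    # Steps 2-3: the AM receives that info and requests one container per block.
    return [rm.allocate(hosts) for hosts in locations]


nn = NameNode({"/data/input": [["n1", "n2"], ["n3"]]})
rm = ResourceManager()
placements = run_job("/data/input", nn, rm)
```

After the run, `nn.callers` contains only `"driver"`, which matches point 4: in this model the resource manager never contacts the NameNode, it only receives the host preferences derived from the block locations.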
