
Yarn and submitting MapReduce jobs

New Contributor

Hi,

 

I used Cloudera Manager to set up a Cloudera cluster in EC2 and configured it to run YARN instead of MRv1. However, whenever I submit a job, it gets accepted and is listed in the ResourceManager web UI as accepted, with final status undefined and tracking undefined. The job never starts, and I can find no errors whatsoever in any logs. Does anybody have an idea why this could be? I suspect the ResourceManager decides there aren't enough resources available to run the job; however, I have the tasks configured to run with 512 MB, and that is definitely available...

 

Thanks!

Mike

1 ACCEPTED SOLUTION

Mentor

@mikopp wrote:

I suspect the ResourceManager decides there aren't enough resources available to run the job; however, I have the tasks configured to run with 512 MB, and that is definitely available...


Your suspicion is essentially correct given the described symptoms. The resources the RM sees are not tied to the actual hardware; they are whatever each NodeManager (NM) publishes, per its yarn.nodemanager.resource.memory-mb configuration.
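
For illustration (the value below is only an example, not a recommendation), that advertised capacity is set per NodeManager host, e.g. in yarn-site.xml; in Cloudera Manager the same property is exposed under the NodeManager role's configuration:

  <!-- yarn-site.xml on a NodeManager host (illustrative value only): the total
       memory this NM offers to YARN containers, i.e. what the RM sees for this
       node, regardless of the physical RAM actually installed -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>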

 

Try to resubmit after lowering the values of the following properties (defaults indicated); a quick command-line sketch follows the list:

  • mapreduce.map.memory.mb (1024 MB)
  • mapreduce.reduce.memory.mb (1024 MB)
  • yarn.app.mapreduce.am.resource.mb (1536 MB)
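
For a quick test you can also override these per job at submit time. A minimal sketch, assuming a parcel-based CDH install (the examples jar path may differ on your cluster):

  hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi \
    -Dmapreduce.map.memory.mb=512 \
    -Dmapreduce.reduce.memory.mb=512 \
    -Dyarn.app.mapreduce.am.resource.mb=512 \
    2 10

If the job now leaves the ACCEPTED state, the per-node capacity was simply too small for the container sizes being requested.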


9 REPLIES


New Contributor

Thanks Harsh,

 

That was the tip I needed. I had already changed the map and reduce memory (and pretty much everything else), but I didn't know about yarn.app.mapreduce.am.resource.mb, and for some reason that one was misconfigured.

 

The EC2 setup done by Cloudera Manager put 512 MB on my data nodes and 1.5 GB on the node that ran all the master services, i.e. the wrong way around. After giving the worker nodes 2.5 GB and reducing the master node to 512 MB, jobs started running.

I used Cloudera Manager to set things up, so this might be a bug there.
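
In case it helps anyone else, one way to double-check what each node is actually advertising to the ResourceManager is the standard YARN CLI (substitute a node id printed by the first command):

  # list the NodeManagers the ResourceManager currently knows about
  yarn node -list
  # inspect one of them; the status report includes its memory capacity
  yarn node -status <node-id-from-the-list>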

 

Thanks again!

 

Best

Mike

Explorer

I am also facing a similar issue. Let me see if this solution works.

Explorer

I have successfully enabled the YARN framework in CDH 4.5 but am not able to execute a MapReduce application.
Please find the attached doc.

Am I missing something?

I can easily do the same on Apache Hadoop 2.0 by following numerous blogs and forums, but not on the highly customized CDH 4.5.

New Contributor

Lowering the parameters didn't work for me. Is there any other information I can verify?

 

-Shankar

New Contributor

Hello Guys,

I have created an EC2 cluster and installed CDH 5 (I just followed the instructions on the Cloudera website), but whenever I start Cloudera Manager, the CPU usage always shows 100%. None of the MapReduce jobs run; when I submit one, it does not actually run and stops at the step below.

14/11/13 23:17:18 INFO mapreduce.Job: Running job: job_1415938229059_0001

 

When I check the final status, it says undefined. I believe it might be an issue with some configuration.

 

Any pointers?

 

--Vijay

New Contributor

I am using m3.large instances... is that a problem?

New Contributor

Fixed it. The problem was with the YARN settings; I set the default values via Dynamic Resource Pools.