Mapreduce job just hangs in CDH 5.3.4




I have a single box with CDH 5.3.4 installed and I'm trying to run a test MapReduce job to confirm that things are set up correctly.


$ sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.4.jar pi 2 2
Number of Maps  = 2
Samples per Map = 2
Wrote input for Map #0
Wrote input for Map #1
Starting Job
16/11/16 21:56:08 INFO client.RMProxy: Connecting to ResourceManager at
16/11/16 21:56:09 INFO input.FileInputFormat: Total input paths to process : 2
16/11/16 21:56:09 INFO mapreduce.JobSubmitter: number of splits:2
16/11/16 21:56:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1479336097366_0002
16/11/16 21:56:09 INFO impl.YarnClientImpl: Submitted application application_1479336097366_0002
16/11/16 21:56:09 INFO mapreduce.Job: The url to track the job:
16/11/16 21:56:09 INFO mapreduce.Job: Running job: job_1479336097366_0002

However, this job just sits in the PREP state forever.


$ mapred job -list
16/11/16 22:11:45 INFO client.RMProxy: Connecting to ResourceManager at
Total jobs:1
                  JobId	     State	     StartTime	    UserName	       Queue	  Priority	 UsedContainers	 RsvdContainers	 UsedMem	 RsvdMem	 NeededMem	   AM info
 job_1479336097366_0002	      PREP	 1479351369780	        hdfs	   root.hdfs	    NORMAL	              0	              0	      0M	      0M	        0M

I'm assuming that a memory limit or some other configuration setting is blocking this job from running, but I can't nail down where the issue is. Can anyone provide some tips on how to debug and resolve this?


Re: Mapreduce job just hangs in CDH 5.3.4


If you are just starting out with a cluster, I would strongly recommend that you use a CDH release much later than CDH 5.3. The later releases (CDH 5.8 or CDH 5.9) are far more stable than the one you are trying to use now. Even if you stick with CDH 5.3, at least use the latest maintenance release.


Back to your question. There could be multiple things that prevent your job from starting. The first thing to check is the RM web UI, to see what state the cluster and the scheduler are in. After that it depends on what the RM UI shows you...
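
If you would rather check from the command line than click through the UI, the same cluster-level numbers are exposed by the ResourceManager REST API under /ws/v1/cluster/metrics. Below is a rough sketch in Python, not an official tool. The host and port (localhost:8088, the default RM web port) are assumptions about your setup, and the diagnose() helper and the hints it prints are my own illustration of what to look for, not an exhaustive list of causes.

```python
import json
from urllib.request import urlopen

# Assumption: the RM web/REST endpoint runs on this box on the default port.
RM_URL = "http://localhost:8088"


def fetch_cluster_metrics(rm_url=RM_URL):
    """Return the clusterMetrics object from the RM REST API."""
    with urlopen(rm_url + "/ws/v1/cluster/metrics", timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def diagnose(metrics):
    """Return hints (hypothetical helper) for a job that never leaves PREP."""
    hints = []
    active = metrics.get("activeNodes", 0)
    if active == 0:
        # With no registered NodeManager there is nowhere to launch the AM.
        hints.append("No active NodeManagers: check that the NodeManager is "
                     "running and has registered with the RM.")
    if metrics.get("unhealthyNodes", 0) > 0:
        # Unhealthy nodes are excluded from scheduling.
        hints.append("Unhealthy NodeManagers present: check the node health "
                     "report (often a local disk usage problem).")
    if active > 0 and metrics.get("appsPending", 0) > 0 \
            and metrics.get("availableMB", 0) == 0:
        # Nodes are up, but the scheduler has no free memory to hand out.
        hints.append("Applications pending with 0 MB available: compare the "
                     "NodeManager memory setting with the AM container size.")
    return hints
```

To use it against a live RM, call `diagnose(fetch_cluster_metrics())` and print each returned hint. An empty list only means none of these three conditions matched; the scheduler page in the RM UI is still the next place to look.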


