Reply
Highlighted
New Contributor
Posts: 8
Registered: ‎12-08-2013

MR2 job was hanged.

I created a hive query in  hue, then a MR2 job running.

 

The job was consist of 60 maps and 1 reduce task.

 

But the job was hanged after 41 map task were finished.

 

following are some logs, any suggestions are appreciated:

 

2013-12-10 10:24:30,873 INFO [IPC Server handler 22 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1384943020267_0005_r_000000_0
2013-12-10 10:24:30,874 INFO [IPC Server handler 22 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1384943020267_0005_r_000000_0 is : 0.27222222
2013-12-10 10:24:30,982 INFO [IPC Server handler 23 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1384943020267_0005_r_000000_0. startIndex 49 maxEvents 10000
2013-12-10 10:24:31,986 INFO [IPC Server handler 24 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1384943020267_0005_r_000000_0. startIndex 49 maxEvents 10000
2013-12-10 10:24:32,990 INFO [IPC Server handler 25 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: MapCompletionEvents request from attempt_1384943020267_0005_r_000000_0. startIndex 49 maxEvents 10000
2013-12-10 10:24:33,912 INFO [IPC Server handler 26 on 37483] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1384943020267_0005_r_000000_0
Cloudera Employee
Posts: 723
Registered: ‎07-30-2013

Re: MR2 job was hanged.

Did you look at the stderr/stdout/syslog of the running task in JobBrowser? (there is no error in the above logs).

 

Does the query work with less data?

New Contributor
Posts: 8
Registered: ‎12-08-2013

Re: MR2 job was hanged.

It seems that job hanged on reduce task

 

here is logs:

 

put of size: 15578, inMemoryMapOutputs.size() -> 44, commitMemory -> 647272, usedMemory ->662850
2013-12-11 11:28:51,319 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#9 about to shuffle output of map attempt_1386731989012_0001_m_000044_0 decomp: 15169 len: 4834 to MEMORY
2013-12-11 11:28:51,338 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 15169 bytes from map-output for attempt_1386731989012_0001_m_000044_0
2013-12-11 11:28:51,338 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 15169, inMemoryMapOutputs.size() -> 45, commitMemory -> 662850, usedMemory ->678019
2013-12-11 11:28:51,339 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#9 about to shuffle output of map attempt_1386731989012_0001_m_000045_0 decomp: 14701 len: 4714 to MEMORY
2013-12-11 11:28:51,339 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 14701 bytes from map-output for attempt_1386731989012_0001_m_000045_0
2013-12-11 11:28:51,339 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 14701, inMemoryMapOutputs.size() -> 46, commitMemory -> 678019, usedMemory ->692720
2013-12-11 11:28:51,340 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#9 about to shuffle output of map attempt_1386731989012_0001_m_000046_0 decomp: 13291 len: 4365 to MEMORY
2013-12-11 11:28:51,340 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 13291 bytes from map-output for attempt_1386731989012_0001_m_000046_0
2013-12-11 11:28:51,341 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 13291, inMemoryMapOutputs.size() -> 47, commitMemory -> 692720, usedMemory ->706011
2013-12-11 11:28:51,341 INFO [fetcher#9] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: one-705:8080 freed by fetcher#9 in 81s
2013-12-11 11:28:51,342 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Assiging one-705:8080 with 2 to fetcher#10
2013-12-11 11:28:51,342 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: assigned 2 of 2 to one-705:8080 to fetcher#10
2013-12-11 11:28:51,347 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=8080/mapOutput?job=job_1386731989012_0001&reduce=0&map=attempt_1386731989012_0001_m_000047_0,attempt_1386731989012_0001_m_000048_0 sent hash and received reply
2013-12-11 11:28:51,349 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#10 about to shuffle output of map attempt_1386731989012_0001_m_000047_0 decomp: 15087 len: 4822 to MEMORY
2013-12-11 11:28:51,350 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 15087 bytes from map-output for attempt_1386731989012_0001_m_000047_0
2013-12-11 11:28:51,350 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 15087, inMemoryMapOutputs.size() -> 48, commitMemory -> 706011, usedMemory ->721098
2013-12-11 11:28:51,352 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#10 about to shuffle output of map attempt_1386731989012_0001_m_000048_0 decomp: 15049 len: 4851 to MEMORY
2013-12-11 11:28:51,352 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 15049 bytes from map-output for attempt_1386731989012_0001_m_000048_0
2013-12-11 11:28:51,352 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 15049, inMemoryMapOutputs.size() -> 49, commitMemory -> 721098, usedMemory ->736147
2013-12-11 11:28:51,353 INFO [fetcher#10] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: one-705:8080 freed by fetcher#10 in 11s
2013-12-11 11:30:03,335 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 4111 may have finished in the interim.

 

When I run a query only have map tasks, it worked properly.