
MapReduce job fails

Expert Contributor

What happens in the last phase of a MapReduce job? I notice that the failed jobs in my cluster fail at the last reducer, after 4 retries.

I have done a bit of troubleshooting; I see this failure when mapreduce.input.fileinputformat.numinputfiles was 210, compared to 90 for the successful job.

2017-09-19 16:54:43,759 ERROR [main] com.sas.ci.acs.extract.CXAService: Found two load events with the same timestamp, discard this session c7beed7d082b5800baff686b 
2017-09-19 16:54:43,759 INFO [main] com.sas.ci.acs.extract.CXAService: No event to process in the specified hr and non_hr range for session: c7c05fbe822b5800c44dc447 
2017-09-19 17:00:05,412 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.lang.OutOfMemoryError: GC overhead limit exceeded 
at java.io.BufferedReader.<init>(BufferedReader.java:98) 
at java.io.BufferedReader.<init>(BufferedReader.java:109) 
at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:541) 
at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.updateProcessTree(ProcfsBasedProcessTree.java:223) 
at org.apache.hadoop.mapred.Task.updateResourceCounters(Task.java:898) 
at org.apache.hadoop.mapred.Task.updateCounters(Task.java:1067) 
at org.apache.hadoop.mapred.Task.access$500(Task.java:82) 
at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:786) 
at java.lang.Thread.run(Thread.java:745) 

2017-09-19 17:00:33,982 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded

2017-09-20 10:04:41,146 WARN [ResponseProcessor for block BP-71764089-10.239.121.82-1481226593627:blk_1083690667_9963612] org.apache.hadoop.hdfs.DFSClient: Slow ReadProcessor read fields took 31887ms (threshold=30000ms); ack: seqno: 1 reply: 0 reply: 0 reply: 0 downstreamAckTimeNanos: 883371, targets: [DatanodeInfoWithStorage[10.239.121.39:50010,DS-36b13b93-0fc6-4b25-b364-00d8a5396498,DISK], DatanodeInfoWithStorage[10.239.121.209:50010,DS-17f67f74-6178-458f-a482-9d016aeb15c1,DISK], DatanodeInfoWithStorage[10.239.121.56:50010,DS-0ca06f40-2712-4f3c-ad07-da714a55084c,DISK]]
2017-09-20 10:04:41,231 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.util.Arrays.copyOf(Arrays.java:2219)
	at java.util.ArrayList.grow(ArrayList.java:242)
	at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216)
	at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208)
	at java.util.ArrayList.add(ArrayList.java:440)
	at java.lang.String.split(String.java:2288)
	at java.lang.String.split(String.java:2355)
	at com.sas.ci.acs.extract.CXAService$myReduce.parseEvent(CXAService.java:1596)
	at com.sas.ci.acs.extract.CXAService$myReduce.reduce(CXAService.java:919)
	at com.sas.ci.acs.extract.CXAService$myReduce.reduce(CXAService.java:237)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2017-09-20 10:04:41,332 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system...
2017-09-20 10:04:41,333 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped.
2017-09-20 10:04:41,333 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete.

Any reason why the last reducer fails?

Should we modify these properties, given that mapred.child.java.opts is deprecated?

mapreduce.map.java.opts = -Djava.net.preferIPv4Stack=true -Xmx3865051136
mapreduce.reduce.java.opts = -Djava.net.preferIPv4Stack=true -Xmx6144067296
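
For reference, these settings can also be overridden per job from the driver rather than cluster-wide. A minimal sketch, assuming a new-API driver (the class name and values below are illustrative, not recommendations); the general rule is to keep -Xmx well below mapreduce.reduce.memory.mb so the container has headroom for off-heap memory:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryOverrideExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Container size for each reduce task, in MB (illustrative value).
        conf.set("mapreduce.reduce.memory.mb", "8192");
        // JVM heap for the reduce task; keeping -Xmx roughly 20% below the
        // container size leaves room for off-heap allocations, otherwise
        // YARN can kill the container for exceeding its physical limit.
        conf.set("mapreduce.reduce.java.opts",
                "-Djava.net.preferIPv4Stack=true -Xmx6553m");
        Job job = Job.getInstance(conf, "memory-override-example");
        // ... configure mapper/reducer/input/output as usual, then:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}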

Re: MapReduce job fails

Expert Contributor

This was a data skew issue; it was resolved after increasing mapreduce.reduce.java.opts and mapreduce.reduce.memory.mb.
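
Raising reducer memory works around the skew; if one session key carries far more events than the rest, spreading that key across reducers also helps. A minimal sketch of map-side key salting, assuming the new mapreduce API (the class name, salt fan-out, and "#" separator are all hypothetical); note that a second pass is needed afterwards to strip the salt and re-combine the partial results per original key:

import java.io.IOException;
import java.util.Random;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: salt the key so one very large session is split across up to
// NUM_SALTS reducers instead of landing on a single one.
public class SaltingMapper extends Mapper<Text, Text, Text, Text> {
    private static final int NUM_SALTS = 16;   // illustrative fan-out
    private final Random random = new Random();
    private final Text saltedKey = new Text();

    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        // The default HashPartitioner now distributes the same logical key
        // across several partitions; a follow-up job merges the per-salt
        // partial results back into one record per original key.
        saltedKey.set(key.toString() + "#" + random.nextInt(NUM_SALTS));
        context.write(saltedKey, value);
    }
}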
