MR jobs failed on IO and java heap size

Master Collaborator
Hi,
 
I have an MR job running with 30 reducers. One of the reducers fails with the error below once it reaches a specific percentage:
 
I increased the reducer memory, but with no success. I am investigating the data to see whether a specific key has a lot of values and caused this.
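
A minimal sketch of that kind of skew check: count records per key on a sample of the reducer input and look at the heaviest keys. The tab-separated `key<TAB>value` layout and the helper name `top_keys` are assumptions for illustration, not details of this job.

```python
from collections import Counter

def top_keys(lines, n=5, sep="\t"):
    """Return the n most frequent keys, assuming 'key<sep>value' records."""
    # Take everything before the first separator as the key; skip blank lines.
    counts = Counter(line.split(sep, 1)[0] for line in lines if line.strip())
    return counts.most_common(n)

# Hypothetical sample; in practice this would be a sample of the map output.
sample = ["a\t1", "b\t2", "a\t3", "a\t4", "c\t5"]
print(top_keys(sample, n=1))
```

If one key accounts for a large share of the records, the reducer that receives it may need far more heap than the others.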
 
What I can't tell from the error below is whether this is an IO issue, possibly limited connections between nodes, in which case I need to increase the ulimit, or a memory issue, in which case I need to increase the memory further.
 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/liveperson/hadoop/parcels/CDH-5.5.4-1.cdh5.5.4.p0.9/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/liveperson/data/server_hdfs/data/disk6/yarn/nm/usercache/lereports/appcache/application_1486847749225_242069/filecache/10/job.jar/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Halting due to Out Of Memory Error...
 
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "DataStreamer for file /liveperson/data/server_live-engage-mr/output/1490104329046-bi_contribution_xsess/_temporary/1/_temporary/attempt_1486847749225_242069_r_000001_0/RAWDATA-b_default-RPT_FA_CONTRIBUTION_XSESSION-r-00001 block BP-1370881566-172.16.144.147-1434971434689:blk_1283227391_209512600"
Mar 21, 2017 10:34:10 AM com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
at com.datastax.shaded.netty.buffer.HeapChannelBuffer.<init>(HeapChannelBuffer.java:42)
at com.datastax.shaded.netty.buffer.BigEndianHeapChannelBuffer.<init>(BigEndianHeapChannelBuffer.java:34)
at com.datastax.shaded.netty.buffer.ChannelBuffers.buffer(ChannelBuffers.java:134)
at com.datastax.shaded.netty.buffer.HeapChannelBufferFactory.getBuffer(HeapChannelBufferFactory.java:68)
at com.datastax.shaded.netty.buffer.AbstractChannelBufferFactory.getBuffer(AbstractChannelBufferFactory.java:48)
at com.datastax.shaded.netty.channel.socket.nio.NioWorker.read(NioWorker.java:80)
at com.datastax.shaded.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at com.datastax.shaded.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at com.datastax.shaded.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at com.datastax.shaded.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at com.datastax.shaded.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
 
Mar 21, 2017 11:02:11 AM com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
 
Mar 21, 2017 11:08:50 AM com.datastax.shaded.netty.util.HashedWheelTimer
WARNING: An exception was thrown by TimerTask.
java.lang.OutOfMemoryError: Java heap space
 
Halting due to Out Of Memory Error...
Mar 21, 2017 11:11:10 AM com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
 
Mar 21, 2017 11:27:11 AM com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
 
 
Log Type: stdout
Log Upload Time: Tue Mar 21 13:49:54 -0400 2017
Log Length: 0
 
Log Type: syslog
Log Upload Time: Tue Mar 21 13:49:54 -0400 2017
Log Length: 3997692
Showing 4096 bytes of 3997692 total.
enewer.renew(LeaseRenewer.java:423)
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:448)
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:304)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.protobuf.ServiceException: java.lang.RuntimeException: unexpected checked exception
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:244)
at com.sun.proxy.$Proxy13.renewLease(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:571)
... 12 more
Caused by: java.lang.RuntimeException: unexpected checked exception
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1059)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
... 14 more
Caused by: java.lang.OutOfMemoryError: Java heap space
2017-03-21 11:01:36,323 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "svpr-dhc024.lpdomain.com/172.16.144.172"; destination host is: "svpr-dhc012.lpdomain.com":45935; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1470)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:243)
at com.sun.proxy.$Proxy8.ping(Unknown Source)
at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:782)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:790)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1519)
at org.apache.hadoop.ipc.Client.call(Client.java:1442)
... 5 more
Caused by: java.lang.OutOfMemoryError: Java heap space
 
2017-03-21 11:06:36,820 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "svpr-dhc024.lpdomain.com/172.16.144.172"; destination host is: "svpr-dhc012.lpdomain.com":45935; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1470)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:243)
at com.sun.proxy.$Proxy8.ping(Unknown Source)
at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:782)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:790)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1519)
at org.apache.hadoop.ipc.Client.call(Client.java:1442)
... 5 more
Caused by: java.lang.OutOfMemoryError: Java heap space
 
2017-03-21 11:11:10,785 INFO [LeaseRenewer:lereports@VAProd] org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking renewLease of class ClientNamenodeProtocolTranslatorPB over svpr-mhc102.lpdomain.com/172.16.144.148:8020 after 1 fail over attempts. Trying to fail over after sleeping for 593ms.
2017-03-21 11:16:47,403 ERROR [LeaseRenewer:lereports@VAProd] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[LeaseRenewer:lereports@VAProd,5,main] threw an Error.  Shutting down now...
java.lang.OutOfMemoryError: Java heap space
2017-03-21 11:34:56,644 INFO [LeaseRenewer:lereports@VAProd] org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
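
In the syslog excerpt above, each "Caused by:" chain ends in `java.lang.OutOfMemoryError: Java heap space`, including the chains that start with "Couldn't set up IO streams". That suggests heap exhaustion rather than a connection limit, although the truncated log cannot confirm it. A minimal sketch for extracting the innermost cause of each chain; the helper name `innermost_causes` and the sample text are illustrative assumptions.

```python
def innermost_causes(log_text):
    """Return the last 'Caused by:' entry of each consecutive stack-trace chain."""
    causes, last = [], None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            # Remember the deepest cause seen so far in this chain.
            last = stripped[len("Caused by:"):].strip()
        elif not stripped.startswith(("at ", "...")) and last is not None:
            # Any line that is not a stack frame ends the current chain.
            causes.append(last)
            last = None
    if last is not None:
        causes.append(last)
    return causes

# Shortened sample modeled on the trace format above.
sample_trace = """java.io.IOException: Failed on local exception
at org.apache.hadoop.ipc.Client.call(Client.java:1470)
Caused by: java.io.IOException: Couldn't set up IO streams
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1519)
... 5 more
Caused by: java.lang.OutOfMemoryError: Java heap space
"""
print(innermost_causes(sample_trace))
```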
3 Replies

Expert Contributor

Hi @Fawze ,

 

Did you manage to solve this? We're also hitting the same issue.

 

Thanks,

Megh

Master Collaborator

Hi @vidanimegh, if you are talking about a MapReduce job, you need to check whether the job is failing in the map or the reduce phase.

If the error is a Java heap space error, you need to increase the Java heap size for the mapper or reducer using mapreduce.map.java.opts or mapreduce.reduce.java.opts.
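
As a rough sketch of how these settings fit together: the `-Xmx` value in mapreduce.reduce.java.opts has to fit inside the YARN container size set by mapreduce.reduce.memory.mb, so the two are usually raised together. The 0.8 heap-to-container ratio below is a common rule of thumb and an assumption here, as is the helper name `reduce_memory_flags`. The `-D` flags only take effect if the job driver parses generic options (for example through ToolRunner).

```python
def reduce_memory_flags(container_mb, heap_fraction=0.8):
    """Build -D flags for reducer container size and JVM heap.

    heap_fraction is an assumed rule of thumb, leaving headroom for
    non-heap JVM memory inside the container.
    """
    heap_mb = int(container_mb * heap_fraction)
    return [
        f"-Dmapreduce.reduce.memory.mb={container_mb}",
        f"-Dmapreduce.reduce.java.opts=-Xmx{heap_mb}m",
    ]

# Example: a 4 GB reducer container.
print(" ".join(reduce_memory_flags(4096)))
```

The same pattern applies to mappers with mapreduce.map.memory.mb and mapreduce.map.java.opts.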

Expert Contributor

Hi @Fawze ,

 

This is happening for ANALYZE TABLE commands, which I think are map only.

 

I've tried the heap space options you have mentioned and they're not helping.

 

Thanks,

Megh