
cloudera hadoop mapreduce job GC overhead limit exceeded error

Explorer

Hello, 

 

I would like some advice on the configuration I should adopt so that I no longer get the GC overhead limit exceeded error.

 

What parameter should I use for my child ulimit?

 

I am using Cloudera Manager 4.8 and CDH 4.6.

 

The NameNode and the JobTracker are on an 8-CPU machine with 24 GB of RAM, and the DataNode and the TaskTracker are on a 4-CPU machine with 12 GB of RAM, for a data set of 2.8 GB of Flume sequence files (2.1 MB each).

 

Even when I just try to run an ls on it, I get the same error as the MapReduce job:

 

[hdfs@evl2400460 root]$ hadoop fs -du -s -h /user/flume/tweets
2.8 G /user/flume/tweets
[hdfs@evl2400460 root]$ hadoop fs -ls /user/flume/tweets
ls: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded.

 

Thanks for your help 🙂

 

--
Lefevre Kevin
1 REPLY

Re: cloudera hadoop mapreduce job GC overhead limit exceeded error

Master Guru
You need to raise the client JVM heap size. Since this issue may be a one-off, perhaps due to the number of files under /user/flume/tweets, you can do so for a single command via the below:

~> HADOOP_CLIENT_OPTS="-Xmx4g" hadoop fs -ls /user/flume/tweets
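If client commands keep hitting the same error, a persistent variant of the same idea is to export HADOOP_CLIENT_OPTS in the shell session (or in hadoop-env.sh). This is just a sketch: the 4g value is an example and should be sized to the free memory on the client host, and /etc/hadoop/conf/hadoop-env.sh is the usual CDH location but may differ on your install:

~> # Make the larger client heap apply to all hadoop client commands in this session;
~> # the same export line can be added to /etc/hadoop/conf/hadoop-env.sh to make it permanent.
~> export HADOOP_CLIENT_OPTS="-Xmx4g"
~> hadoop fs -ls /user/flume/tweets

Note this only affects client-side commands such as fs -ls; it does not change the heap of the MapReduce child tasks.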