I would like some advice about the configuration I should adopt so that I no longer get the "GC overhead limit exceeded" error.
What parameter should I use for my child ulimit?
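To make the question concrete, this is the kind of mapred-site.xml entry I am referring to; the values below are only placeholders to show which parameters I mean, not a configuration I know to be correct:

<!-- mapred-site.xml (MRv1): placeholder values, only to illustrate the parameters in question -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>  <!-- heap given to each map/reduce child JVM -->
</property>
<property>
  <name>mapred.child.ulimit</name>
  <value>2097152</value>  <!-- virtual memory limit for the child process, in KB (about 2 GB) -->
</property>

My understanding is that the ulimit value has to be comfortably larger than the child JVM's -Xmx, but I am not sure what a sensible pair of values would be for my nodes.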
I am using Cloudera Manager 4.8 and CDH 4.6.
The NameNode and the JobTracker are on an 8-CPU machine with 24 GB of RAM, and the DataNode and the TaskTracker are on a 4-CPU machine with 12 GB of RAM, for a data set of 2.8 GB of Flume sequence files (2.1 MB each).
Even when I try to do an ls on it, I get the same error as in the MapReduce job:
[hdfs@evl2400460 root]$ hadoop fs -du -s -h /user/flume/tweets
2.8 G  /user/flume/tweets
[hdfs@evl2400460 root]$ hadoop fs -ls /user/flume/tweets
ls: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded
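Since even the plain fs shell runs out of heap, I assume I would also need to give more memory to the client JVM; something like the following line in hadoop-env.sh is what I have in mind (the 1 GB value is just a guess on my part):

# hadoop-env.sh: raise the heap used by client-side commands such as "hadoop fs -ls"
export HADOOP_CLIENT_OPTS="-Xmx1g $HADOOP_CLIENT_OPTS"

Is that the right place to change it, or should this be set through Cloudera Manager instead?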
Thanks for your help 🙂