Created 11-03-2016 04:47 AM
Hi Team,
I am using HDP 2.4.2 and I have 39 DataNodes in the cluster. Initially the DataNode heap size was the default 1 GB. Then every DataNode started sending heap warning alerts even though no ingestion/jobs were running. So I increased the DataNode heap size to 2 GB, but it is still sending me alerts at 60-70% heap usage, and sometimes 80-90%, even when the cluster is idle. Is there any calculation/formula for how much heap size I should give the DataNodes? Please help.
Thanks,
Rahul
Created 11-03-2016 05:47 AM
@Rahul Buragohain If you believe that 2 GB of heap is enough for your DataNode (and it is idle most of the time yet still frequently consuming that much memory), then you should look at the DataNode GC log to find out whether GC is happening properly.
If your GC tuning is good, you might be hitting the following issue: https://issues.apache.org/jira/browse/HDFS-11047
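A quick way to eyeball the GC log (a rough sketch; the path below is an assumption based on the stock HDP hadoop-env template, where the DataNode GC log lands under /var/log/hadoop/$USER/ and the DataNode typically runs as the hdfs user):

    # count full collections across the DataNode GC logs
    grep -c "Full GC" /var/log/hadoop/hdfs/gc.log-*
    # look at the most recent full-GC entries; -XX:+PrintGCDetails prints pause times
    grep "Full GC" /var/log/hadoop/hdfs/gc.log-* | tail -20

Frequent or long full-GC pauses mean the collector is struggling; if the collections look healthy but heap usage still climbs on an idle cluster, the JIRA above is worth a closer look.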
However, I would suggest trying the following DataNode JVM options to see whether they improve things:
-XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseConcMarkSweepGC
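To explain briefly: the stock template already enables the CMS collector (-XX:+UseConcMarkSweepGC); adding -XX:CMSInitiatingOccupancyFraction=60 together with -XX:+UseCMSInitiatingOccupancyOnly makes CMS start a concurrent collection whenever the old generation reaches 60% occupancy, instead of letting the JVM pick the threshold adaptively, so garbage is reclaimed earlier and the reported heap usage should stay below the alert thresholds. A minimal sketch of the placement in hadoop-env, assuming the stock template (the "..." stands for the existing options, which stay as they are):

    export HADOOP_DATANODE_OPTS="... -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly ${HADOOP_DATANODE_OPTS}"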
Currently there is only a formula to calculate the heap size of a NameNode, not one for the DataNode.
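For reference, the commonly cited NameNode rule of thumb (treat this as a ballpark, not an exact formula) is roughly 1 GB of heap per million blocks, since each file/block object costs on the order of 150 bytes of NameNode memory. For example, a cluster with ~20 million blocks would call for roughly 20 GB of NameNode heap, plus headroom for growth. Nothing equivalent is published for the DataNode.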
Created 11-27-2016 08:26 AM
Thanks a lot. That solved my issue and I am not getting DN heapsize alerts anymore.
Created 01-20-2017 08:42 PM
I have the same issue with my HDP 2.4.2. Where exactly do I change these parameters?
I see them in the hadoop-env template:
SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT"
export HADOOP_NAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node\" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 ${HADOOP_NAMENODE_OPTS}"
export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS}"
If this is the right file, should I just add the mentioned parameters to HADOOP_DATANODE_OPTS, something like the line below?
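My guess at the placement, keeping all the existing flags and adding the two CMS occupancy options (please correct me if this is wrong):

    export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS}"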
And do I need to restart the HDFS service afterwards?
Thanks.