Support Questions
Find answers, ask questions, and share your expertise

How to set the NameNode Heap memory in ambari?

Contributor

How much heap memory should be set? In other words, which factors does the heap memory size depend on?

1 ACCEPTED SOLUTION

@ANSARI FAHEEM AHMED

NameNode heap size depends on many factors such as the number of files, the number of blocks, and the load on the system. The settings in the referenced table below should work for typical Hadoop clusters where the number of blocks is very close to the number of files (generally the average ratio of number of blocks per file in a system is 1.1 to 1.2). Some clusters might require further tweaking of the following settings. Also, it is generally better to set the total Java heap to a higher value.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/ref-8095...
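To make the relationship between file count, block count, and heap concrete, here is a rough sizing sketch. The 150-bytes-per-object figure and the 2x headroom multiplier are illustrative assumptions (a commonly cited heuristic, not the official HDP table linked above); the 1.1-1.2 blocks-per-file ratio is from the answer.

```python
# Rough NameNode heap estimate -- a sketch, not the official sizing table.
# Assumption: each HDFS object (file or block) costs on the order of
# 150 bytes of NameNode heap, a commonly cited heuristic.

def estimate_namenode_heap_gb(num_files, blocks_per_file=1.15,
                              bytes_per_object=150, headroom=2.0):
    """Return a rough heap size in GB for the given file count.

    blocks_per_file: 1.1-1.2 per the answer above.
    headroom: multiplier for GC overhead and load spikes (assumed).
    """
    objects = num_files * (1 + blocks_per_file)   # files plus their blocks
    raw_bytes = objects * bytes_per_object
    return raw_bytes * headroom / 1024 ** 3

# e.g. 50 million files comes out to roughly 30 GB of heap
print(round(estimate_namenode_heap_gb(50_000_000), 1))
```

Under these assumptions the estimate scales linearly with the number of files, which is why clusters with many small files need disproportionately large NameNode heaps.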




Contributor

@Bandaru: Thanks, but how do I set it in an Ambari cluster?

@ANSARI FAHEEM AHMED

HADOOP_HEAPSIZE sets the JVM heap size for all Hadoop project servers such as HDFS, YARN, and MapReduce. HADOOP_HEAPSIZE is an integer passed to the JVM as the maximum memory (Xmx) argument. For example:

HADOOP_HEAPSIZE=256

HADOOP_NAMENODE_OPTS is specific to the NameNode and sets all JVM flags that must be specified for it. HADOOP_NAMENODE_OPTS overrides the HADOOP_HEAPSIZE Xmx value for the NameNode. For example:

HADOOP_NAMENODE_OPTS=SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=50m -XX:MaxNewSize=100m -XX:PermSize=128m -XX:MaxPermSize=256m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms250m -Xmx250m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT"

Both HADOOP_NAMENODE_OPTS and HADOOP_HEAPSIZE are stored in /etc/hadoop/conf/hadoop-env.sh.
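The override works because the JVM honours the last -Xmx flag it receives, and the NameNode-specific options are appended after the HADOOP_HEAPSIZE-derived flag. A small sketch of that last-flag-wins behaviour (the helper name and argument lists here are illustrative, not part of Hadoop):

```python
# Sketch: why HADOOP_NAMENODE_OPTS overrides HADOOP_HEAPSIZE.
# The JVM honours the LAST -Xmx flag on its command line, and the
# NameNode options are appended after the HADOOP_HEAPSIZE-derived -Xmx.

def effective_xmx(jvm_args):
    """Return the value of the last -Xmx flag, or None if unset."""
    xmx = None
    for arg in jvm_args:
        if arg.startswith("-Xmx"):
            xmx = arg[len("-Xmx"):]   # later flags overwrite earlier ones
    return xmx

# HADOOP_HEAPSIZE=256 contributes -Xmx256m; the NameNode opts add -Xmx250m.
args = ["-Xmx256m", "-server", "-Xms250m", "-Xmx250m"]
print(effective_xmx(args))  # -> 250m
```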

Contributor

Thanks Sir Bandaru

There is a control in Ambari, under HDFS right at the top, for the NameNode's memory. You should not set the heap size for all components, since most need much less memory than the NameNode.

For the NameNode, a good rule of thumb is 1 GB of heap per 100 TB of data in HDFS (plus a couple of GB as a base, so 4-8 GB minimum), but it needs to be tuned based on workload. If you suspect your memory settings are insufficient, look at the memory usage and GC behaviour of your JVM.
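The rule of thumb above can be sketched as a quick calculation. The base of 4 GB and the 4 GB floor are illustrative choices within the "couple GB base, 4-8 GB minimum" range stated above, not official numbers:

```python
# Sketch of the rule of thumb: ~1 GB of NameNode heap per 100 TB of HDFS
# data, plus a base allowance, with a minimum floor.
# base_gb=4 and the 4 GB floor are assumptions within the stated 4-8 GB range.

def namenode_heap_gb(hdfs_data_tb, base_gb=4):
    """Rough NameNode heap recommendation in GB for a given data size in TB."""
    heap = base_gb + hdfs_data_tb / 100.0
    return max(heap, 4.0)   # never recommend less than the stated minimum

print(namenode_heap_gb(200))   # 200 TB -> 6.0 GB
print(namenode_heap_gb(1000))  # 1 PB   -> 14.0 GB
```

Treat the result as a starting point only; as noted above, the real check is observing heap usage and GC pauses under your actual workload.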

Contributor

@Benjamin Thanks