Community Articles

Find and share helpful community-sourced technical articles.
Labels (1)
avatar
Expert Contributor

By default, hiveserver2 doesn't have an entry in the hive-env.sh that provides a way to change the type of GC. The CMS collector is designed to eliminate the long pauses associated with the full GC cycles of the throughput and serial collectors. CMS stops all application threads during a minor GC, which it also performs with multiple threads.

In order to change the type of GC:

Ambari > Hive > Configs > Advanced hive-env > hive-env template > Add the follow properties.

  if [ "$SERVICE" = "hiveserver2" ]; then
   if [ -z "$DEBUG" ]; then
     export HADOOP_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hive/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m -Xmx1024m"
     else
     export HADOOP_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hive/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms1024m024m -Xmx1024m"
   fi
 fi

Make sure that the new settings has been applied successfully, as well as the heap sizes.

/usr/jdk64/jdk1.8.0_112/bin/jcmd <hiveserver_pid> VM.flags
/usr/jdk64/jdk1.8.0_112/bin/jmap -heap <hiveserver_pid>

OBS: This article is not meant to provide the best JVM flags, this will vary according to your environment. The idea is to always scale out the load avg adding more HS2 instances, in case your HS2 are highly utilized. Please, check with a HWX consultant to better align it.

2,071 Views