SYMPTOM:

Many heap dump files were being created under the directory controlled by the Java flag -XX:HeapDumpPath. The user assumed that HiveServer2 had generated these dumps, since he had set this flag for HiveServer2.

Looking at the HiveServer2 logs, however, there was no trace of any service startup or shutdown, and no sign of any failure at all.
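A quick triage step before digging into logs: by default the JVM names each dump java_pid<pid>.hprof under the -XX:HeapDumpPath directory, so the PID embedded in the filename tells you which process wrote it. A minimal sketch (the dump filename here is illustrative):

```shell
#!/bin/sh
# Extract the PID embedded in a heap dump filename such as
# /var/log/hive/java_pid12345.hprof (the JVM's default naming scheme).
dump=/var/log/hive/java_pid12345.hprof   # hypothetical example file
name=${dump##*/}        # strip directory  -> java_pid12345.hprof
pid=${name#java_pid}    # strip prefix     -> 12345.hprof
pid=${pid%.hprof}       # strip suffix     -> 12345
echo "$pid"

# If that process is still alive, its full command line reveals what it was:
#   tr '\0' ' ' < /proc/$pid/cmdline
```

If the process has already exited, the PID can still be matched against historical ps output or monitoring data.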

ROOT CAUSE:

As a preliminary analysis, I examined the arguments of the process that produced the heap dumps, which looked like this:

hive     12345 1234  9.8 27937992 26020536 ?   Sl   Dec01 19887:12  \_ /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx24576m -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender -Dhdp.version=2.3.4.20-3 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.-258/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.-258/hadoop/lib/native:/usr/hdp/2.4.2.-258/hadoop/lib/native/Linux-amd...hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx24576m -Xmx24576m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.4.2.-258/hive/lib/hive-common-1.2.1.2.3.4.74-1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan file:/tmp/hive/261a57a5-caab-4f98-9fa2-f50209ba29e9/*****/-local-10006/plan.xml -jobconffile file:/tmp/hive/K*****/-local-10007/jobconf.xml

The Java process arguments show that this is not HiveServer2: the main class is org.apache.hadoop.hive.ql.exec.mr.ExecDriver with the -localtask flag, i.e. a map-side join local task spawned by the Hive CLI driver.
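The decisive clue is the main class on the command line. The check we applied to the ps output can be sketched as a small classifier (the main-class names are the ones visible in such process listings; treat this as an illustration, not a complete list of Hive process types):

```shell
#!/bin/sh
# Classify a java command line by Hive role, based on its main class.
classify() {
  case "$1" in
    *org.apache.hive.service.server.HiveServer2*)                echo "hiveserver2" ;;
    *org.apache.hadoop.hive.ql.exec.mr.ExecDriver*-localtask*)   echo "cli-local-task" ;;
    *)                                                           echo "other" ;;
  esac
}

# The command line from the ps output above (abbreviated):
classify "java -Xmx24576m ... org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan ..."
# -> cli-local-task
```

In practice the same distinction can be made with `ps -ef | grep -E 'HiveServer2|ExecDriver'` on the affected host.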

We then looked at the hive-env file, which was misconfigured, especially HADOOP_CLIENT_OPTS:

if [ "$SERVICE" = "cli" ]; then
  if [ -z "$DEBUG" ]; then
    export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
  else
    export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
  fi
fi

# The heap size of the JVM started by the hive shell script can be controlled via:

export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"

After this review we were sure that it was a Hive CLI process running a map-side join that went out of memory and produced the heap dumps, not HiveServer2.

WORKAROUND:

N/A

RESOLUTION:

Ask the user to modify hive-env and set HADOOP_CLIENT_OPTS appropriately, so that heap dumps from CLI-spawned processes are no longer mistaken for HiveServer2 dumps.
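One hedged example of what "appropriately" might look like here: point the CLI's dump path at a separate directory so the origin of each .hprof file is unambiguous. The directory name below is illustrative, not required:

```shell
# hive-env.sh (sketch): keep heap dumps from CLI/local-task JVMs out of
# HiveServer2's dump directory so each .hprof file's origin is obvious.
# /var/log/hive/cli-dumps is a hypothetical path; create it with suitable
# ownership/permissions for the user running the CLI.
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive/cli-dumps"
```

Alternatively, if CLI heap dumps are not wanted at all, drop -XX:+HeapDumpOnOutOfMemoryError from HADOOP_CLIENT_OPTS and keep it only in the HiveServer2 JVM options.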
