Many heap dump files are being created under the directory set by the Java flag -XX:HeapDumpPath. The user reports that HiveServer2 generated these dumps, because he configured this flag for HiveServer2.
The HiveServer2 logs show no trace of a service stop or startup, and no sign of any kind of failure.
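The log check described above can be sketched as a small shell helper. This is only an illustration: the function name count_hs2_lifecycle_events is hypothetical, the log file path is whatever the cluster uses, and the marker strings STARTUP_MSG and SHUTDOWN_MSG are an assumption about the banner HiveServer2 writes on start and stop, so verify them against an actual log first.

```shell
# Hypothetical helper: count startup and shutdown banners in a log file.
# Assumption: HiveServer2 logs "STARTUP_MSG: Starting ..." on start and
# "SHUTDOWN_MSG: ..." on stop. Adjust the patterns if your logs differ.
count_hs2_lifecycle_events() {
  log_file="$1"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  starts=$(grep -c 'STARTUP_MSG: Starting' "$log_file" || true)
  stops=$(grep -c 'SHUTDOWN_MSG:' "$log_file" || true)
  echo "startups=$starts shutdowns=$stops"
}
```

If both counts stay unchanged over the period in which the dumps were written, HiveServer2 was most likely not restarted by an OutOfMemoryError.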
ROOT CAUSE:
In a preliminary analysis of the heap dump, I examined the Java process arguments.
The arguments indicate that the process is not HiveServer2; it looks like a map-side join launched by CliDriver.
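As a rough illustration of that check, the sketch below classifies a JVM command line by the main class it contains. The function name classify_hive_process is hypothetical, and the patterns are assumptions: they rely on the usual main class names for HiveServer2 and CliDriver, and on the map-side join local task running ExecDriver with a -localtask argument. Confirm them against the arguments seen in your own dump.

```shell
# Hypothetical helper: guess which Hive component a JVM command line belongs to.
# The patterns are assumptions about typical Hive command lines.
classify_hive_process() {
  cmdline="$1"
  case "$cmdline" in
    *org.apache.hive.service.server.HiveServer2*) echo "hiveserver2" ;;
    *ExecDriver*-localtask*)                      echo "local-task" ;;
    *org.apache.hadoop.hive.cli.CliDriver*)       echo "cli" ;;
    *)                                            echo "unknown" ;;
  esac
}
```

Pass it the command line recovered from the heap dump, or from ps for a live process. Any result other than hiveserver2 means the dump did not come from HiveServer2.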
We then looked at the hive-env file, which appears to be misconfigured, especially HADOOP_CLIENT_OPTS:
if [ "$SERVICE" = "cli" ]; then
  if [ -z "$DEBUG" ]; then
    export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
  else
    export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
  fi
fi
# The heap size of the jvm stared by hive shell script can be controlled via:
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive "
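HADOOP_CLIENT_OPTS is exported unconditionally here, so the heap dump flags apply to every JVM started through the hive shell script, not only to HiveServer2. After reviewing this, we are confident that the process that ran out of memory was the map-side join launched by the Hive CLI, not HiveServer2.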
WORKAROUND:
NA
RESOLUTION:
Ask the user to modify hive-env and set HADOOP_CLIENT_OPTS appropriately, so that the heap dump flags apply only to the intended service.
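One possible shape for that change is sketched below, assuming the user only wants heap dumps from HiveServer2. It reuses the SERVICE variable that the existing "cli" check in hive-env already tests. The function name add_hs2_heap_dump_opts is hypothetical, and the value "hiveserver2" is an assumption based on the service name passed to the hive launcher. The heap dump flags in the "cli" block's HADOOP_OPTS would also need to be removed if CLI dumps are unwanted.

```shell
# Sketch for hive-env: add the heap dump flags only when the launcher is
# starting HiveServer2, instead of exporting them for every JVM.
# Assumption: SERVICE is "hiveserver2" when HiveServer2 is being started.
add_hs2_heap_dump_opts() {
  if [ "${SERVICE:-}" = "hiveserver2" ]; then
    export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
  fi
}
add_hs2_heap_dump_opts
```

With this in place, a CLI session or a local map-side join task that runs out of memory no longer writes a dump to /var/log/hive through HADOOP_CLIENT_OPTS.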