Created 07-30-2016 06:56 PM
By default, the hive user's nofile ulimit (managed through Ambari) is 32000. We reached that limit last week on our HiveServer2 server and decided to increase the value in Ambari to 64000. We have now hit the 64k nofile ulimit as well. This leads me to believe that HiveServer2 is not cleaning up connections like it should and file handles are not being released.
Has anyone else experienced this issue? Any suggestions on what to check?
What do you have your hive user's nofile limit set to?
# lsof -u hive | wc -l
64450
We are on HDP 2.4.2, Ambari 2.2.2
Should hive really have that many files open?
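In case it is useful to anyone, this rough one-liner breaks the lsof output down by PID (column 2) to show which process is actually holding the descriptors:
# lsof -u hive | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn | head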
Update:
We're approaching the 64k nofile ulimit setting again for the hive user.
# lsof -u hive | wc -l
57090
After digging through the output of lsof, I see a lot of temporary operation_logs.
/tmp/hive/operation_logs/658c3930-8975-47db-ad7f-7cbef6279b11/acc2043a-d3bb-4a8c-9a7d-d0b743b9ce5d
Here is the total number of operation_logs files open right now.
# lsof -u hive | grep operation_logs | wc -l
56102
These files are 3 to 4 days old.
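For reference, a rough find against the log directory (assuming everything lives under /tmp/hive/operation_logs, as in the path above) shows how many files on disk are more than two days old:
# find /tmp/hive/operation_logs -type f -mtime +2 | wc -l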
Created 07-30-2016 07:57 PM
Is this a secure cluster? You also need to increase the ulimits for the root user. Check this link.
Created 07-30-2016 11:21 PM
Sounds like a bug to me. Support ticket? There was a similar issue with ATS 1.5, but it should definitely be fixed in 2.4.2.
Created 08-02-2016 06:48 PM
Thanks @Benjamin Leonhardi. That's what I am wondering as well. Something doesn't seem right. Can you point me to the ATS issue you are referring to?
Created 07-31-2016 01:05 PM
Hi @Jon Maestas. Executing the following should resolve the issue.
# Set nofile (max open files per user) and the system-wide file-max
sudo sh -c 'echo "* soft nofile 200000" >> /etc/security/limits.conf'
sudo sh -c 'echo "* hard nofile 200000" >> /etc/security/limits.conf'
sudo sh -c 'echo "200000" > /proc/sys/fs/file-max'
sudo sh -c 'echo "fs.file-max=200000" >> /etc/sysctl.conf'
#Set process-max
sudo sh -c 'echo "* soft nproc 8192" >> /etc/security/limits.conf'
sudo sh -c 'echo "* hard nproc 16384" >> /etc/security/limits.conf'
sudo sh -c 'echo "* soft nproc 16384" >> /etc/security/limits.d/90-nproc.conf'
# ulimit adjustments for the Hadoop service users
sudo sh -c 'echo "hdfs - nofile 32768" >> /etc/security/limits.conf'
sudo sh -c 'echo "mapred - nofile 32768" >> /etc/security/limits.conf'
sudo sh -c 'echo "hbase - nofile 32768" >> /etc/security/limits.conf'
sudo sh -c 'echo "hdfs - nproc 32768" >> /etc/security/limits.conf'
sudo sh -c 'echo "mapred - nproc 32768" >> /etc/security/limits.conf'
sudo sh -c 'echo "hbase - nproc 32768" >> /etc/security/limits.conf'
#
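One note: these limits only apply to processes started after the change, so it is worth confirming what the running HiveServer2 actually picked up (<HS2 PID> below is a placeholder for the HiveServer2 process id):
# Check the limits in effect for the already-running process
cat /proc/<HS2 PID>/limits | grep "Max open files"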
Created 08-02-2016 06:47 PM
Thanks @Scott Shaw
We have found that the Ambari-managed, per-user limits file (/etc/security/limits.d/hive.conf) is being used instead of /etc/security/limits.conf, and manual changes to hive.conf get overwritten once Ambari sees that the file has changed.
But the real question is, should hive really have 64k+ files open?
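For anyone else hitting this: in our version of Ambari the values in /etc/security/limits.d/hive.conf appear to be templated from the hive-env settings (hive_user_nofile_limit / hive_user_nproc_limit), so bumping the limit there is what actually sticks. A quick check of what Ambari has written:
# cat /etc/security/limits.d/hive.conf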
Created 08-02-2016 10:22 PM
How about checking netstat output as well?
For example, as root: netstat -aopen | grep -w <HS2 PID>
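To find <HS2 PID>, something like this should work: pgrep -f org.apache.hive.service.server.HiveServer2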
Created 08-04-2016 01:06 PM
Thanks @Hajime for the suggestion. netstat looks OK; there aren't too many open connections.
Created 08-05-2016 04:43 AM
In that case, how about setting session expirations if you haven't tried?
For example,
set hive.server2.idle.session.check.operation=true;
set hive.server2.session.check.interval=2h;
set hive.server2.idle.session.timeout=12h;
set hive.server2.idle.operation.timeout=8h;
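These are server-side settings, so they generally belong in hive-site.xml (followed by a HiveServer2 restart) rather than a client-side set. To confirm the effective values on the running server from beeline (the JDBC URL is a placeholder):
beeline -u "jdbc:hive2://<hs2-host>:10000" -e "set hive.server2.idle.session.timeout;"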
Created 08-15-2016 09:06 PM
After setting this property to false, we are no longer seeing extremely high open file counts for the hive user account.
hive.server2.logging.operation.enabled=false
Any idea what this is doing to consume so many open files?
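My best guess so far: operation logging appears to create one log file per query under hive.server2.logging.operation.log.location, whose default matches the path above:
hive.server2.logging.operation.log.location=${system:java.io.tmpdir}/${system:user.name}/operation_logs
and the file handle seems to be held until the client closes the operation or session, so clients that never close them would leave those descriptors open.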