
Calculate File Descriptor in HBase


Contributor

Hello All,

I am looking for best practices or recommendations for setting a suitable value for the rlimit_fds (Maximum Process File Descriptors) property. It is currently at the default of 32768, and we are receiving File Descriptor Threshold alerts.

We would first like to determine a suitable value for rlimit_fds. Is there a formula, a recommended practice, or a set of checks that can be performed to work out the best value?
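For reference, current usage versus the limit can be checked along these lines (the PID below is a placeholder for the RegionServer's java process):

# Count open file descriptors held by the RegionServer process
ls /proc/<PID>/fd | wc -l

# Show the per-process limit ("Max open files") for comparison
grep 'Max open files' /proc/<PID>/limits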


Thanks

snm1523


Re: Calculate File Descriptor in HBase

Master Guru
It is not normal to see the file descriptor limit run out, or even come close to it, unless you have an overload problem of some form. I'd recommend checking via 'lsof' what the major contributor to the FD count for your RegionServer process is - chances are it is avoidable (a bug, a misbehaving client, etc.).

The number should be proportional to your total region store file count and the number of connecting clients. While the article at https://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/ focuses specifically on DataNode data transceiver threads, the formula at the end can be applied to file descriptors in general too.
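As a rough illustration only (the figures below are assumptions to be replaced with your own cluster's numbers), the store-file-driven part of the estimate is simply regions x column families x store files per family, plus headroom for WALs, sockets, JARs and the like:

# All figures are assumptions for illustration - substitute your own.
REGIONS_PER_RS=200        # regions hosted per RegionServer
FAMILIES_PER_REGION=2     # column families per region
STOREFILES_PER_FAMILY=5   # average store files per column family
OVERHEAD=2048             # headroom for WALs, sockets, JARs, config files

echo $(( REGIONS_PER_RS * FAMILIES_PER_REGION * STOREFILES_PER_FAMILY + OVERHEAD ))
# 200 * 2 * 5 + 2048 = 4048, comfortably below the default rlimit_fds of 32768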

Re: Calculate File Descriptor in HBase

Contributor
Thank you for the reply, Harsh J.

Would you be able to help me with a quick command or script, using 'lsof', to identify avoidable open files or files stuck in some process, and guide me on further actions to take?

I tried running a generic 'lsof | grep java', but it obviously gave me a huge list of files and it was difficult to pick out the relevant information.

Thanks
snm1523

Re: Calculate File Descriptor in HBase

Master Guru
You'll need to use lsof with a PID specifier (lsof -p PID). The PID must be that of your target RegionServer's java process (find it via 'ps aux | grep REGIONSERVER' or similar).

In the output, you should be able to classify the entries as network (sockets), filesystem (files), etc., and the interest is in whichever category holds the highest share. For example, if you see a lot of sockets hanging around, check their state (CLOSE_WAIT, etc.). If it is local filesystem files, investigate whether those files are still relevant.
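For example, something along these lines (the PID is a placeholder for your RegionServer's java process, and the column positions assume standard lsof output):

PID=<regionserver-pid>    # placeholder: substitute the RegionServer's java PID

# Tally open descriptors by TYPE (REG = regular files, IPv4/IPv6 = sockets, etc.)
lsof -nP -p "$PID" | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn

# Count TCP connections sitting in CLOSE_WAIT for that process (-a ANDs the -p and -i selections)
lsof -nP -a -p "$PID" -i TCP | grep -c CLOSE_WAIT

Whichever category dominates the first tally is where to dig further.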

If you can pastebin your lsof result somewhere, I can take a look.

Re: Calculate File Descriptor in HBase

Contributor
Hello Harsh,

Thank you for the help on this.

I was able to identify some information that helped here. I will come back in case I need further help.

I will accept your reply as the solution. :)

Thanks
snm1523