
Calculate File Descriptor in HBase

Expert Contributor

Hello All,

 

I am looking for best practices or recommendations for setting the best possible value for the rlimit_fds (Maximum Process File Descriptors) property. Currently it is set to the default, i.e. 32768, and we are getting File Descriptor Threshold alerts.

 

We would first like to determine the best possible value for rlimit_fds. Is there a formula, a practice, or a few checks that can be performed to determine a good value?

 

Thanks

snm1523


Mentor
It is not normal to see the file descriptor limit run out, or even come close to the limit, unless you have an overload problem of some form. I'd recommend checking via 'lsof' what the major contributor to the FD count for your RegionServer process is; chances are it is avoidable (a bug, a flawed client, etc.).
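As a quick first check before digging into lsof, you can compare the process's current FD count against its soft limit via /proc. This is a sketch assuming a Linux host; PID=$$ is a placeholder you would replace with your RegionServer's java PID.

```shell
# Compare a process's open-FD count to its soft "Max open files" limit.
# PID=$$ is a placeholder -- substitute the RegionServer's java PID.
PID=$$
open_fds=$(ls /proc/"$PID"/fd | wc -l)
soft_limit=$(awk '/Max open files/ {print $4}' /proc/"$PID"/limits)
echo "open=$open_fds soft_limit=$soft_limit"
```

If open_fds is a large fraction of soft_limit under normal load, that is the signal to classify what the descriptors actually are.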

The number should be proportional to your total region store file counts and the number of connecting clients. While the article at https://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/ focuses on DN data transceiver threads in particular, the formula at the end can be applied similarly to file descriptors in general too.
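As a rough illustration of that kind of sizing, the dominant FD consumers can be estimated from region and client counts. All the input numbers below are made-up assumptions for illustration, not recommendations; substitute real figures from your cluster.

```shell
# Back-of-the-envelope FD estimate for one RegionServer.
# All inputs are illustrative assumptions -- substitute real numbers
# from your cluster (region count, families, store files, client load).
regions=300         # regions hosted on this RegionServer
families=2          # column families per region
files_per_family=4  # average store files per family
clients=200         # concurrent client/DataNode connections
store_files=$((regions * families * files_per_family))
estimate=$((store_files + clients))
echo "store_files=$store_files estimated_fds=$estimate"
```

If an estimate like this lands well below your 32768 limit yet alerts still fire, that again points to leaked or stuck descriptors rather than a limit that is genuinely too small.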

Expert Contributor
Thank you for the reply, Harsh J.

Would you be able to help me with a quick command or script to identify avoidable open files, or files stuck in some process, using 'lsof', and guide further actions to take?

I tried running a generic 'lsof | grep java', but it obviously gave me a huge list of files, and it became difficult to pick out the relevant information.

Thanks
snm1523

Mentor (accepted solution)
You'll need to use lsof with a PID specifier (lsof -p PID). The PID must be your target RegionServer's java process (find it via 'ps aux | grep REGIONSERVER' or similar).

In the output, you should be able to classify the items as network (sockets), filesystem (files), etc.; the interest is in whatever holds the highest share. For example, if you see many sockets hanging around, check their state (CLOSE_WAIT, etc.). If it is mostly local filesystem files, investigate whether those files appear relevant.
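That classification can also be sketched without wading through raw lsof output, by reading the FD symlinks under /proc directly (Linux only). PID=$$ is again a placeholder for the RegionServer's java PID.

```shell
# Group a process's open FDs by what they point at: sockets, pipes,
# anonymous inodes (epoll etc.), or regular filesystem paths.
# PID=$$ is a placeholder -- substitute the RegionServer's java PID.
PID=$$
summary=$(
  for fd in /proc/"$PID"/fd/*; do
    case "$(readlink "$fd" 2>/dev/null)" in
      socket:*)     echo socket ;;
      pipe:*)       echo pipe ;;
      anon_inode:*) echo anon_inode ;;
      "")           ;;  # fd disappeared between listing and readlink
      *)            echo file ;;
    esac
  done | sort | uniq -c | sort -rn
)
echo "$summary"
```

Whichever category tops the count is where to dig further: lsof -p PID filtered on that category (e.g. piped through grep TCP for sockets) will show the individual entries and their states.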

If you can pastebin your lsof result somewhere, I can take a look.

Expert Contributor
Hello Harsh,

Thank you for the help on this.

I was able to identify some information that helped here. I will come back if I need further help.

I will accept your reply as the solution. 🙂

Thanks
snm1523