I am getting a "Too many open files" error in the NiFi log. Could somebody explain when NiFi opens files and when it closes them?

Contributor

I have developed a custom processor that consumes ZMQ data produced by a source. I am using NiFi to forward the received ZMQ data to a remote machine using a RemoteProcessGroup (site-to-site setup). When the source generates a large volume of logs, I get a "Too many open files" error in the NiFi log. As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could somebody explain when NiFi opens files and when it closes them, so that I can optimize my implementation?

5 REPLIES

Rising Star

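Maybe the UNIX open-files limit needs to be set accordingly. Set it to a desired number (or to unlimited, where the system allows it). Note that the open-files limit is controlled by -n; -u limits the number of user processes instead.

ulimit -n <desired number>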

Contributor

Thank you @J Koppole. My co-worker does not want to increase the ulimit beyond 90000 (or set it to unlimited), so I need an alternative. According to my co-worker, making the ulimit unlimited creates a security issue, and if any process can open an unlimited number of files, it will quickly consume all resources. Is it still safe to make the ulimit unlimited?
My actual question was:

As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could you explain when NiFi opens files and when it closes them, so that I can optimize my implementation?

Expert Contributor

@Prabin Silwal

In Linux, every process gets an entry in its file descriptor (FD) table for each file or other input/output channel it opens. Each process is subject to a limit on open file descriptors, inherited from the session of the user who started it, which is normally set to 1024.

If a process tries to exceed the defined FD limit, it fails with this type of error.

To mitigate this issue, you can increase the FD limit for the user running the NiFi process on your OS. The "ulimit" command shows the limit currently in effect.

To check the FD limit for a user:

Log in with the account under which your NiFi process normally runs.

Run the commands below to check the limit:

# Check the FD limit for the current user's session
$ ulimit -n
# Another way of checking the limit
# This one is system-wide
$ cat /proc/sys/fs/file-max

Alternatively, to check the number of FDs currently open by the process, use the steps below:

  1. Find the process ID of the NiFi process on the Linux machine. You can use "ps -eaf | grep {Pattern}", where {Pattern} is a string that identifies the NiFi process. Once you have the process ID, run the command below to count its open FDs:
# This is for the specific process only
$ ls /proc/{Process_ID}/fd | wc -l

# To find out how many of the available file descriptors are currently in use, run the following command:
# This is system-wide
$ cat /proc/sys/fs/file-nr
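
The three numbers printed by /proc/sys/fs/file-nr are the allocated file handles, the allocated-but-unused handles, and the system-wide maximum. A minimal sketch, assuming a Linux system with /proc mounted (the function name file_nr_usage is only illustrative), that turns them into a usage percentage:

```shell
# Sketch: summarize system-wide file handle usage from /proc/sys/fs/file-nr.
# file_nr_usage is a hypothetical helper name, not a standard command.
file_nr_usage() {
    # Fields: allocated handles, allocated-but-unused handles, system-wide maximum.
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "allocated=$allocated max=$max percent=$((allocated * 100 / max))"
}

file_nr_usage
```

A percentage close to 100 would point at the system-wide limit rather than the per-process one.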

So, using the two methods above, you can check the limit and the number of FDs currently open by a specific process. Once you have the count during the busiest run, you will know the value to which you must raise the FD limit for that process.
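
The two checks can be combined into one small helper. This is a sketch only, assuming Linux /proc and that you run it as the NiFi user or as root (other users' /proc/{Process_ID}/fd directories are not readable); fd_usage is a made-up name:

```shell
# Sketch: print "<open FDs> <soft limit>" for a process ID (Linux /proc only).
# fd_usage is a hypothetical helper name.
fd_usage() {
    pid="$1"
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # In the "Max open files" row, column 4 is the soft limit and column 5 the hard limit.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open $soft"
}

# Example: inspect the current shell; replace $$ with the NiFi process ID.
fd_usage "$$"
```

Running this periodically during the busiest period shows how close the process gets to its limit.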

How do I change the FD limit for a user:

# Open the limits.conf file and make the change below
$ vi /etc/security/limits.conf

# Example lines for nifi_user, setting both the hard and the soft limit
# for max open files to 50000
nifi_user     hard nofile 50000
nifi_user     soft nofile 50000
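
Entries in limits.conf are applied at login (through pam_limits), so they take effect only for new sessions, and NiFi has to be restarted from such a session to pick them up. A small sketch for verifying the result after logging in again as the NiFi user; check_nofile is a hypothetical helper name and 50000 is the value from the example lines above:

```shell
# Sketch: compare this session's nofile limits with an expected minimum.
# check_nofile is a hypothetical helper name.
check_nofile() {
    want="$1"
    soft=$(ulimit -Sn)   # soft limit for open files
    hard=$(ulimit -Hn)   # hard limit for open files
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$want" ]; then
        echo "ok soft=$soft hard=$hard"
    else
        echo "low soft=$soft hard=$hard"
    fi
}

check_nofile 50000
```

For an already running process, the limit actually in effect can be read from the "Max open files" row of /proc/{Process_ID}/limits.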

Contributor

@Sandeep Kumar
Thank you, Sandeep, for your solution. However, that was not my actual question.

Previously, NiFi ran into this problem with 100,000 (1 lakh) flowfiles in the queue, so I increased the ulimit from 1024 to 90000. But I still get max open file errors when there are 1,000,000 (10 lakh) flowfiles in the queue. My co-worker does not want to increase the ulimit beyond this, so I need an alternative.

I appreciate your solution, but I need to know when NiFi opens files and how to reduce the number of open files.
My actual question:

As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could somebody explain when NiFi opens files and when it closes them, so that I can optimize my implementation?
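
One way to see what the NiFi process is actually holding open is to group its descriptors by what they point to. A minimal sketch, assuming Linux /proc and the readlink utility, run as the NiFi user or root; fd_breakdown is a hypothetical helper name:

```shell
# Sketch: count a process's open descriptors by kind (Linux /proc only).
# fd_breakdown is a hypothetical helper name.
fd_breakdown() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file, socket, or pipe.
        target=$(readlink "$fd") || continue
        case "$target" in
            socket:*)     echo socket ;;
            pipe:*)       echo pipe ;;
            anon_inode:*) echo anon_inode ;;
            /*)           echo file ;;
            *)            echo other ;;
        esac
    done | sort | uniq -c | sort -rn
}

# Example: inspect the current shell; replace $$ with the NiFi process ID.
fd_breakdown "$$"
```

If most entries are regular files, listing the targets themselves (ls -l /proc/{Process_ID}/fd) shows which directories they live in; if most are sockets, the connections are the more likely source of the error.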

Explorer

Take a backup of the repositories listed below in some other location, then delete them, try to clear the data stuck in the queues of the NiFi flow, and restart NiFi. The restart will take longer than usual. After the restart, these repositories will be recreated automatically.

1. flowfile_repository

2. provenance_repository

3. database_repository

4. content_repository