Created 11-28-2017 04:10 AM
I have developed a custom processor that consumes ZMQ data produced by a source. I am using NiFi to forward the received ZMQ data to a remote machine using a RemoteProcessGroup (site-to-site setup). When a large volume of logs is generated at the source, I get a "Too many open files" error in the NiFi log. As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could somebody explain when NiFi opens files and when it closes them, so that I can optimize my implementation?
Created 11-29-2017 02:07 PM
Maybe the UNIX open-files limit needs to be set accordingly. Set it to unlimited or to a desired number:
ulimit -n unlimited
Created 12-01-2017 05:14 AM
Thank you @J Koppole. My co-worker does not want to increase the ulimit beyond 90000 (or set it to unlimited), so I need an alternative. According to my co-worker, making the ulimit unlimited creates a security issue, and if other processes can open an unlimited number of files, they will quickly consume all resources. Is it still safe to make the ulimit unlimited?
My actual question was:
As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could you explain when NiFi opens files and when it closes them, so that I can optimize my implementation?
Created 11-29-2017 02:55 PM
In Linux, every process gets a file descriptor (FD) entry in its process table for every open file or other input/output resource. Every user on a Linux/UNIX system has a limit on open file descriptors, which is normally set to 1024.
If a user process tries to exceed the defined number of FDs, it fails with this type of error.
To mitigate this issue, you can increase the FD limit for the user running the NiFi process on your OS. To do this, use the "ulimit" command.
To check the FD limit for a user:
Log in with the account under which your NiFi process normally runs.
Run the commands below to check the limit:
# To check the FD limit for the current user only
$ ulimit -n
# Another way of checking the limit (this is system-wide)
$ cat /proc/sys/fs/file-max
Alternatively, to check the number of FDs currently open by the process, use the steps below:
# This is for a specific process only
$ ls /proc/{Process_ID}/fd | wc -l
# To find out how many of the available file descriptors are currently in use, run the following command (this is system-wide)
$ cat /proc/sys/fs/file-nr
Using the two methods above, you can check both the limit and the number of FDs currently open by a specific process. Once you have the count during the busiest run, you will know the limit to which you must raise the FD setting for that process.
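The two checks above can be combined into a small helper, sketched here under the assumption of a Linux host with /proc mounted. The function names count_fds and soft_nofile_limit are illustrative, not part of NiFi or any standard tool.

```shell
#!/usr/bin/env bash
# Sketch only: assumes Linux with /proc available.

# Count the file descriptors a process currently holds.
# Each entry in /proc/<pid>/fd is one open descriptor.
count_fds() {
    local pid="$1"
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# Read the soft "Max open files" limit that applies to a running process.
# In /proc/<pid>/limits the soft value is the 4th whitespace-separated field.
soft_nofile_limit() {
    local pid="$1"
    awk '/^Max open files/ {print $4}' "/proc/${pid}/limits"
}

# Example: inspect the current shell ($$). Replace $$ with the NiFi PID.
echo "open FDs: $(count_fds $$), soft limit: $(soft_nofile_limit $$)"
```

Running this periodically against the NiFi process ID during the busiest period shows how close the process gets to its limit. Reading another user's /proc/<pid>/fd usually requires running as that user or as root.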
How do I change FD for user:
# Open the limits.conf file and make the change below
$ vi /etc/security/limits.conf
## Example hard limit for max opened files
# Example lines for nifi_user, setting both the hard and soft limit for max opened files to 50000
nifi_user hard nofile 50000
nifi_user soft nofile 50000
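After editing limits.conf, it can help to confirm what is configured and what a new session actually receives. The following is a minimal sketch: nofile_entries is a hypothetical helper name, and nifi_user is the example account from the lines above. Entries in limits.conf are applied when a new login session starts, so an already-running NiFi process keeps its old limit until it is restarted from a fresh session.

```shell
#!/usr/bin/env bash
# Sketch only: nofile_entries is an illustrative helper, not a standard command.

# Print the "<type> <value>" nofile entries configured for a user
# in a limits.conf-style file (commented-out lines do not match).
nofile_entries() {
    local user="$1" file="${2:-/etc/security/limits.conf}"
    awk -v u="$user" '$1 == u && $3 == "nofile" {print $2, $4}' "$file"
}

# Limits in effect for the current shell session.
echo "soft nofile: $(ulimit -Sn)"
echo "hard nofile: $(ulimit -Hn)"
```

For example, nofile_entries nifi_user lists the hard and soft values set for that account, and the two ulimit lines, run in a new session as that user, show whether they took effect.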
Created 12-01-2017 05:09 AM
@Sandeep Kumar
Thank you, Sandeep, for your solution. However, that was not my actual question.
Previously, my NiFi instance ran into this problem with 1 lakh (100,000) flowfiles in the queue, so I increased the ulimit from 1024 to 90000. But I still get max open file errors when there are 10 lakh (1,000,000) flowfiles in the queue. My co-worker does not want to increase the ulimit beyond this, so I need an alternative.
I appreciate your solution, but I need to know when NiFi opens files and how to reduce the number it opens.
My actual question:
As far as I know, NiFi generates a flowfile for each event message. Does NiFi need to open a file for each generated flowfile? Could somebody explain when NiFi opens files and when it closes them, so that I can optimize my implementation?
Created 10-16-2018 07:34 AM
Take a backup of these repositories in some location, then delete them, try to clear the data stuck in the queue in the NiFi flow, and restart NiFi. The restart will take more time than usual. After the restart, these repositories will be recreated automatically.
1.flowfile_repository
2.provenance_repository
3.database_repository
4.content_repository
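The backup step can be sketched as a small shell function. This is an illustration under assumptions: backup_repos is a hypothetical helper, the four directories are assumed to sit directly under the NiFi home directory (their locations can be changed in nifi.properties), and NiFi is assumed to be stopped before copying. Deleting the originals and restarting NiFi are left as manual steps.

```shell
#!/usr/bin/env bash
# Sketch only: backup_repos is an illustrative helper, not part of NiFi.
# Assumes NiFi is stopped and the repositories live under the given home dir.

backup_repos() {
    local nifi_home="$1" backup_dir="$2" repo
    mkdir -p "$backup_dir" || return 1
    for repo in flowfile_repository provenance_repository \
                database_repository content_repository; do
        if [ -d "${nifi_home}/${repo}" ]; then
            # -a preserves ownership, permissions and timestamps.
            cp -a "${nifi_home}/${repo}" "${backup_dir}/" || return 1
            echo "backed up ${repo}"
        else
            echo "skipped ${repo} (not found under ${nifi_home})"
        fi
    done
}
```

A call such as backup_repos /opt/nifi /backup/nifi-repos (both paths are placeholders) copies whichever of the four repositories exist and reports the ones it could not find.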