Created on 04-09-2018 01:15 PM - edited 08-17-2019 07:44 AM
This article covers how to improve the performance of the NiFi UI.
Over time it has been seen that the users of NiFi have been building very large dataflows consisting of many thousands of components (processor, reporting tasks, controller services, etc). While NiFi in no way limits to any degree the number of components that can be added to the NiFi canvas, the more components a user adds, the less responsive the UI becomes. This processor explosion not only affects the responsiveness of the UI, but can also lead to unexpected node disconnections.
--- What are the various states a component can have?
NiFi components have multiple states that consist of stopped, started, enabled, and disabled. Beyond these states exists one of two statuses:
Valid: Component configuration was successfully validated. This means that all required properties have been configured and in the case of processors all required connections have been accounted for (connected to another component or terminated) and any referenced controller services have been enabled.
Invalid: Component configuration is not valid. This means that one or more required properties have not been configured and/or in the case of processors one or more connections have not been accounted for (connected to another component or terminated) and/or a referenced controller services have not been enabled.
--- Why does a processors state affect UI performance?
All processor components when added to the canvas are added in the "stopped" state. A user can then either start or disable that component manually.
All Controller Services and Reporting tasks added by a user are by default disabled. The user can then enable these components as needed.
NiFi regularly must validate these components to see if they are valid or invalid. While the validation of a few hundred to a thousand components adds up to very little time, the same does not hold true for NiFi instances consisting of thousands upon thousands of components.
User may have noticed a swirling on the right hand side of the NiFi status bar that seems to never go away. In a NiFi cluster, NiFi must retrieve the flow status from every node. It is possible for a component to be valid on one node but not another (for example, processor depends on local file that does not exist on all nodes). If this validation takes too long a node may be disconnected because the request took to long. Not to mention the UI does not update until these validations have completed.
--- What has NiFi done to make improvements here?
The bad news:
Prior to NiFi 1.1.0 there is nothing that can be done to improve performance here other then reducing the number of components you are using. This is because in versions of NiFi prior to components were validated in all four states.
The good news:
In NiFi 1.1.0 a change was made so that this validation only occurs on components that are in the "stopped" state and controller services or reporting tasks that are disabled. It is safe to assume that if a processor is running, it must be valid. It is also safe to assume that a Controller Service or a Reporting Task must be valid if it is enabled. Now that these "started" processors and "enabled" controller services or reporting tasks are no longer being validated, the UI performance will be much better.
--- What is the important to understand here?
It has also been observed that users add lots of components to the UI that are never started or are only started for short periods of time. If the number of "Stopped" processors is very high, validation is still going to take a considerable amount of time even in NiFi 1.10 or newer versions.
A quick look at the NiFi status bar above your canvas will show how many stopped components you have on your canvas:
To make sure the UI performance remains solid, it is important that users disable processors that are not in use on the canvas. You can use the "NiFi Summary" UI to to find stopped/invalid processors and disable them. Select the "PROCESSORS" tab and sort on the "Run Status" column. Clicking on the on the right hand side of row will take you directly to that processor.
Once a component is selected on the canvas it can be disabled or enabled via the "Operate" panel or by right clicking on processor and selecting Disable or Enable for displayed context menu.