Member since: 07-30-2019
Posts: 3406
Kudos Received: 1621
Solutions: 1006

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 27 | 12-17-2025 05:55 AM |
|  | 88 | 12-15-2025 01:29 PM |
|  | 43 | 12-15-2025 06:50 AM |
|  | 199 | 12-05-2025 08:25 AM |
|  | 339 | 12-03-2025 10:21 AM |
03-29-2018
11:34 AM
@Carrick That property does support Expression Language, but you need to understand what the property does with the EL you provided.

${trigger.type} is an Expression Language statement. It asks NiFi to look for an attribute named trigger.type on the incoming FlowFile, then in the variables of the enclosing Process Group, then in the custom variable registry file, then in the JVM system properties, and finally in the NiFi user's environment variables (checked in that order; the first match ends the search). If nothing is found, nothing is returned. If a match is found, the value assigned to that attribute is returned.

So in your case, what is being returned is 'mouse' or 'keyboard', and the MergeContent processor then uses 'mouse' or 'keyboard' as the correlation attribute name. Since there is no attribute named 'mouse' or 'keyboard' on your FlowFiles, they all correlate on a null value and end up in the same bin. As soon as you removed the EL and simply named the attribute to use for correlation, it worked.

Thanks, Matt
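To make the difference concrete, a minimal sketch of the two configurations (the attribute name trigger.type and the values mouse/keyboard come from this thread):

```
Correlation Attribute Name : trigger.type      <- literal name; bins by the attribute's value
Correlation Attribute Name : ${trigger.type}   <- EL evaluates first, returning "mouse" or
                                                  "keyboard"; MergeContent then looks for an
                                                  attribute with THAT name, finds none, and
                                                  bins every FlowFile together
```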
03-27-2018
01:12 PM
@Carrick NiFi will merge a bin that has met its minimum as part of a thread execution. Let's assume a steady stream of FlowFiles is entering the incoming connection queue feeding the MergeContent processor. When MergeContent runs (obtains a thread), it looks at the incoming queue and grabs from the active queue only those FlowFiles which are there at that exact moment in time. That thread will not know about any FlowFiles that enter the queue after that moment. Upon placing those FlowFiles in one or more bins, that same thread will assess whether a bin has satisfied the minimum requirements, and if so, the bin will be merged.

Now consider a MergeContent processor with a run schedule of 0 sec (the default). It will be requesting tasks as fast as possible, which means each executed thread may see as few as one new FlowFile in the incoming queue when it runs. That means you could end up with merged FlowFiles that consist of only one FlowFile.

Now let's assume MergeContent runs only every 1 minute, and in between two executions 10,000 new FlowFiles queue on the incoming connection. On the next run, the MergeContent thread sees 10,000 new FlowFiles. It will allocate 6,000 to one bin (because you set a max) and place the other 4,000 in another bin. At the end of the thread, both bins are eligible to be merged because they both met the min, but as you can see, one did end up with 6,000 FlowFiles in it.

- If the intent is never to have one FlowFile in a merge, do not set min to 1.
- If the flow feeding MergeContent is slow, change the run schedule so it does not run as often, allowing more FlowFiles to queue between executions.
- If setting min to any value beyond 1, make sure you also set Max Bin Age. This setting makes sure that a bin will eventually be merged even if it never meets the configured minimums.

Hope this clarifies how this processor works; a configuration sketch follows below. Thanks, Matt
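A minimal MergeContent configuration sketch for the guidance above; the values are illustrative, not taken from this thread:

```
# MergeContent scheduling and binning (illustrative values)
Run Schedule              : 60 sec   # let FlowFiles accumulate between executions
Minimum Number of Entries : 100      # never merge a bin of one
Maximum Number of Entries : 6000     # cap on bin size
Max Bin Age               : 5 min    # merge the bin eventually even if min is never met
```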
03-27-2018
12:33 PM
@Jayendra Patil You currently have 120 set as your "Maximum Timer Driven Thread Count". Multiply that by the number of nodes in your NiFi cluster to get the maximum number of usable threads cumulatively across your cluster (see the worked example below). Then look at the info bar across the top of your canvas: does it look like your dataflow is using all the threads you have allocated?

You may need to adjust your processor configurations to maximize thread usage. Look for bottlenecks in your dataflow (queues built up in front of processors). What kinds of processors are reading from these built-up queues? How have they been configured? Just because you allocated more available threads does not mean NiFi processors will automatically start using them, or even be allowed to use them.
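As a worked example (the node count here is assumed for illustration):

```
Max Timer Driven Thread Count : 120 per node
Nodes in cluster (assumed)    : 3
Usable threads, cluster-wide  : 120 x 3 = 360
```

To let a single bottleneck processor use more of those threads, you can raise its Concurrent Tasks setting on the processor's Scheduling tab.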
03-23-2018
02:31 PM
@Mark Lin
Did you look at the MonitorActivity processor? You can set a threshold after which it will send out an inactive message (example message: "Have not seen any failed FlowFiles for X amount of time"). Then, when data starts failing again, it will trigger an activity.restored message (example message: "Seeing failed FlowFiles now"). This processor can be configured to create the above messages only once, and it could fit into a failure loop as shown above. Thanks, Matt
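A minimal MonitorActivity sketch for this pattern (the threshold value is illustrative):

```
# MonitorActivity (illustrative threshold)
Threshold Duration        : 5 min    # how long without FlowFiles before "inactive" fires
Continually Send Messages : false    # emit the inactive message once, not on every run
# Route the "inactive" relationship to a PutEmail for the alert, and the
# "activity.restored" relationship to a PutEmail for the all-clear message.
```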
03-23-2018
12:35 PM
1 Kudo
@Mark Lin Your PutMongo processor could route to failure for many reasons (it may not even be an issue with MongoDB itself), for example a network outage or network issue during transfer. With a stopped dummy processor, you end up stalling delivery of files that would otherwise be successful on a retry.

I suggest a slightly more involved failure loop design: one where you retry the FlowFiles X number of times before triggering an email or directing the FlowFiles to a holding queue. Inside the "Retry Check Loop" process group I have the following flow (see the sketch below): simply leave the "Reset retry counter" UpdateAttribute processor stopped so that FlowFiles queue in front of it after 3 delivery attempts have been made. Running that processor resets the counter to zero and passes those FlowFiles back out to the PutMongo processor again.

Here is a template of the above "Retry Check Loop" process group: retry-loop-example.xml. You can import this template directly into your NiFi.

Hope this helps, Matt
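A minimal sketch of the counter logic inside such a loop, using UpdateAttribute and RouteOnAttribute; the attribute name retry.count is illustrative:

```
# UpdateAttribute "Increment retry counter" (illustrative attribute name)
retry.count = ${retry.count:replaceNull(0):plus(1)}

# RouteOnAttribute "Retry Check" -- send to email/holding queue after 3 attempts
over.limit  = ${retry.count:ge(3)}

# UpdateAttribute "Reset retry counter" (left stopped; start it manually to re-drive)
retry.count = 0
```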
03-22-2018
08:44 PM
1 Kudo
@Sami Ahmad You will notice a small number in the upper-right corner of the PutHiveStreaming processor. This indicates that there is an active thread in progress. "In" shows the number of FlowFiles that were processed off an inbound connection in the last 5 minutes; a number will not be reported there until processing completes (successfully or otherwise). FlowFiles remain on the inbound connection until they have been successfully processed (this is so NiFi can recover if it dies mid-processing).

You can collect a NiFi thread dump to analyze what is going on with this PutHiveStreaming thread:

./nifi.sh dump <name of dumpfile>

Thanks, Matt
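For example, from the NiFi installation directory (the dump filename here is illustrative):

```
./bin/nifi.sh dump thread-dump.txt
# then look for the stuck thread in the dump
grep -n "PutHiveStreaming" thread-dump.txt
```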
03-22-2018
02:55 PM
@ANKIT PATEL A NiFi FlowFile (this is what moves from processor to processor in NiFi) consists of two parts: FlowFile content (the actual data) and FlowFile attributes (key/value metadata about the FlowFile). Different processors that create FlowFiles generate different attributes, which are assigned to the FlowFile. Attributes include things like filename, fileSize, uuid, path, etc.

According to the documentation for the GetFTP processor, a set of FlowFile attributes (including filename and path) is written on each FlowFile produced. The NiFi Expression Language (EL) can then be used to do things with these key/value pairs, for example setting a specific target directory for writing a FlowFile's content.

Assuming you are writing FlowFiles out of NiFi using, say, PutFile: the EL ${path} will return the value of the FlowFile attribute named "path" and use it as the target directory path for writing the content. You can even add to that path if you like, for example /mynewdir/nifi/${path}, which appends the value of path to the end of /mynewdir/nifi/.

Thank you, Matt
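A minimal sketch of the PutFile Directory property using EL (the /mynewdir/nifi prefix is the example from above):

```
# PutFile "Directory" property (illustrative)
Directory : /mynewdir/nifi/${path}
# A FlowFile from GetFTP carrying path = reports/2018 would have its
# content written under /mynewdir/nifi/reports/2018
```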
03-22-2018
12:53 PM
1 Kudo
@Utkal Sinha When you start NiFi, it kicks off the bootstrap process, which then kicks off the NiFi service (there are actually two Java processes associated with a running NiFi). Java process one is the bootstrap, and it logs to nifi-bootstrap.log. It is responsible for kicking off the other Java process and monitoring it to make sure it has not died; if it does die, the bootstrap will restart it automatically. The main Java process logs via nifi-app.log, and if you tail this log after starting NiFi you will see everything this process does during startup (reading configuration files, building repos, unpacking NARs, loading up your dataflows, starting processor components, etc.). The NiFi UI will not be available until all of this has completed successfully. The key lines you are looking for in nifi-app.log are:

2018-03-22 12:46:05,076 INFO [main] org.apache.nifi.web.server.JettyServer NiFi has started. The UI is available at the following URLs:
2018-03-22 12:46:05,076 INFO [main] org.apache.nifi.web.server.JettyServer http://<hostname>:9090/nifi
2018-03-22 12:46:05,083 INFO [main] org.apache.nifi.BootstrapListener Successfully initiated communication with Bootstrap
2018-03-22 12:46:05,083 INFO [main] org.apache.nifi.NiFi Controller initialization took 99393967121 nanoseconds (99 seconds)

There may be multiple URLs listed depending on your configuration. Once you see the line telling you NiFi has started and is available at the listed URLs, you can try to access the NiFi UI. Thank you, Matt
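To watch for those lines during startup, from the NiFi installation directory:

```
tail -f logs/nifi-app.log
# or wait specifically for the "started" line:
tail -f logs/nifi-app.log | grep --line-buffered "NiFi has started"
```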
03-22-2018
12:27 PM
1 Kudo
@ANKIT PATEL I am not sure which NiFi processors you are using in your dataflow here, but take a look at the FlowFile attributes being created on the FlowFiles containing the files from your database. You could likely use one of those attributes to re-create the directory structure you are looking for when writing the files back out of NiFi. Thank you, Matt
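As an illustration only (the attribute names below are hypothetical; check which attributes your source processors actually write):

```
# PutFile "Directory" property (hypothetical attribute names)
Directory : /export/${db.schema}/${db.table.name}
```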
03-21-2018
06:47 PM
@Mark Lin ZooKeeper is responsible for electing both the cluster coordinator and the primary node for a NiFi cluster. Reasons why ZK may elect a new primary node include the current primary node not having heartbeated to ZK:

- possibly because of network issues;
- possibly because the current cluster coordinator and/or primary node is having an issue that prevents the heartbeat from being sent, such as Java garbage collection. GC is a stop-the-world event, so heartbeats are not sent out while it is running; by the time GC ends, ZK may have already elected a new cluster coordinator and/or primary node. The node would be notified of the change the next time it successfully talked to ZK, so you may never see the node actually become disconnected from the cluster.

Nodes send heartbeats to the currently elected cluster coordinator. As long as those heartbeats arrive within the configured timeouts, nodes stay connected to the cluster. Thank you, Matt
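The relevant timeouts live in nifi.properties; a sketch with default-style values (verify against your own installation):

```
# nifi.properties (illustrative values; confirm against your installation)
nifi.zookeeper.connect.string=zk1:2181,zk2:2181,zk3:2181
nifi.zookeeper.session.timeout=3 secs
nifi.cluster.node.connection.timeout=5 secs
nifi.cluster.node.read.timeout=5 secs
```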