Member since
07-30-2019
3472
Posts
1642
Kudos Received
1020
Solutions
03-21-2023
11:39 AM
@udayAle @ep_gunner When NiFi is brought down, the current state (stopped, started, enabled, disabled) of every component is retained, and on startup that same state is restored on the components. The only time this is not true is when the property "nifi.flowcontroller.autoResumeState" is set to false in the nifi.properties file. When it is set to false, a restart of NiFi results in all components coming up in a stopped state. In a production environment, this property should be set to true.

Perhaps you can share more details on the maintenance process you are using, as I am not clear on how your maintenance is impacting the last known state of some components.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
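A quick way to confirm the setting described above is to read nifi.properties and look at the value. This is only a minimal sketch: the helper names are hypothetical, the parser handles plain key=value lines only, and a missing property is reported as "not enabled" here, which may differ from how NiFi itself treats an absent value.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def auto_resume_enabled(props: dict[str, str]) -> bool:
    # Assumption: only the literal value "true" (any case) counts as enabled.
    value = props.get("nifi.flowcontroller.autoResumeState", "")
    return value.lower() == "true"
```

To use it, read the file from your own install (for example the nifi.properties file in the conf directory) and pass its text to parse_properties before calling auto_resume_enabled.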
03-21-2023
11:32 AM
@srilakshmi NiFi only offers HA at the controller level and not at the data/FlowFile level. HA at the controller level is possible due to NiFi's zero master clustering capability, which relies on a Zookeeper (ZK) quorum to elect an available NiFi node as the cluster coordinator. If the currently elected cluster coordinator goes down, ZK elects another active node to assume this role. Zero master clustering allows you to access your NiFi cluster from any one of the active cluster nodes.

Each node in the NiFi cluster has its own identical copy of the flow and its own set of repositories. NiFi nodes cannot share repositories. So any FlowFile queued on a node that goes down will remain on that node until it is brought back online. This is what you are observing, based on the description you provided.

When you execute the command to shut down NiFi, it does initiate a graceful shutdown. The amount of time allowed for this graceful shutdown is controlled by this configuration property in the nifi.properties file: nifi.flowcontroller.graceful.shutdown.period. The default is 10 seconds. If any active thread does not complete within that graceful shutdown period, the thread is killed with the JVM. This will not result in data loss, since a FlowFile is not removed from the inbound connection of a processor unless the thread completed and the FlowFile was successfully transferred to an outbound connection. On startup, the NiFi flow is loaded, FlowFiles are loaded back into their connections, and then components are enabled and started.

I'd be interested in your test dataflow, what logging you are looking for, and from which processor component. Have you checked NiFi's data provenance to search for the lineage of the 3 FlowFiles you were missing logging for?

Thank you, Matt
03-21-2023
06:32 AM
@davehkd When your nodes become disconnected, a reason is logged, and the most recent events are also viewable in the cluster UI within the NiFi interface. So the first question is: what reason is given for the node disconnections? Is it reporting a communication exception with Zookeeper, or is it reporting disconnection due to lack of heartbeat (more common)?

Within a cluster, a node is elected as the cluster coordinator by ZK, and the nodes then begin sending health and status heartbeats to that cluster coordinator. The default is every 5 seconds. The elected cluster coordinator expects to receive at least one heartbeat within 8x the configured heartbeat interval, so within 40 seconds. This is a pretty aggressive setting for NiFi clusters under heavy load or high heap pressure caused by dataflow design. So first make sure that every node in your cluster has the same configured heartbeat interval value (mixed values will definitely cause lots of node disconnections). If you are seeing lack of heartbeat as the reason for disconnection, adjust the heartbeat interval to 30 seconds. A node would then have to go without a successful heartbeat for 4 minutes, instead of 40 seconds, before being disconnected.

As far as GC goes, GC is triggered when Java heap utilization reaches around 80%. How much memory have you configured your NiFi to use? Setting the heap really high for no reason results in longer GC stop-the-world events. Generally NiFi is configured with 16 GB to 32 GB for most use cases. If you find yourself needing more than that, you should take a closer look at your dataflow implementations.

The NiFi heap holds many things, including the following:

- flow.json.gz is unpacked and loaded into heap memory on startup. flow.json.gz includes everything you have added and configured via the NiFi UI (flows, controller settings, registry clients, templates, etc.). Templates are a deprecated method of creating flow snippets for reuse. They are held in heap because they are part of the flow.json.gz, even though they are not part of any active dataflow. Downloading them for external storage and deleting them from within NiFi will reduce heap usage.
- Users and groups synced from LDAP, if you are using the ldap-user-group-provider. You should make sure that you have configured filters on this provider so that you limit the number of groups and users to only those that will actually be accessing your NiFi.
- FlowFiles are what you see queued between processor components in the UI. In heap, a FlowFile consists of metadata/attributes about the FlowFile. NiFi has built-in swap settings for how many FlowFiles can exist in a given queue before they start swapping to disk (20,000, set via nifi.queue.swap.threshold in nifi.properties). Swap files are always 10,000 FlowFiles. By default, a connection has a backpressure object threshold of 10,000. This means that, with these defaults, a connection is unlikely to generate a swap file because it is unlikely to reach the swap threshold (connection queue thresholds are soft limits). So if you have lots of connections with queued FlowFiles, you will have more heap usage. Generally speaking, a FlowFile's default metadata attributes amount to very little heap usage, but users can write whatever they want to FlowFile attributes. If you are extracting and writing large amounts of content to FlowFile attributes in your dataflow(s), you'll have high heap usage and should question why you are doing this.
- NiFi processor components. Some processors have resource considerations that users should take into account when using them. The embedded documentation within your NiFi has a resource considerations section in each such processor's docs. Look to see if you are using any processors with heap/memory considerations. Often heap usage can be reduced through dataflow design modifications.

I hope these details help you dig into your heap usage and make adjustments to improve your cluster stability.

Thank you, Matt
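The heartbeat arithmetic above can be sketched as a small calculation. The function name is hypothetical, and the factor of 8 is taken from the description in this answer; check your own NiFi version's documentation before relying on it.

```python
def heartbeat_window_seconds(interval_seconds: int, multiplier: int = 8) -> int:
    """Time the coordinator waits without a heartbeat before disconnecting a node.

    Assumption: the window is simply interval * multiplier, with the
    multiplier of 8 taken from the explanation above.
    """
    return interval_seconds * multiplier


print(heartbeat_window_seconds(5))   # default 5 second interval -> 40
print(heartbeat_window_seconds(30))  # suggested 30 second interval -> 240 (4 minutes)
```

The same interval must be configured on every node, so the calculation applies cluster-wide rather than per node.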
03-20-2023
12:58 PM
@apmmahesh You created certificates for each of your NiFi nodes. Based on the exception you shared, it appears that you created DNs for those nodes as follows?

CN=node1, OU=NIFI
CN=node2, OU=NIFI
CN=node3, OU=NIFI

When you have a NiFi cluster, you can manage that cluster via the UI of any one of the connected nodes. So let's say you authenticate via a mutual TLS handshake to node1 using the CN=admin, OU=NIFI certificate you created for yourself and loaded in your browser. What happens next is that node1 wants to show you the data/details from all three nodes and not just node1, so your request to load the NiFi UI is sent via proxy by node1 to whichever node is the elected cluster coordinator. That cluster coordinator replicates the request on your behalf to all nodes in the cluster. This is how the node1 UI is able to show you details about connected nodes, queued data from other nodes, etc. This means that node1 needs to be authorized to proxy user requests.

Typically, on first startup a secured NiFi uses the configuration in your authorizers.xml to set up these needed default authorizations, but your configuration is missing your nodes, so this was not done. Inside your file-user-group-provider, you also need to add your NiFi node DNs as users.

<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"></property>
<property name="Initial User Identity 1">CN=admin, OU=NIFI</property>
<property name="Initial User Identity 2">CN=node1, OU=NIFI</property>
<property name="Initial User Identity 3">CN=node2, OU=NIFI</property>
<property name="Initial User Identity 4">CN=node3, OU=NIFI</property>
</userGroupProvider>

Then, in your file-access-policy-provider, you need to add your nodes so that when it generates the authorizations.xml file, the nodes are authorized for the "proxy user requests" policy:

<accessPolicyProvider>
<identifier>file-access-policy-provider</identifier>
<class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
<property name="User Group Provider">file-user-group-provider</property>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Initial Admin Identity">CN=admin, OU=NIFI</property>
<property name="Legacy Authorized Users File"></property>
<property name="Node Identity 1">CN=node1, OU=NIFI</property>
<property name="Node Identity 2">CN=node2, OU=NIFI</property>
<property name="Node Identity 3">CN=node3, OU=NIFI</property>
<property name="Node Group"></property>
</accessPolicyProvider>

NOTE: NiFi will only create the users.xml and authorizations.xml files from the above two providers if they do NOT already exist. Making changes to these providers will not result in changes to existing files. The expectation is that, after access for your initial admin and your proxy nodes is established, all new authorizations are set up via the NiFi UI, which results in updates to these files. So rename your existing users.xml and authorizations.xml before starting your NiFi so that new ones are created.

Thank you, Matt
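One way to catch the mismatch described above before restarting is to compare the two providers in authorizers.xml. This is a rough sketch, not a NiFi tool: the function names are hypothetical, and it only looks at the "Initial User Identity N" and "Node Identity N" properties shown in this answer (it ignores other user group providers such as LDAP).

```python
import xml.etree.ElementTree as ET


def _property_values(provider: ET.Element, prefix: str) -> set[str]:
    """Collect non-empty <property> values whose name starts with prefix."""
    values = set()
    for prop in provider.findall("property"):
        name = prop.get("name", "")
        value = (prop.text or "").strip()
        if name.startswith(prefix) and value:
            values.add(value)
    return values


def nodes_missing_from_users(authorizers_xml: str) -> set[str]:
    """Return node identities that are not declared as initial users."""
    root = ET.fromstring(authorizers_xml)
    users: set[str] = set()
    for provider in root.iter("userGroupProvider"):
        users |= _property_values(provider, "Initial User Identity")
    nodes: set[str] = set()
    for provider in root.iter("accessPolicyProvider"):
        nodes |= _property_values(provider, "Node Identity")
    return nodes - users
```

An empty result means every node DN listed in the access policy provider also appears as a user. The comparison is an exact string match, so differences in case or spacing within a DN will be reported as missing.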
03-17-2023
12:59 PM
@wffger2 A flow definition is a snippet of the components contained within the Process Group (PG) from which the flow definition was exported. The import of a flow definition is handled differently, since a flow definition can be imported over and over to the same NiFi or to different NiFis. On import, the components are assigned new unique component UUIDs. So when you created/downloaded your flow definition from DEV and then imported it to UAT, the UAT components ended up with different UUIDs.

What you should be doing is installing a NiFi-Registry [1] that both your DEV and UAT environments can connect to. This allows you to version control a Process Group (PG) in your DEV environment and then load that version controlled PG into your UAT environment. While the component UUIDs in UAT will still be different from DEV, both PGs will track back to the same version controlled flow stored in the NiFi-Registry. As you make changes in DEV to components in the version controlled PG, the DEV PG will report that local changes exist. You can commit those local changes as a new version of the PG, at which time the same PG in your UAT env will report that a newer version is available, which you can change to. You will also be able to see, from within NiFi, the differences between the most recent version in NiFi-Registry and what is local to each NiFi.

[1] https://nifi.apache.org/registry.html

Thank you, Matt
03-17-2023
07:30 AM
@anoop89 This issue is unrelated to the original thread. Please start a new question. Feel free to tag @ckumar and @MattWho in your question so we get notified. This issue is related to the authorization of your user. Thanks, Matt
03-14-2023
05:56 AM
@srilakshmi The PublishKafka and PublishKafkaRecord processors do not write any new attributes to the FlowFile when there is a failure. They simply log the failure to the nifi-app.log and route the FlowFile to the failure relationship. So there is no unique error written on the FlowFile that can be used for dynamic routing on failure. It could be expensive to write stack traces that come out of the client code to NiFi FlowFiles, considering that FlowFile attributes/metadata reside in NiFi heap memory.

This may be a topic you want to raise in the Apache NiFi Jira as a feature/improvement request on these processors, to get feedback from the Apache NiFi community committers.

Thank you, Matt
03-13-2023
12:35 PM
@davehkd I am not sure I am clear on the ask. Are you having issues with your 5 node NiFi cluster? As far as certificates go for NiFi, it really does not matter where you obtain them, or whether you use self-signed certificates (not recommended), as long as the keystore meets the requirements for NiFi.

A NiFi node's keystore must meet the following requirements:
1. The keystore contains only 1 PrivateKey entry. You cannot have multiple PrivateKey entries in the keystore, since NiFi will not know which one to use.
2. The keystore PrivateKey entry MUST have an Extended Key Usage (EKU) of clientAuth and serverAuth. NiFi nodes communicate with one another and thus act as both clients and servers in the TLS exchange.
3. The keystore PrivateKey entry must contain a DNS entry for the hostname on which the certificate is being used.

A NiFi node's truststore contains one or more trustedCertEntries. It needs to contain the complete trust chain for any client certificates that will be used to authenticate with NiFi via a mutual TLS handshake. This includes the complete trust chain for each node in your cluster. A trust chain consists of every intermediate CA public cert, all the way up to the root CA public cert. The root CA will have the same owner and issuer. The cacerts file that is included with most Java distributions is a truststore containing the intermediate and root CAs of most public signing authorities.

You can obtain a verbose listing of your keystore/truststore using the keytool command found in your Java install:

<path to JDK>/bin/keytool -v -list -keystore <keystore or truststore filename>

From the output, verify the requirements listed above on the PrivateKey entry (the DNSName will have your node's hostname).

Thank you, Matt
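The three keystore requirements above can be checked roughly against the text that keytool prints. This is a sketch under assumptions: the function name is hypothetical, and the marker strings ("Entry type: PrivateKeyEntry", "clientAuth", "serverAuth", "DNSName:") are assumed to match your JDK's verbose listing, so compare them with your actual output first. The EKU check searches the whole listing, so it can be satisfied by a certificate elsewhere in the chain rather than by the node's own certificate.

```python
import re


def check_keystore_listing(listing: str, hostname: str) -> list[str]:
    """Return a list of problems found in a verbose keytool listing."""
    problems = []

    # Requirement 1: exactly one PrivateKey entry.
    private_entries = listing.count("Entry type: PrivateKeyEntry")
    if private_entries != 1:
        problems.append(f"expected 1 PrivateKeyEntry, found {private_entries}")

    # Requirement 2: both clientAuth and serverAuth Extended Key Usages.
    for usage in ("clientAuth", "serverAuth"):
        if usage not in listing:
            problems.append(f"missing Extended Key Usage: {usage}")

    # Requirement 3: a DNSName entry that is exactly the node's hostname.
    pattern = rf"DNSName:\s*{re.escape(hostname)}\s*$"
    if not re.search(pattern, listing, re.MULTILINE | re.IGNORECASE):
        problems.append(f"no DNSName entry for {hostname}")

    return problems
```

To use it, save the output of the keytool command shown above to a file, read it as text, and pass it in together with the node's hostname. An empty list means none of these three checks failed; it does not prove the certificate is otherwise valid.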
03-13-2023
07:22 AM
@dyhiamedjouti It would also be helpful if you shared the full version of Apache NiFi that you installed. The latest versions of NiFi start as secured by default. The Single User "username" and "password" are only output to the log the very first time NiFi is started. Subsequent restarts of the NiFi service will not log the username and password again. You can stop your NiFi and run the following command to set your own single user identity provider username and password:

$ ./bin/nifi.sh set-single-user-credentials <username> <password>

Then, when NiFi is up and running, you can use the username and password you set to access the UI.

Not knowing your username and password has nothing to do with the browser not being able to load the NiFi UI for logging in. When you launch NiFi, this starts the NiFi bootstrap process, which then launches a child process that is the main NiFi process. When this child process starts, logging begins in the nifi-app.log. The NiFi UI will not be accessible until this process has loaded completely and successfully. NiFi will log a few lines that contain "UI is available at the following URLs". You'll want to verify that you find these log lines and the URLs listed. These are the URLs you will use in your browser to access your NiFi.

If you do not see the URLs output in the logs, that means NiFi failed to start successfully. Again, the nifi-app.log should provide logging details as to why the child process failed during startup. This is commonly a result of misconfiguration. If you are not seeing a nifi-app.log produced, then check the nifi-bootstrap.log for any exceptions.

Thank you, Matt
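The log check described above can be sketched as a small script that looks for the startup message and collects the URLs logged right after it. The function name is hypothetical, and the assumption that the URLs appear on the lines immediately following the message should be verified against your own nifi-app.log.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
BANNER = "UI is available at the following URLs"


def ui_urls_from_log(log_text: str) -> list[str]:
    """Collect URLs from the lines that directly follow the startup message."""
    urls = []
    collecting = False
    for line in log_text.splitlines():
        if BANNER in line:
            collecting = True
            continue
        if collecting:
            found = URL_PATTERN.findall(line)
            if found:
                urls.extend(found)
            else:
                # First line without a URL ends the block after the message.
                collecting = False
    return urls
```

An empty result for a log that should contain a completed startup suggests NiFi did not finish starting, in which case the rest of nifi-app.log (or nifi-bootstrap.log) is the place to look.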
03-09-2023
12:54 PM
@RRosa That particular exception seems to point to an issue with the ldap-provider configuration in your NiFi-Registry, possibly related to the Manager DN property not being set. I would need to see your nifi-registry.properties and authorizers.xml to provide more context around the above exception.

Yes, OIDC is supported in NiFi-Registry 1.19.1. When accessing a secured (TLS/SSL enabled) NiFi-Registry, the UI is displayed as the "anonymous" user. Only "public" buckets will be visible. In order to log in via OIDC, you would need to click on the login via OIDC link in the UI.

OIDC properties:
nifi.registry.security.user.oidc.discovery.url=
nifi.registry.security.user.oidc.connect.timeout=5 secs
nifi.registry.security.user.oidc.read.timeout=5 secs
nifi.registry.security.user.oidc.client.id=
nifi.registry.security.user.oidc.client.secret=
nifi.registry.security.user.oidc.preferred.jwsalgorithm=
nifi.registry.security.user.oidc.additional.scopes=
nifi.registry.security.user.oidc.claim.identifying.user=

Thank you, Matt