Member since 11-22-2025
11 Posts
1 Kudos Received
1 Solution
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1396 | 12-11-2025 03:06 PM |
02-04-2026
10:03 AM
@zzzz77 Provenance can be very noisy depending on the size of your dataflows and the number of FlowFiles being processed through them. The provenance repository has both age and size settings that trigger roll-off of old events, so you may never reach the retention age if you hit the size limit first. I would also avoid trying to read provenance files while they are being written to.

The SiteToSiteProvenanceReportingTask might be the solution you are looking for in Apache NiFi. This reporting task sends all provenance events over the Site-To-Site protocol to a target NiFi, where you can then feed them into any long-term storage medium of your choice in a human-readable format.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
02-04-2026
09:46 AM
1 Kudo
@zzzz77 I can certainly help you with the structured setup commonly used when integrating NiFi with LDAP.

NiFi authentication and authorization are separate processes with separate configurations. You can even authenticate using LDAP and not use LDAP at all during authorization. Also be aware that only a secured NiFi setup over HTTPS can support authentication and authorization.

Since authentication needs to happen first, we'll start there. LDAP authentication is configured as a login provider inside the login-identity-providers.xml configuration file:
<provider>
<identifier>ldap-provider</identifier>
<class>org.apache.nifi.ldap.LdapProvider</class>
<property name="Authentication Strategy">START_TLS</property>
<property name="Manager DN"></property>
<property name="Manager Password"></property>
<property name="TLS - Keystore"></property>
<property name="TLS - Keystore Password"></property>
<property name="TLS - Keystore Type"></property>
<property name="TLS - Truststore"></property>
<property name="TLS - Truststore Password"></property>
<property name="TLS - Truststore Type"></property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol"></property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url"></property>
<property name="User Search Base"></property>
<property name="User Search Filter"></property>
<property name="Identity Strategy">USE_DN</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
The actual configuration depends on your LDAP setup. You can refer to the linked documentation for each field. Depending on the "Authentication Strategy" setting, the TLS properties may not need to be configured. The "Identity Strategy" decides what string is used as the authenticated user's identity. The options are "USE_DN" (use the full DN from the LDAP entry) or "USE_USERNAME" (use the username as typed in the login window). USE_USERNAME is commonly used.

The "identifier" for this provider is "ldap-provider". This identifier needs to be configured in the nifi.properties file so NiFi knows which login provider it should use:
nifi.security.user.login.identity.provider=ldap-provider

Now we need to set up the authorizers.xml file so we can set up authorizations for the LDAP users. Here you have two options: you can manually add the LDAP user identities via the "user-group-provider", or you can sync the user identities directly from LDAP using the "ldap-user-group-provider". Sometimes you want both, if not all of your users/clients are part of LDAP (this applies to user identities derived from clientAuth certificates during a mutualTLS exchange). Both would commonly be necessary for a NiFi cluster setup. Since you are setting up a single-instance (non-cluster) NiFi, I'll show how to structure your authorizers.xml file using just the ldap-user-group-provider:
<userGroupProvider>
<identifier>ldap-user-group-provider</identifier>
<class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
<property name="Authentication Strategy">SIMPLE</property>
<property name="Manager DN">cn=Manager,dc=nifi,dc=hwx</property>
<property name="Manager Password">password</property>
<property name="TLS - Keystore"></property>
<property name="TLS - Keystore Password"></property>
<property name="TLS - Keystore Type"></property>
<property name="TLS - Truststore"></property>
<property name="TLS - Truststore Password"></property>
<property name="TLS - Truststore Type"></property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol"></property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldap://<ip or hostname>:389</property>
<property name="Page Size">500</property>
<property name="Sync Interval">30 mins</property>
<property name="Group Membership - Enforce Case Sensitivity">false</property>
<property name="User Search Base">ou=People,dc=nifi,dc=hwx</property>
<property name="User Object Class">inetOrgPerson</property>
<property name="User Search Scope">SUBTREE</property>
<property name="User Search Filter"></property>
<property name="User Identity Attribute">cn</property>
<property name="User Group Name Attribute">memberOf</property>
<property name="User Group Name Attribute - Referenced Group Attribute"></property>
<property name="Group Search Base">ou=Group,dc=nifi,dc=hwx</property>
<property name="Group Object Class">groupOfNames</property>
<property name="Group Search Scope">SUBTREE</property>
<property name="Group Search Filter"></property>
<property name="Group Name Attribute">cn</property>
<property name="Group Member Attribute">member</property>
<property name="Group Member Attribute - Referenced User Attribute"></property>
</userGroupProvider>
<accessPolicyProvider>
<identifier>file-access-policy-provider</identifier>
<class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
<property name="User Group Provider">ldap-user-group-provider</property>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Initial Admin Identity">nifiadmin</property>
<property name="Node Identity 1"></property>
<property name="Node Group"></property>
</accessPolicyProvider>
<authorizer>
<identifier>managed-authorizer</identifier>
<class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
<property name="Access Policy Provider">file-access-policy-provider</property>
</authorizer>
The above is the most basic setup example, assuming an unsecured LDAP setup. You can see it has three sections. The best way to read an authorizers.xml configuration is from the bottom up, starting with the "authorizer". In this example I am using the "StandardManagedAuthorizer", which has an identifier of "managed-authorizer" and is configured to reference the "file-access-policy-provider". So the next provider we should find going up through the authorizers.xml will be the provider with the identifier "file-access-policy-provider".

The "FileAccessPolicyProvider" is responsible for persisting the granted authorizations in a file named "authorizations.xml". This provider will also set some initial authorizations for the user identity set in the "Initial Admin Identity" field and for any "Node Identity <num>" field entries. We can see that this provider learns about users and groups from the "ldap-user-group-provider".

IMPORTANT NOTES: This provider will only create the authorizations.xml file if it does NOT already exist. So if you make any changes to this provider, those changes will not be reflected in an already existing authorizations.xml file. Also, any identity strings set in this provider must be returned by one of the configured user group providers.

So the next provider needed has the identifier "ldap-user-group-provider" and needs to be located further up in this authorizers.xml file. There we locate the "LdapUserGroupProvider", which has this identifier. This provider has no reference to any additional providers. While I shared a very basic sample configuration, your configuration will be specific to your LDAP server. My example is configured to sync users and groups from LDAP. You can choose to sync users only, or users and groups. You cannot sync just groups.
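The bottom-up reading described above can also be checked mechanically. The following is a minimal Python sketch, not part of NiFi, that follows the references from the authorizer to the access policy provider to the user group provider and fails if an identifier does not resolve. The helper names (`resolve_chain`, `_prop`, `_by_identifier`) and the trimmed sample file are illustrative assumptions; the sample assumes the three sections sit inside the usual `<authorizers>` root element.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample using the identifiers from the example above.
# Only the elements needed to follow the reference chain are kept.
SAMPLE = """<authorizers>
  <userGroupProvider>
    <identifier>ldap-user-group-provider</identifier>
    <class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
  </userGroupProvider>
  <accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
    <property name="User Group Provider">ldap-user-group-provider</property>
  </accessPolicyProvider>
  <authorizer>
    <identifier>managed-authorizer</identifier>
    <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
    <property name="Access Policy Provider">file-access-policy-provider</property>
  </authorizer>
</authorizers>"""


def _prop(element, name):
    """Return the text of the named <property> child, or None if absent/empty."""
    for prop in element.findall("property"):
        if prop.get("name") == name:
            return (prop.text or "").strip() or None
    return None


def _by_identifier(root, tag, identifier):
    """Return the <tag> element whose <identifier> matches, or None."""
    for element in root.findall(tag):
        if (element.findtext("identifier") or "").strip() == identifier:
            return element
    return None


def resolve_chain(xml_text, authorizer_id):
    """Follow authorizer -> access policy provider -> user group provider."""
    root = ET.fromstring(xml_text)
    authorizer = _by_identifier(root, "authorizer", authorizer_id)
    if authorizer is None:
        raise ValueError(f"no authorizer with identifier {authorizer_id!r}")
    policy_id = _prop(authorizer, "Access Policy Provider")
    policy = _by_identifier(root, "accessPolicyProvider", policy_id)
    if policy is None:
        raise ValueError(f"no accessPolicyProvider with identifier {policy_id!r}")
    group_id = _prop(policy, "User Group Provider")
    if _by_identifier(root, "userGroupProvider", group_id) is None:
        raise ValueError(f"no userGroupProvider with identifier {group_id!r}")
    return [authorizer_id, policy_id, group_id]


if __name__ == "__main__":
    print(" -> ".join(resolve_chain(SAMPLE, "managed-authorizer")))
```

Passing the value of nifi.security.user.authorizer as `authorizer_id` mirrors how the chain is entered; a misspelled identifier anywhere along the way surfaces as a `ValueError` instead of a startup failure.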
Inside the nifi.properties file you will set the authorizer you want to use:
nifi.security.user.authorizer=managed-authorizer

Now that we have the authentication and authorization setup complete, let's walk through what happens when you access NiFi's "https://<hostname>:<port>/nifi" URL. A mutualTLS exchange with the client (browser) will occur in which NiFi will "WANT" a clientAuth certificate. If one is not presented in that exchange, NiFi will redirect to the login UI. Here the user will supply their LDAP username and password.

Assuming the LDAP login identity provider is using "USE_USERNAME" and authentication was successful, the username (case sensitive) as typed in the username field will be passed to the managed authorizer to check what authorizations are in place for that user. Before that user identity reaches the managed authorizer, it is compared against any Identity Mapping Properties configured in the nifi.properties file to see if any string manipulation should happen. Next, the string (manipulated if a mapping was applied) goes to the authorizer. First the authorizer will check whether that user identity belongs to any groups. Then it will check whether the user, or any group that user is known to be a member of (based on what the ldap-user-group-provider sync returned), has the proper authorizations to access the NiFi UI.

If the proper authorizations exist, you will see the NiFi UI and the user identity will show in the upper right corner. If there are authorization issues, you'll find them logged in the nifi-user.log.

Thank you, Matt
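To make the identity mapping step in the walkthrough above more concrete, here is a small Python sketch of what such a mapping does to the identity string before it reaches the authorizer. It is an illustration only, not NiFi code: `map_identity` is a hypothetical helper, and it assumes a mapping is defined by a pattern, a replacement value with Java-style `$1` group references, and a transform of NONE, LOWER, or UPPER (as in the nifi.security.identity.mapping.pattern/value/transform properties), applied only when the pattern matches the whole identity.

```python
import re


def map_identity(identity, pattern, value, transform="NONE"):
    """Apply one pattern/value/transform mapping to an identity string.

    Assumption: the mapping applies only if the pattern matches the entire
    identity; otherwise the identity is passed through unchanged.
    """
    match = re.fullmatch(pattern, identity)
    if match is None:
        return identity
    # Convert Java-style group references ($1) to Python's \g<1> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", value)
    mapped = match.expand(template)
    if transform == "LOWER":
        return mapped.lower()
    if transform == "UPPER":
        return mapped.upper()
    return mapped


if __name__ == "__main__":
    # Normalize the case of a typed username so it matches a synced identity.
    print(map_identity("JSmith", r"^(.*)$", "$1", "LOWER"))
    # Reduce a full DN to its CN value.
    dn_pattern = r"^CN=(.*?), OU=(.*?), DC=(.*?), DC=(.*?)$"
    print(map_identity("CN=jsmith, OU=People, DC=nifi, DC=hwx", dn_pattern, "$1"))
```

Because identities are case sensitive, a LOWER transform like the first call is one way to make a typed "JSmith" line up with a "jsmith" identity returned by the user group provider. The pattern names and sample identities here are made up for the example.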
02-02-2026
07:57 AM
@zzzz77 FlowFile metadata/attributes are held in NiFi heap memory. For queued FlowFiles, there is a configurable swap threshold in nifi.properties; once it is met, FlowFile metadata/attributes are swapped to disk in batches of 10,000 FlowFiles. This swapping is there to minimize excessive heap usage when queues grow large.

FlowFile content is not held in heap memory; however, some processors may need to read the content into heap memory in order to perform their function. If you look at the individual component documentation, you will notice that a "System Resource Considerations" section exists. If heap memory usage is a concern for that processor, it will be documented there. The SplitContent processor docs are one example.

Processors like SplitContent will hold all the FlowFile metadata/attributes (not content) for every split FlowFile being produced in heap memory until all the output FlowFiles have been produced and committed to the downstream connection. These FlowFiles cannot be swapped to disk until they are committed to the downstream connection. So if a SplitContent were to produce 50,000 split FlowFiles, the attributes for all 50,000 would be held in heap. After they are committed to the downstream connection, 40,000 of those would get swapped to disk based on the default swap thresholds. So the heap impact would spike but not persist.

Since you have not shared the specifics of your dataflow in question (which processors you are using), I can't provide any specific feedback. Where is the chunking and de-chunking happening? It sounds like this may be happening at the source and at the destination, with NiFi just moving these chunks from source to destination. How are you sending the chunks to NiFi and transferring them to the destination?

Thank you, Matt
12-17-2025
07:13 PM
Thanks Matt, that's a very useful explanation.
12-11-2025
03:06 PM
Hi Matt, What I discovered is that when running NiFi on Windows 10 (and possibly Linux as well?), any passwords that have special characters like + or \ etc. need to be percent-encoded (URL-encoded), I think it is. So:
/ becomes %2F
+ becomes %2B
Once I did this, it worked OK.
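The conversions listed above are standard percent-encoding (URL encoding), so they can be generated rather than looked up by hand. A minimal Python sketch using the standard library follows; `encode_password` is a hypothetical helper name and the sample password is made up. Whether a given NiFi setting actually expects an encoded value is not confirmed here and depends on where the password is used.

```python
from urllib.parse import quote, unquote


def encode_password(raw):
    """Percent-encode every reserved character, including '/' and '+'."""
    # safe="" overrides the default, which would leave '/' unencoded.
    return quote(raw, safe="")


if __name__ == "__main__":
    encoded = encode_password("pa/ss+word")
    print(encoded)           # pa%2Fss%2Bword
    print(unquote(encoded))  # pa/ss+word
```

Decoding the result with `unquote` is a quick way to confirm that nothing was lost before pasting the encoded value into a configuration.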
12-02-2025
07:27 AM
Hello @zzzz77, Did the answers help you here? If so, please consider marking the comment that helped you as the solution.
12-02-2025
06:55 AM
@hckorkmaz01 While you are currently still using the Apache NiFi 1.x major release line, it has reached end of life and is no longer receiving contributions. As such, components will not get library updates or security fixes going forward. Apache NiFi 2.x is the currently active major release being contributed to in the community.

The PrometheusReportingTask was deprecated in Apache NiFi 1.x and officially removed in the Apache NiFi 2.x major release. So I would avoid using it, as you will eventually need to move to Apache NiFi 2.x to maintain a secure, supported product release. Technically, though, this reporting task, while not well maintained in the community, is capable of creating a Prometheus endpoint that exposes metrics for all components (including connections) for consumption.

That being said, Cloudera has taken steps to create Cloudera versions of many of the components deprecated and removed in Apache NiFi 2.x, and has also introduced many components not available in any Apache release (PrometheusReportingTask is not one of the components that was retained). https://docs.cloudera.com/cfm/4.11.0/nifi-components-cfm/components/

NOTE: You are already using a considerably older Apache NiFi 1.18 release. Many bug fixes and CVE security issues have been addressed since that release. If you cannot yet move to Apache NiFi 2.x, you should at least be on the most recent 1.x release, Apache NiFi 1.28.

Thank you, Matt