Member since: 07-30-2019
Posts: 3373
Kudos Received: 1616
Solutions: 998
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 30 | 10-20-2025 06:29 AM |
| | 170 | 10-10-2025 08:03 AM |
| | 146 | 10-08-2025 10:52 AM |
| | 142 | 10-08-2025 10:36 AM |
| | 203 | 10-03-2025 06:04 AM |
08-25-2025
05:20 AM
@HoangNguyen Keep in mind that the Apache NiFi Variable Registry no longer exists in Apache NiFi 2.x releases, and there is no further development of the Apache NiFi 1.x versions. NiFi Parameter Contexts, introduced in later Apache NiFi 1.x versions, provide similar capability going forward and should be used instead of the Variable Registry. You'll be required to transition to Parameter Contexts in order to move to Apache NiFi 2.x versions. Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
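As a small illustration of what that migration looks like in a processor property, a Variable Registry reference changes to a Parameter Context reference. The `api.base.url` name below is hypothetical, not from this thread:

```
# NiFi 1.x Variable Registry reference (removed in 2.x):
Remote URL = ${api.base.url}/v1/events

# Equivalent Parameter Context reference (NiFi 1.10+ and 2.x):
Remote URL = #{api.base.url}/v1/events
```

The parameter must be defined in a Parameter Context assigned to the Process Group containing the processor.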
08-12-2025
05:29 AM
@asand3r Here are my observations from what you have shared:

You appear to be having a load balancing issue. The LB icon on the connection indicates it is actively trying to load-balance FlowFiles in that connection (a different icon is shown once load balancing is complete). The shared queue counts show that one node has reached its queue threshold, which prevents it from receiving any more FlowFiles, including from other nodes. So I assume that the ~4,600 FlowFiles on the first two nodes are destined for that third node but cannot be sent because of the queue threshold.

Considering the observations above, I would focus your attention on the node with the queue threshold. Maybe disconnect it from the cluster via the cluster UI and inspect the flow on that node directly. Check the logs on that third node for any reported ERROR or WARN issues. Perhaps the EvaluateJson processor on node 3 only is having issues? Maybe connectivity between the first two nodes and the third node is having issues: node 3 can't distribute any FlowFiles to nodes 1 or 2, and nodes 1 and 2 can't distribute to node 3. Maybe some sync issue happened and node 3 for some reason has a processor stopped, which might explain why stopping and starting on the canvas gets things moving again. If you disconnect only node 3 (the one with the queue threshold exceeded) and then reconnect it to the cluster, do FlowFiles start moving? At reconnection, node 3 will compare its local flow with the cluster flow. If you remove the LB connection configuration, do FlowFiles get processed?

I am curious why in this flow design you have set up a LB connection after the ConsumeKafka processor. This processor creates a consumer group and should be configured according to the number of partitions on the target topic to maximize throughput and prevent rebalancing. Say your topic has 15 partitions: your ConsumeKafka processor would then be configured with 5 concurrent tasks, since 3 nodes x 5 tasks = 15 consumers in the consumer group, and each consumer is assigned to a partition. This spreads consumption across all your nodes, removing the need for the load-balanced connection configuration.

You are also using a rather old version of Apache NiFi (~4 years old). I'd encourage you to upgrade to take advantage of many bug fixes, improvements, and security fixes. Thank you, Matt
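The sizing arithmetic above can be sketched as follows; the 15 partitions and 3 nodes are the hypothetical numbers from this reply, not values read from your cluster:

```python
# Sketch: size ConsumeKafka concurrent tasks so the total number of
# consumers in the group equals the topic's partition count.
partitions = 15   # hypothetical topic partition count from the example
nodes = 3         # NiFi cluster size

# Concurrent tasks to configure on the ConsumeKafka processor (per node).
tasks_per_node = partitions // nodes

# Total consumers in the group: one per partition, so there are no idle
# consumers and no unassigned partitions.
total_consumers = nodes * tasks_per_node
print(tasks_per_node, total_consumers)  # 5 15
```

With exactly one consumer per partition there is nothing left to redistribute across nodes, which is why the load-balanced connection adds no throughput here.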
08-11-2025
10:26 AM
@AlokKumar User authentication using OpenID Connect is documented in the Apache NiFi Admin Guide ("OpenID Connect" section). If you found that any of the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
08-11-2025
10:22 AM
1 Kudo
@AlokKumar Apache NiFi's out-of-the-box configuration utilizes the single-user provider for user authentication and the single-user-authorizer for authorization. When using the single-user-authorizer, you cannot manage authorizations for additional users. So with this out-of-the-box setup you will not see either the Users or Policies options in NiFi, since only the single user generated by NiFi can authenticate, and that user has full access. These providers were created so that NiFi would be secured (HTTPS) on startup and accessible with modern browsers, which now all redirect any http request to https. Prior to Apache NiFi 1.14 these single-user providers did not exist; the out-of-the-box setup of NiFi was unsecured, and users were required to create their own certificates, set up an external mechanism for user authentication, and set up an authorizer that could manage authorization policies for multiple user identities. CFM 2.0.4, referenced in the doc link you shared, is based off Apache NiFi 1.11.4. Also worth noting: Cloudera Flow Management (CFM) has never used the single-user providers. CFM is designed to deploy/install enterprise-ready, secured, managed Apache NiFi clusters.

So before I can help here, I need to understand more about your setup. Which login provider and which authorizer are you using? If you are utilizing the out-of-the-box single-user providers, that is the first thing you will need to change. Understand that Apache NiFi does not provide a multi-user login provider, so for multi-user access you'll need to use an external provider or utilize unique clientAuth certificates for each of your users. You can see what options for user authentication exist in the Apache NiFi Admin Guide under the User Authentication section. Lightweight Directory Access Protocol (LDAP) is probably the most commonly used.

Now these unique users will require unique authorizations (Multi-Tenant Authorization), and that responsibility falls on the NiFi authorizers.xml configuration file. The most common setup for Apache NiFi uses the StandardManagedAuthorizer. This authorizer is then configured to reference the FileAccessPolicyProvider, which enables the Policies option in the NiFi UI global menu. In order to set policies against multiple "user identities", this provider must be made aware of all the possible user identities and, if you want to authorize by groups, must also know which users belong to which groups. So the file-access-policy-provider requires being configured with a user group provider. There are several options, but here are the most common:

- FileUserGroupProvider - Enables the "Users" option in the global menu and allows you to define your user and group identities manually. Reminder: this has nothing to do with authentication.
- LdapUserGroupProvider - Syncs users and groups from an external LDAP.
- Composite implementations - Providers that allow multiple user-group-providers to be used at the same time.

It is common to configure the file-access-policy-provider to use the composite-configurable-user-group-provider, and then configure the composite-configurable-user-group-provider to obtain users and groups from both the file-user-group-provider and the ldap-user-group-provider, where the file-user-group-provider is used to manage the user identities derived from your individual NiFi cluster node identities (yes, even the individual nodes in a NiFi cluster require some authorizations). An example authorizers.xml utilizing the common setup described above can be seen here in the NiFi admin guide: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#composite-file-and-ldap-based-usersgroups Hope the above helps you get your setup configured to manage multiple unique users.
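A minimal authorizers.xml sketch of that common composite setup, assuming LDAP for the user/group sync. The DNs are placeholders and the LDAP connection/search properties are omitted, so treat this as a shape to adapt, not a drop-in configuration:

```xml
<authorizers>
    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./conf/users.xml</property>
        <!-- Placeholder admin and node identities: -->
        <property name="Initial User Identity 1">CN=admin, OU=NIFI</property>
        <property name="Initial User Identity 2">CN=node1, OU=NIFI</property>
    </userGroupProvider>
    <userGroupProvider>
        <identifier>ldap-user-group-provider</identifier>
        <class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
        <!-- LDAP connection and search properties omitted for brevity -->
    </userGroupProvider>
    <userGroupProvider>
        <identifier>composite-configurable-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.CompositeConfigurableUserGroupProvider</class>
        <property name="Configurable User Group Provider">file-user-group-provider</property>
        <property name="User Group Provider 1">ldap-user-group-provider</property>
    </userGroupProvider>
    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">composite-configurable-user-group-provider</property>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Initial Admin Identity">CN=admin, OU=NIFI</property>
        <property name="Node Identity 1">CN=node1, OU=NIFI</property>
    </accessPolicyProvider>
    <authorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">file-access-policy-provider</property>
    </authorizer>
</authorizers>
```

Remember the seeding caveat from above: the initial identities only take effect if users.xml and authorizations.xml do not already exist.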
-------------------- Once you have multi-user authentication and authorization set up, you'll need to authorize your users. All users access the same root Process Group (PG), which is the top-level canvas. It is then common for a unique child PG to be created for each unique user or group. Components on the canvas cannot be hidden; the per-user/group child PGs prevent one user/group from building their dataflows on top of another's. If a user/group is not authorized on a PG, they will only see a dashed-outline PG box with no details. The same goes for processors and other components, and if they try to view the configuration, they will not be able to see that either. Keep in mind that processor components inherit their permissions/authorizations from the parent Process Group, so once you authorize a user or group on a single child PG, they will be able to add processors, deeper child PGs, etc. within that authorized child PG without needing explicit policies on every sub-component. Thank you, Matt
08-05-2025
05:32 AM
1 Kudo
@AlokKumar Apache NiFi does not have a configuration option to let a connection queue list more than the 100 highest-priority FlowFiles. Assuming a normally running dataflow, a queue listing becomes outdated the moment you produce it, as those highest-priority FlowFiles would most likely have been processed by the time the listing is returned to the UI. You could view the provenance history for the processor feeding that connection to get details on all the FlowFiles passed to that downstream connection. Provenance also allows you to look at the current lineage of any FlowFile. Thank you, Matt
08-01-2025
06:39 AM
@Krish98 When you secure NiFi (HTTPS enabled), in the TLS exchange NiFi will either REQUIRE (if no additional methods of authentication are configured) or WANT (when additional methods of authentication are configured, like SAML) a clientAuth certificate. This is necessary for NiFi clusters to work: even when one node communicates with another, the nodes must be authenticated (done via a mutual TLS exchange) and authorized (by authorizing those clientAuth certificates against the necessary NiFi policies). When accessing the NiFi UI, a mutual TLS exchange happens with your browser (the client). If the browser does not respond with a clientAuth certificate, NiFi will attempt the next configured auth method, in your case SAML. Mutual TLS with trusted clientAuth certificates removes the need to obtain and renew tokens, and it simplifies automation tasks with the REST API, whether interacting via NiFi-built dataflows or via external interactions with the NiFi REST API. The clientAuth certificate DN is what is used as the user identity (the final user identity that needs to be authorized is derived from the DN after any Identity Mapping Properties manipulation). Just like your SAML user identities, your clientAuth-certificate-derived user identity needs to be authorized for whichever NiFi policies the requested REST API endpoint requires. Tailing the nifi-user.log while making your REST API calls will show you the derived user identity and the missing policy when a request is not authorized. Thank you, Matt
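As a rough illustration of that DN-to-identity derivation, here is a small Python emulation of one Identity Mapping pattern/value pair. The pattern, value, and DN below are hypothetical examples; real NiFi configures these in nifi.properties using Java regex syntax (`$1` for group references):

```python
import re

# Hypothetical nifi.properties identity mapping, expressed as Python regex:
#   nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?)$
#   nifi.security.identity.mapping.value.dn=$1
pattern = r"^CN=(.*?), OU=(.*?)$"
value = r"\1"  # Java's $1 group reference becomes \1 in Python

# DN presented by a hypothetical automation client certificate.
dn = "CN=automation-client, OU=NIFI"

# The mapped identity is the string that must be authorized against NiFi
# policies, and the string you would see in nifi-user.log.
identity = re.sub(pattern, value, dn)
print(identity)  # automation-client
```

If no mapping pattern matches, NiFi uses the full DN as the identity, which is why tailing nifi-user.log is the reliable way to see the exact string you need to authorize.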
07-30-2025
08:49 AM
@PradNiFi1236 I am really not sure why NiFi is receiving an expired token from your load balancer. This would require a deep dive across multiple services to work through, which is not something I can do through this community forum. This is where using Cloudera products and having a Cloudera license would let you take advantage of Cloudera Professional Services, who can do that deep dive and assist in setting up a workable environment.
07-30-2025
08:38 AM
1 Kudo
@asand3r Using a client certificate eliminates the need for a token when connecting to NiFi. A secured NiFi will always want a client certificate first and will only use other authentication methods when a client certificate is not presented in the TLS exchange. This is how NiFi nodes perform authorized actions between nodes.
07-30-2025
08:08 AM
@GKHN_ Welcome to the Cloudera Community. You shared numerous configurations, and I see numerous configuration issues. Let's start with authentication and authorization basics before diving into them.

Authentication and authorization are two separate processes. First you need to successfully authenticate your user. At the end of a successful authentication you have a user identity string (case sensitive) that NiFi uses to identify your authenticated user, and it is this string that is passed to the NiFi authorizer to determine which policies have been granted to that specific user identity.

You appear to be using the ldap-provider (I assume your nifi.properties has been properly configured to use it and you are being presented with the NiFi login screen). I see you have it configured to take your sAMAccountName value as the username at the login window; however, I also see that you have it configured to use the full DN (USE_DN) returned by your LDAP as your user identity string upon successful authentication. I don't think that is what you want here, so I recommend changing "USE_DN" to "USE_USERNAME", which will pass the username entered at the login window to the authorizer upon successful authentication.

Now when we look at the authorizers.xml you shared, you'll want to read it from the bottom up, starting with the authorizer ("managed-authorizer" in your configuration). The "managed-authorizer" is configured to use the "file-access-policy-provider", so you should scroll up until you find it:

```xml
<accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
    <property name="User Group Provider">file-user-group-provider</property>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Initial Admin Identity">CN=NAME SURNAME,OU=CompanyUsers,OU=Company,DC=company,DC=entp</property>
    <property name="Node Identity 1"></property>
    <property name="Node Group"></property>
</accessPolicyProvider>
```

I see you have configured what I assume is your user's full DN as the "Initial Admin Identity" (hopefully matching, case sensitively, what you see in the upper right corner of the NiFi UI and in the nifi-user.log). This provider generates the authorizations.xml file ONLY if it does not already exist, so any changes you make to this provider will not be applied to an existing authorizations.xml file; you'll need to remove that file until your initial admin can gain access to NiFi. This provider's job is to seed the initial policies for your admin user and, in a cluster setup, for the NiFi nodes. But in order to seed those policies, NiFi needs to know about the configured user identity (the DN you set currently). To do that, the file-access-policy-provider is configured with a "User Group Provider", which we can see you have set to the "file-user-group-provider":

```xml
<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
</userGroupProvider>
```

We can see that in this provider you have not configured any initial user identities, so NiFi will not be able to find the user identity matching the DN set in the file-access-policy-provider in order to seed those initial admin policies. The file-user-group-provider will ONLY generate a users.xml file if one does not already exist, so modifications to this provider will not change an existing users.xml file.

I also see that you have added and configured the "ldap-user-group-provider" in your authorizers.xml, but as you can see from the above, there is no configured dependency from the authorizer down to this user group provider, so it is not being used even though it is configured. For it to be used, it must be referenced by another provider. In your case this would mean adding, for example, the "composite-configurable-user-group-provider". That provider allows you to reference multiple providers: one configurable provider, like the file-user-group-provider, and one or more non-configurable providers, like the ldap-user-group-provider. (A configurable provider is one that allows you to manually define additional user or group identities directly from within the NiFi UI.)

Even though your "ldap-user-group-provider" is not currently being used by your authorizer, it has several configuration issues:

```xml
<userGroupProvider>
    <identifier>ldap-user-group-provider</identifier>
    <class>org.apache.nifi.ldap.tenants.LdapUserGroupProvider</class>
    <property name="Authentication Strategy">SIMPLE</property>
    <property name="Manager DN">LDAP_USER</property>
    <property name="Manager Password">Password1</property>
    <property name="TLS - Keystore">/home/nifi/nifi/nifi-2.4.0/conf/srt.pfx</property>
    <property name="TLS - Keystore Password">Password</property>
    <property name="TLS - Keystore Type">JKS</property>
    <property name="TLS - Truststore">/home/nifi/nifi/nifi-2.4.0/conf/gbkeystore.jks</property>
    <property name="TLS - Truststore Password">Password</property>
    <property name="TLS - Truststore Type">JKS</property>
    <property name="TLS - Client Auth"></property>
    <property name="TLS - Protocol">TLSv1.2</property>
    <property name="TLS - Shutdown Gracefully"></property>
    <property name="Referral Strategy">FOLLOW</property>
    <property name="Connect Timeout">10 secs</property>
    <property name="Read Timeout">10 secs</property>
    <property name="Url">ldap://ldap.entp:389</property>
    <property name="Page Size"></property>
    <property name="Sync Interval">30 mins</property>
    <property name="Group Membership - Enforce Case Sensitivity">false</property>
    <property name="User Search Base">OU=CompanyUsers,OU=Company,DC=company,DC=entp</property>
    <property name="User Object Class">person</property>
    <property name="User Search Scope">ONE_LEVEL</property>
    <property name="User Search Filter">(sAMAccountName={0})</property>
    <property name="User Identity Attribute"></property>
    <property name="User Group Name Attribute"></property>
    <property name="User Group Name Attribute - Referenced Group Attribute"></property>
    <property name="Identity Strategy">USE_USERNAME</property>
    <property name="Group Search Base"></property>
    <property name="Group Object Class">group</property>
    <property name="Group Search Scope">ONE_LEVEL</property>
    <property name="Group Search Filter"></property>
    <property name="Group Name Attribute">cn</property>
    <property name="Group Member Attribute">member</property>
    <property name="Group Member Attribute - Referenced User Attribute"></property>
</userGroupProvider>
```

Let's start with the fact that the following is not even a property of this provider (it only exists in the ldap-provider found in the login-identity-providers.xml configuration file):

```xml
<property name="Identity Strategy">USE_USERNAME</property>
```

While the following property exists in both the ldap-user-group-provider and the ldap-provider, its configuration in the ldap-user-group-provider is incorrect:

```xml
<property name="User Search Filter">(sAMAccountName={0})</property>
```

The "{0}", when used in the ldap-provider within login-identity-providers.xml, substitutes in the username provided at login. The ldap-user-group-provider syncs users without any external input, so here the "{0}" is treated as a literal, resulting in no LDAP results. Typically you would use this filter just as you would with ldapsearch, to limit the number of users returned (for example, only users that are members of specific LDAP groups).

I also see you have the group search partially configured but have no "Group Search Base" configured. You also have no "User Identity Attribute" configured, which tells NiFi which LDAP attribute contains the user identity NiFi will then use; this might be where you put "sAMAccountName".

I recommend going back to the NiFi admin guide and looking at the example configuration found below the StandardManagedAuthorizer section. The fact that you stated you added users and set policies via the NiFi UI tells me that at some point you had a different configuration than the one shared above, one that allowed your initial admin to gain access. Always remember that NiFi is case sensitive, and the user identity (whether it is the username entered at the login window or the user's full DN) must match exactly the user identity you are authorizing against the various policies.
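Putting those corrections together, the handful of properties called out above might look like the fragment below. These values are purely illustrative (the group DN in particular must match your directory), and the unsupported "Identity Strategy" property is simply removed:

```xml
<!-- Illustrative corrections for the ldap-user-group-provider fragment. -->
<property name="User Search Base">OU=CompanyUsers,OU=Company,DC=company,DC=entp</property>
<property name="User Object Class">person</property>
<!-- A static filter (no {0} substitution), e.g. sync only one group's members: -->
<property name="User Search Filter">(memberOf=CN=nifi-users,OU=Company,DC=company,DC=entp)</property>
<!-- LDAP attribute that holds the user identity NiFi will use: -->
<property name="User Identity Attribute">sAMAccountName</property>
<!-- Required once any group search properties are configured: -->
<property name="Group Search Base">OU=Company,DC=company,DC=entp</property>
```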
Thank you, Matt
07-30-2025
07:17 AM
@justloseit NiFi Process Groups are just logical containers for processors; a Process Group does not itself run/execute. Selecting "Start" on a Process Group triggers starting all the components within it. In your case it sounds like you have set up cron scheduling on the ingest/source processor(s) within the Process Group. All processors downstream of that source should be set to run all the time, not on a cron schedule. So what you are really looking for is how long it took the processors within that Process Group to process all produced FlowFiles to the point of termination. Besides looking at the lineage data for each FlowFile that traverses all the processors in a Process Group, I can't think of another way to get that data. Take a look at the SiteToSiteProvenanceReportingTask available in Apache NiFi. It allows you to send the provenance data (which can be a lot, depending on the size of your dataflows and the number of FlowFiles being processed) via NiFi's Site-to-Site protocol to another NiFi instance (I would recommend a separate, dedicated NiFi to receive this data). You can then build a dataflow to process that data however you want, retain the information you need, or send it to an external storage/processing system. Thank you, Matt