About MattWho

MattWho · ‎04-21-2025

@Shrink Not sure why you would want to "control processors" within dataflows in NiFi. This is not typically a good design choice.

From the image shared, I see the NiFi URL is "http" and not "https". If you have your NiFi set up unsecured (HTTP), no NiFi authentication is going to be used, even if it is set up in the NiFi core configuration files. If your NiFi were secured (HTTPS), you would still need a StandardRestrictedSSLContextService to at least provide a truststore containing the complete trust chain for the ServerAuth certificate in the keystore configured in nifi.properties. That is what establishes the 1-way TLS connection before you can be redirected to the oauth2 endpoint to get a token.

The other issue is that you are not fetching a token from your oauth2 provider, but rather trying to fetch a token from whichever NiFi login provider you have configured. That is because the "rest-api/access/token" endpoint is used for the NiFi login providers (ldap-provider or kerberos-provider).

When I use the InvokeHTTP processor to access NiFi's rest-api, I always use mutualTLS authentication. I find it the easiest because you don't need to worry about managing tokens or fetching new tokens each time they expire. To support mutualTLS authentication, a StandardRestrictedSSLContextService needs to be used in the InvokeHTTP processor and your NiFi needs to be secured (HTTPS). Your secured NiFi will have a keystore and truststore set up in the nifi.properties file. NiFi out-of-the-box will generate generic self-signed keystore and truststore files for you. I strongly encourage you to use properly signed certificates in production. The simplest approach is to use the same keystore and truststore from nifi.properties in the StandardRestrictedSSLContextService you'll use with the InvokeHTTP processor.

You'll need to make sure that the NiFi ClientAuth certificate DN from the keystore is properly authorized for the NiFi rest-api endpoints you want to use. This of course also means you are NOT using the out-of-the-box single-user-authorizer and are instead using an authorizer that allows you to manage user authorizations manually, like the StandardManagedAuthorizer.

NOTE: The Apache NiFi 2.0.0 releases were tech preview. NiFi 2.1+ are the official releases of the new 2.x line. 2.0.0-M4 was the last TP release, so it is going to be pretty close to the first GA release.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
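
To make the mutualTLS idea above concrete outside of InvokeHTTP, here is a minimal Python sketch of the same kind of call. It assumes the NiFi keystore and truststore have been exported to PEM files (the file names, hostname, and port are hypothetical), and it uses the rest-api "flow/current-user" endpoint only as a convenient way to confirm which identity NiFi sees for the client certificate.

```python
import ssl
import urllib.request


def api_url(base_url: str, path: str) -> str:
    """Join the NiFi base URL and a rest-api path; refuse plain HTTP."""
    if not base_url.lower().startswith("https://"):
        raise ValueError("mutualTLS requires a secured (HTTPS) NiFi")
    return base_url.rstrip("/") + "/nifi-api/" + path.lstrip("/")


def mtls_context(ca_pem: str, cert_pem: str, key_pem: str) -> ssl.SSLContext:
    """Trust the NiFi server chain and present a ClientAuth certificate.

    ca_pem, cert_pem and key_pem are hypothetical PEM exports of the
    truststore and keystore referenced in nifi.properties.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_pem)
    ctx.load_cert_chain(certfile=cert_pem, keyfile=key_pem)
    return ctx


def current_user(base_url: str, ctx: ssl.SSLContext) -> bytes:
    """GET flow/current-user, authenticated by the client certificate."""
    req = urllib.request.Request(api_url(base_url, "flow/current-user"))
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        return resp.read()
```

A call would look like current_user("https://nifi.example.com:8443", mtls_context("ca.pem", "client.pem", "client.key")). No token is fetched or refreshed anywhere, which is the point of the approach. The certificate DN still has to be authorized in NiFi for the endpoint being called.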

MattWho · ‎04-15-2025

@nifier Not much insight I can offer here without you sharing the full nifi-api request being made. Are you using curl? How are you passing your user authentication in the rest-api call? (The exception you shared points at an authentication issue) Thank you, Matt

MattWho · ‎04-11-2025

@nifier NiFi is flow-based programming, so tuning is directly related to the dataflows you build and the volumes of data you process. Optimal values come from testing your "program" (dataflow). Aside from the settings in NiFi, you have to consider any other programs running on the same server as NiFi, as they will also consume CPU resources.

"Max Timer Driven Thread Count": This is the thread pool for all the timer driven/cron driven components you add to your NiFi canvas. The general guidance here is to start with 4 x the number of cores. So for you that would be 16. Then you'll need to test/monitor your server CPU load average while your dataflow(s) are running under expected loads. If your load average is very close to or exceeds your core count, you'll need to back off on the size of your thread pool.

"Concurrent Tasks": This is a configurable setting on processor components that allows multiple concurrent executions of the processor. It works in conjunction with the run schedule set on the processor. When a processor is "scheduled", it needs a thread from the thread pool. Assuming concurrent tasks is set higher than the default of 1, if there is still "work" to do at the next scheduled run time and the previous concurrent task is still executing or pending execution, the additional concurrent task allows another concurrent execution to get scheduled. Scheduled is different from executing. You can have a bunch of scheduled tasks waiting for an available thread from the thread pool in order to execute. With most processors, execution takes milliseconds, so working through the pool of scheduled processors is fast and efficient.

The general guidance with "Concurrent Tasks" is to start with the default of 1 concurrent task, monitor your dataflows, and adjust in increments of 1 only where needed. Dataflow developers tend to make the mistake of setting some larger value from the start, which is a bad idea.

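
The sizing guidance above reduces to a little arithmetic. The following is a small Python sketch of it, not anything NiFi itself runs. The 0.9 margin used for "very close to" the core count is an assumption chosen for illustration, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def starting_thread_pool(cores: int) -> int:
    """Starting point for Max Timer Driven Thread Count: 4 x cores."""
    return 4 * cores


def should_back_off(load_average: float, cores: int, margin: float = 0.9) -> bool:
    """True when the load average is close to or above the core count.

    margin is a hypothetical threshold for "very close to"; tune it
    to your own comfort level.
    """
    return load_average >= margin * cores


def report() -> str:
    """Summarise the current host (Unix-like systems only)."""
    cores = os.cpu_count() or 1
    one_minute_load = os.getloadavg()[0]
    verdict = "back off" if should_back_off(one_minute_load, cores) else "ok"
    return f"cores={cores} pool={starting_thread_pool(cores)} load={one_minute_load:.2f} {verdict}"
```

With 4 cores, starting_thread_pool(4) gives 16, matching the figure above. The load check is meant to be run repeatedly while the dataflows are under expected load, not once.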
You'll want to look at your dataflows under load and see which processor furthest down your dataflow path is developing an ever-increasing backlog of FlowFiles on its inbound connection. A growing backlog on a connection will eventually trigger backpressure controls to kick in (the connection turns red). Once backpressure kicks in, the upstream processor feeding that connection will no longer be allowed to schedule until the backpressure on the downstream connection is lifted. So a processor that is blocked will start queuing FlowFiles upstream from it. This can lead to backpressure all the way to the start of the dataflow. So do NOT simply adjust concurrent tasks on all these processors. Instead, only increase concurrent tasks on the one furthest down the dataflow to unblock backpressure there, which will naturally allow the upstream processors to get scheduled again.

The danger of setting large concurrent task values is that you end up with a lot more scheduled tasks all waiting for CPU time. If you set concurrent tasks high on a CPU-intensive processor, that processor may take all your CPU, preventing other processors from getting an opportunity to execute for long periods of time.

NOTE: The embedded documentation for each NiFi processor has a resource considerations section that will highlight whether the processor has MEMORY or CPU resource considerations.

Thank you, Matt
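
The "furthest down the dataflow" rule above can be sketched in a few lines of Python. This is only a model: the processor names and queue sizes in the usage note are made up for illustration, and the samples would in practice come from watching each processor's inbound connection in the NiFi UI over time.

```python
from typing import Optional, Sequence, Tuple


def is_growing(samples: Sequence[int]) -> bool:
    """True when every queue-size sample is larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def processor_to_tune(path: Sequence[Tuple[str, Sequence[int]]]) -> Optional[str]:
    """Pick the processor furthest downstream with a growing inbound backlog.

    path lists (processor name, inbound queue-size samples) in dataflow
    order, from the start of the flow to the end.
    """
    for name, samples in reversed(path):
        if is_growing(samples):
            return name
    return None
```

For a hypothetical path [("FetchFile", [100, 400, 900]), ("ConvertRecord", [50, 300, 800]), ("PutDatabaseRecord", [10, 200, 600]), ("LogAttribute", [0, 0, 0])], the function returns "PutDatabaseRecord". The backlogs in front of FetchFile and ConvertRecord are treated as a consequence of backpressure, not as separate problems, so only that one processor gets its concurrent tasks raised by 1.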

MattWho · ‎04-11-2025

@nifier I am not seeing the same behavior you have reported, which makes me think your extracted.path is not really 0 bytes. I suggest adding another dynamic property to your UpdateAttribute processor to output the calculated length to verify.

Here is what I see after my UpdateAttribute processor (newdir is set using your shared NiFi Expression Language (NEL) statement). I then created extracted.path with a single white space and saw what you describe, but that whitespace is a byte, so that is the expected output.

If it turns out your problem is because you have white space in the extracted.path attribute, you could modify your NEL statement as follows:

    ${extracted.path:trim():length():gt(0):ifElse(${extracted.path:append('/'):append(${filename})},${filename})}

You use the trim() NEL function to remove leading or trailing whitespace from extracted.path before calculating the length. This includes trimming of a line return. With that change I see the desired output even though the extracted.path length is not 0, because the white space or line return is trimmed first.

Thank you, Matt
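
To make the logic of that expression easier to follow, here is a small Python function that mirrors it step by step. It is an emulation for reasoning and testing only, not NiFi code. The function name is hypothetical, and Python's strip() is used as a rough stand-in for trim(), so the exact set of characters treated as whitespace may differ slightly.

```python
def new_dir(extracted_path: str, filename: str) -> str:
    """Mirror trim():length():gt(0):ifElse(path + '/' + filename, filename)."""
    trimmed = extracted_path.strip()  # stand-in for trim()
    if len(trimmed) > 0:              # length():gt(0)
        # As in the expression, the original (untrimmed) value is appended to.
        return extracted_path + "/" + filename
    return filename
```

An empty string, a single space, and a lone line return all fall through to the bare filename, while a real path such as "in/2025" produces "in/2025/a.txt" for the filename "a.txt".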

MattWho · ‎04-10-2025

@sha257 Reviewing all your configuration files is not the first step here. I suspect the mutualTLS exchange between your NiFi and NiFi-Registry is not successful, resulting in that connection being established as the "anonymous" user (the user you see in the upper right corner of the NiFi-Registry UI before you click login to access as a different authenticated user).

Put the following class in DEBUG on your NiFi-Registry via the logback.xml:

    org.apache.nifi.registry.web.security.authentication

Start tailing the nifi-registry-app.log. You'll start seeing some DEBUG log lines (it will be noisy). Then attempt to start version control on some process group in NiFi, which will open the version control UI in NiFi. In the nifi-registry-app.log at that moment in time you will see one of two things:

    2025-04-10 13:54:53,360 DEBUG org.apache.nifi.registry.web.security.authentication.IdentityFilter: Attempting to extract user credentials using X509IdentityProvider
    2025-04-10 13:54:53,361 DEBUG org.apache.nifi.registry.web.security.authentication.IdentityFilter: Adding credentials claim to SecurityContext to be authenticated. Credentials extracted by X509IdentityProvider: AuthenticationRequest{username='<NIFI certificate DN>', credentials=[PROTECTED], details=org.apache.nifi.registry.web.security.authentication.x509.X509AuthenticationRequestDetails@713e0007}

The above tells you NiFi presented a trusted clientAuth certificate, and you will see that certificate's DN. In this case make sure that DN exists as a user in NiFi-Registry (case sensitive) and give that user read on "Can manage Buckets" and read, write, delete on "Can proxy user requests".

Or you'll see:

    2025-04-10 14:01:24,162 DEBUG org.apache.nifi.registry.web.security.authentication.IdentityFilter: Attempting to extract user credentials using X509IdentityProvider
    2025-04-10 14:01:24,162 DEBUG org.apache.nifi.registry.web.security.authentication.x509.X509CertificateExtractor: No client certificate found in request.
    2025-04-10 14:01:24,162 DEBUG org.apache.nifi.registry.web.security.authentication.IdentityFilter: Attempting to extract user credentials using JwtIdentityProvider
    2025-04-10 14:01:24,163 DEBUG org.apache.nifi.registry.web.security.authentication.AnonymousIdentityFilter: Set SecurityContextHolder to anonymous SecurityContext

The above tells you the mutualTLS exchange was not successful and the connection was established as the "anonymous" user. In this case you need to address your certificate issue so that mutualTLS can be successful.

Thank you, Matt
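
When the DEBUG output is too noisy to read by eye, the two outcomes described above can be told apart with a short script. This Python sketch only matches the message fragments quoted in the log lines above. It assumes it is given just the lines captured around the moment version control was attempted, not the whole log.

```python
from typing import Iterable

# Message fragments taken from the DEBUG lines quoted above.
X509_OK = "Credentials extracted by X509IdentityProvider"
NO_CLIENT_CERT = "No client certificate found in request"
ANONYMOUS = "Set SecurityContextHolder to anonymous SecurityContext"


def classify(lines: Iterable[str]) -> str:
    """Return 'mutual-tls', 'anonymous', or 'unknown' for a log excerpt."""
    outcome = "unknown"
    for line in lines:
        if X509_OK in line:
            return "mutual-tls"
        if NO_CLIENT_CERT in line or ANONYMOUS in line:
            outcome = "anonymous"
    return outcome
```

A result of "mutual-tls" points at authorizations as the next thing to check, while "anonymous" points back at the certificates.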

MattWho · ‎04-09-2025

@sha257 You already shared your authorizations.xml file, which shows where you are lacking the needed NiFi node certificate authorization for proxying user requests.

NiFi will attempt to connect to NiFi-Registry using the NifiRegistryFlowRegistryClient set up in NiFi. The NifiRegistryFlowRegistryClient has two configuration properties. The URL is set to "https://<NiFi-Registry Hostname>:<NiFi-Registry-port>/". The SSL Context Service, if NOT set, will use the keystore and truststore set up in the nifi.properties file.

The keystore and truststore used (whether defined via an SSL Context Service or taken from the nifi.properties file) MUST be capable of successfully negotiating a mutualTLS exchange between the NifiRegistryFlowRegistryClient and NiFi-Registry. This means the ClientAuth PrivateKeyEntry in the client's keystore must be trusted by the truststore configured in the nifi-registry.properties file. This also means the PrivateKeyEntry in the NiFi-Registry keystore file must be trusted by the truststore used by the NifiRegistryFlowRegistryClient. If this mutualTLS exchange is not successful, the connection with your NiFi-Registry will be as the "anonymous" user identity.

You could use openssl to see whether a mutualTLS connection would be possible by looking at the output from both of the following:

    openssl s_client -connect <nifi-registry-hostname>:<nifi-registry-port> -showcerts
    openssl s_client -connect <nifi-hostname>:<nifi-port> -showcerts

The above assumes your NifiRegistryFlowRegistryClient is using the same keystore and truststore used in the nifi.properties file. Alternatively, you could get the verbose listing from the keystore and truststore configured for your NifiRegistryFlowRegistryClient and compare those with the openssl output from NiFi-Registry to validate whether a mutualTLS connection is possible.

------------------------------

Assuming the above is all good, you then need to verify that the NifiRegistryFlowRegistryClient client identity (derived from the PrivateKeyEntry DN in the NifiRegistryFlowRegistryClient keystore) has the proper authorizations set up in NiFi-Registry. That DN must be authorized in NiFi-Registry for the following policies:

    "Can manage buckets" (Read)
    "Can proxy user requests" (Read, Write, Delete)

The above allows the NifiRegistryFlowRegistryClient to see all the NiFi-Registry buckets and to make Read, Write, or Delete requests on behalf of the user identity authenticated in NiFi (the user string shown in the upper right corner of the NiFi UI).

Once the above is all good, we move on to what permissions the NiFi user that is being proxied needs in NiFi-Registry. The user identity needs to be authorized on each bucket from which, or to which, flow definitions will be read or written.

Here is a sample public bucket ("Make publicly visible" is checked): Since this bucket has "Make publicly visible" checked, any user identity and anonymous users can read all flow definitions added to this bucket (this is what allowed you to import a flow definition from your public bucket to your NiFi canvas). You'll also notice that I have authorized my "nifiadmin" user identity for read, write, delete on this bucket. This allows my "nifiadmin" user identity to add or update (write) flow definitions in this bucket and delete flow definitions from this bucket.

From what you shared via the authorizations.xml, you have authorized the "abc123" user identity for read, write, delete on your public bucket. So the only thing to verify here is that the "abc123" user identity is the exact same case-sensitive user identity displayed in the upper right corner of your NiFi UI when you are authenticated into your NiFi. If they don't match, they are treated as different users. It is the user identity shown in your NiFi that is being proxied and needs to be properly authorized in NiFi-Registry, as in the above example. (I know "abc123" is an example, but you get the idea here.)

Thank you, Matt
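
The authorization checklist above can be written down as a tiny Python sketch. It is a bookkeeping aid, not a NiFi-Registry API: the policy names are the ones quoted above, and the granted permissions would be read off the NiFi-Registry UI by hand.

```python
from typing import Dict, Iterable, Set

# Policies the NifiRegistryFlowRegistryClient identity needs, as listed above.
REQUIRED: Dict[str, Set[str]] = {
    "Can manage buckets": {"read"},
    "Can proxy user requests": {"read", "write", "delete"},
}


def missing_grants(granted: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Return the actions still missing for each required policy."""
    missing: Dict[str, Set[str]] = {}
    for policy, actions in REQUIRED.items():
        lacking = actions - set(granted.get(policy, ()))
        if lacking:
            missing[policy] = lacking
    return missing


def same_user(nifi_identity: str, registry_identity: str) -> bool:
    """User identities are compared exactly, including case."""
    return nifi_identity == registry_identity
```

For example, a client identity that only has read on "Can manage buckets" comes back with all three proxy actions missing, and same_user("abc123", "ABC123") is False because the two strings are treated as different users.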

MattWho · ‎04-08-2025

@sha257 It appears from the authorizations.xml file you shared that your issue is also affected by your NiFi node(s) not being authorized for /proxy (Read, Write, Delete) in NiFi-Registry, which I mentioned in my earlier response. I see only this user identifier:

    71b266f5-7764-3ff5-a812-80112278b50c

which, from your users.xml, is your "abc123" user identity.

When the NiFi node attempts to proxy a request on behalf of the user identity authenticated in NiFi, the NiFi node's clientAuth certificate is passed in the connection to NiFi-Registry to authenticate the node, and the node's identity is then checked for the proxy authorization. If that mutualTLS exchange is not successful, the node connects as anonymous (which only has read on public buckets).

There are multiple layers here. When you set up the NiFi Flow Registry client in NiFi, it gives you the option to define a StandardSSLContextService. The keystore and truststore defined there are used in that mutualTLS exchange. When you don't define a StandardSSLContextService, NiFi will default to using the keystore and truststore defined in the nifi.properties file.

Thank you, Matt
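
Checking which identifiers hold the /proxy policies can also be scripted. The Python sketch below assumes the authorizations.xml uses policy elements with "resource" and "action" attributes (R, W, D) that contain user elements with an "identifier" attribute. That layout, the policy identifiers, and the node identifier in the usage note are assumptions for illustration, so compare them against your own file before relying on the result.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Set

# Hypothetical excerpt; only the user identifier comes from the post above.
SAMPLE = """
<authorizations>
  <policies>
    <policy identifier="hypothetical-1" resource="/proxy" action="R">
      <user identifier="71b266f5-7764-3ff5-a812-80112278b50c"/>
    </policy>
    <policy identifier="hypothetical-2" resource="/proxy" action="W">
      <user identifier="71b266f5-7764-3ff5-a812-80112278b50c"/>
    </policy>
    <policy identifier="hypothetical-3" resource="/proxy" action="D">
      <user identifier="71b266f5-7764-3ff5-a812-80112278b50c"/>
    </policy>
  </policies>
</authorizations>
"""


def actions_on(xml_text: str, resource: str) -> Dict[str, Set[str]]:
    """Map each user identifier to the actions it holds on a resource."""
    granted: Dict[str, Set[str]] = {}
    for policy in ET.fromstring(xml_text.strip()).iter("policy"):
        if policy.get("resource") != resource:
            continue
        for user in policy.findall("user"):
            granted.setdefault(user.get("identifier"), set()).add(policy.get("action"))
    return granted


def missing_proxy_actions(xml_text: str, user_id: str) -> Set[str]:
    """Return which of R, W, D the identifier lacks on /proxy."""
    return {"R", "W", "D"} - actions_on(xml_text, "/proxy").get(user_id, set())
```

With this sample, the "abc123" identifier is missing nothing, while any NiFi node identifier (for example a made-up "hypothetical-node-id") is missing R, W, and D. That is the situation described above: the user is listed on /proxy but the node is not.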

MattWho · ‎04-08-2025

@nifier The NiFi EL you are using is valid and works. So this raises the question of where you are trying to use it. What version of Apache NiFi are you using? Which NiFi processor are you using this in? Which processor property are you using it in?

NOTE: Make sure the processor property supports NiFi EL. You can't use EL in every property. I validated your EL using the UpdateAttribute processor.

I do have another question about your ifElse: why are you appending a filename to the path? A more typical approach would be to simply use ${extracted.path}/${filename} in the processor that writes the file out to its destination. If extracted.path is empty or does not exist, it returns nothing.

Thank you, Matt

MattWho · ‎04-07-2025

@Fanxxx You have what sounds like a rather complex use case here, with numerous outputs, timing controls, and routing requirements. ControlRate is very basic in nature (allow X FlowFiles to pass every Y amount of time). Depending on the volume of FlowFiles, this can lead to a backlog that results in most requests failing your 5 second requirement, including new FlowFiles that end up delayed more than 5 seconds because they are still queued up behind other FlowFiles waiting at your ControlRate.

Cloudera offers Professional Services to its licensed users that can help design and implement complex use cases. Assisting you through the community would require considerable back and forth and exchange of information, including test files, etc.

Thank you, Matt
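
The backlog effect described above is easy to see with a simplified model. The Python sketch below assumes a burst of FlowFiles is already queued at time 0 and that a fixed number is released at the start of each period. This is a deliberate simplification for illustration, not how ControlRate is implemented, and the 5 second deadline is the requirement mentioned above.

```python
def wait_seconds(position: int, per_period: int, period_s: float) -> float:
    """Wait for the FlowFile at a 0-based queue position before release."""
    return (position // per_period) * period_s


def late_count(queued: int, per_period: int, period_s: float,
               deadline_s: float = 5.0) -> int:
    """Count queued FlowFiles that are released after the deadline."""
    return sum(
        1 for position in range(queued)
        if wait_seconds(position, per_period, period_s) > deadline_s
    )
```

With 100 FlowFiles queued and 10 released per second, late_count(100, 10, 1.0) is 40: everything from position 60 onward waits 6 seconds or more. Anything arriving behind that backlog waits even longer, which is why a simple rate limit alone does not satisfy a per-request deadline.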

MattWho · ‎04-07-2025

@nifier Unfortunately not. When the StandardPGPPublicKeyService controller service is enabled, it loads the keyring into heap memory. Only disabling (stopping) it will allow you to edit the "Keyring" or allow it to load an updated keyring from the "Keyring File". Likewise, any component that has been configured to use this StandardPGPPublicKeyService must be stopped whenever the controller service is disabled, because a dependency exists between the two: the dependent components are no longer "Valid" and able to run while the controller service is disabled. Stopping and starting the controller service gives you the option to start all the dependent processors using it at the same time.

You could raise an Apache NiFi Jira (https://issues.apache.org/jira/browse/NIFI) with a new feature request for the StandardPGPPublicKeyService controller service, perhaps asking for the ability to update a Keyring File while enabled and to specify a re-read interval for reading the Keyring File.

Thank you, Matt

Member Since	‎07-30-2019 10:41 AM
Last Visited	‎11-08-2025 02:52 AM
Posts	3,387
Kudos received	1613

Cloudera Community

Re: How to achieve inheritence within Parameter Co...

Re: using nifi as a kafka streaming- real-time str...

Re: using nifi as a kafka streaming- real-time str...

Re: Nifi Registry and LDAP

Re: NiFi logs not rolling over on Windows

Re: how to use StandardOauth2AccessTokenProvider 2...

Re: ListSFTP auth failure

Re: Concurrent task and Max Timer Driven Thread Co...

Re: Escaping forward slash

Re: Nifi : Failed to register with Flow Registry d...

Re: Nifi : Failed to register with Flow Registry d...

Re: Nifi : Failed to register with Flow Registry d...

Re: Escaping forward slash

Re: Throttling In Apache NIFI

Re: PGP encryptio/decryption