Member since: 07-30-2019
Posts: 2922
Kudos Received: 1451
Solutions: 850
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 165 | 05-06-2024 10:40 AM |
| | 81 | 05-03-2024 08:41 AM |
| | 198 | 04-26-2024 06:40 AM |
| | 274 | 04-25-2024 06:16 AM |
| | 627 | 04-23-2024 05:56 AM |
04-03-2024
05:59 AM
1 Kudo
@Adhitya The more detail you can provide, the better assistance we can offer here.

1. Can you share the exact HDF, CFM, or Apache NiFi version?
2. Have you searched your NiFi flow configuration history (NiFi UI --> global menu) for that component ID to see what history exists for its creation and removal?
3. Have you ever manually edited the flow.xml.gz or flow.json.gz files?

While I cannot explain from what has been shared how you got into this state, there are steps to get out of it. When NiFi is started, it loads the dataflow(s) from the flow.xml.gz or flow.json.gz depending on your NiFi version. (Basically, if the flow.json.gz exists, NiFi uses it during startup, even though NiFi still writes out the legacy flow.xml.gz as well.) So you could shut down your NiFi and search the flow.xml.gz/flow.json.gz for all occurrences of the component ID from the shared exception, in case other components, and not just this DistributedMapCacheClientService, still reference it. Then delete that reference ID. I recommend making a backup of the flow.xml.gz and flow.json.gz before any attempt to manually edit these files. Then start your NiFi using the modified flow.xml.gz/flow.json.gz file.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
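The search step above can be sketched like this (a minimal sketch, not NiFi code: the component ID, JSON layout, and file location here are made up so the example is self-contained; point it at your real flow.json.gz and the ID from your exception, after backing the file up):

```python
import gzip
import os
import shutil
import tempfile

# Hypothetical values for illustration only -- substitute the real path of
# your flow.json.gz and the component ID from your stack trace.
component_id = "abcd1234-0000-1000-8000-000000000000"
workdir = tempfile.mkdtemp()
flow_path = os.path.join(workdir, "flow.json.gz")

# A tiny stand-in flow.json.gz so this sketch is runnable on its own.
with gzip.open(flow_path, "wt") as f:
    f.write('{"controllerServices": [{"identifier": "%s"}]}' % component_id)

# Always back up before any manual edit.
shutil.copyfile(flow_path, flow_path + ".bak")

# Count every reference to the component ID; each occurrence must be dealt
# with, not just the one named in the exception.
with gzip.open(flow_path, "rt") as f:
    occurrences = f.read().count(component_id)
print("references found:", occurrences)
```

With a real flow.json.gz, a count above one tells you other components still reference the ID and also need cleanup before restart.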
04-02-2024
01:22 PM
@edim2525 The entire dataflow(s) reside in NiFi heap memory and are persisted to disk in the flow.json.gz file. This includes the current set state for each component. Every time a change is made to the NiFi canvas, the currently persisted flow.json.gz is moved to archive and a new flow.json.gz is written. When NiFi is started, it loads the flow.json.gz into NiFi heap memory and sets each component to the state recorded in the flow.json.gz that is loaded.

The only time this is not true is when the nifi.properties property "nifi.flowcontroller.autoResumeState" has been set to false. When set to false, all components recorded as "RUNNING" in the flow.json.gz will load into heap as "STOPPED". NiFi will then archive the current flow.json.gz and write a new flow.json.gz with those new "STOPPED" states.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
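For reference, the relevant entry in nifi.properties looks like this (true is the default; setting it to false loads every component in a STOPPED state on restart):

```
# nifi.properties
# true (default): restore each component's recorded state on startup
# false: load all components as STOPPED regardless of their recorded state
nifi.flowcontroller.autoResumeState=false
```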
04-02-2024
06:16 AM
@EddyChan The out-of-box Apache NiFi self-signed certificate generation was added to make it easy for first-time users to experiment with a secure NiFi instance. Just like the single-user authentication and single-user authorizer, these were not intended for long-term or production use cases. There is no configuration option to extend the lifetime.

For long-term use or production setups, you should be generating your own signed certificates for use in your NiFi (preferably signed by a trusted authority rather than being self-signed). Your options include:
- Use the NiFi TLS toolkit, still available in the Apache NiFi 1.x releases, to generate your own certificates for the keystore and truststore.
- Generate your own following the guidelines for NiFi certificates: Security Configuration
- Use a free online service to generate certificates.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
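If you go the TLS toolkit route, the standalone-mode invocation looks roughly like this (a sketch only: the hostname and output directory are placeholders, and you should confirm the flags against `tls-toolkit.sh standalone --help` in your toolkit version):

```shell
# Standalone mode: generates a CA, then a keystore/truststore per hostname.
# 'nifi01.example.com' and './target' are placeholders for illustration.
./bin/tls-toolkit.sh standalone -n 'nifi01.example.com' -o ./target
```

The generated keystore/truststore and nifi.properties values under the output directory can then be copied into your NiFi conf directory.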
04-01-2024
07:19 AM
2 Kudos
@s198 In your use case you could probably handle this without needing a sub process group using FlowFile concurrency, since the GetHDFSFileInfo processor sets a number of attributes on the FlowFiles it produces that you can use to correlate the various FlowFiles to one another.

Since you have already written out all your individual HDFS files to your SFTP processor, you could remove all content from the FlowFiles using ModifyBytes (no sense in wasting CPU to merge content if you don't need to keep it anymore), so all you have are zero-byte FlowFiles with attribute data. Then feed that stream of zero-byte FlowFiles to a MergeContent processor. I would configure your MergeContent as below (assume defaults for any property not mentioned here):
- Correlation Attribute Name = hdfs.path
- Minimum Number of Entries = <set to a value higher than you would expect to be listed in any HDFS directory>
- Maximum Number of Entries = <set to a value larger than the above>
- Max Bin Age = <set to a value high enough to allow all files to reach this processor before this time expires>

What the above will do is place all FlowFiles that have the same value in their hdfs.path FlowFile attribute into the same MergeContent bin. The Minimum Number of Entries will prevent the MergeContent processor from merging the bin until this value is reached or until the Max Bin Age expires. The bin age starts as soon as the first FlowFile is allocated to the bin. So basically, since we don't know how many files might be in any given HDFS directory, we are controlling the merge via bin age instead of number of FlowFiles. This builds some delay into your dataflow, but results in one FlowFile output for each HDFS directory listed out and fetched. You can then take that one zero-byte merged FlowFile and use it to complete your single-FlowFile downstream processing of job completion.

While the above would work in ideal conditions, you should always design with the possibility of failure in mind. I would still recommend placing every component from "GetHDFSFileInfo --> RouteOnAttribute --> UpdateAttribute --> FetchHDFS --> PutSFTP --> ModifyBytes --> UpdateAttribute" inside a process group configured with "FlowFile Concurrency = Single FlowFile Per Node" and "Outbound Policy = Batch Output". This would allow you to make sure that all fetched FlowFiles are successfully processed (written to the SFTP server) before any are output from the process group to the ModifyBytes and MergeContent processors. You never know when an issue may prevent or slow writing to the SFTP server. This setup allows you to more easily handle those failures and ensure any retries or errors are handled before exiting that PG and completing your job. It also allows you to set a much shorter Max Bin Age in your MergeContent processor, since all FlowFiles in that batch will have been fully processed before any are released, so they will all reach MergeContent at about the same time.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
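The binning behavior described above can be sketched outside NiFi (a toy model only, not NiFi code: correlation-based bins with a min-entries threshold and a bin-age timeout; the paths and thresholds are made up):

```python
from collections import defaultdict

# Toy flowfiles, represented by their hdfs.path attribute and arrival time.
flowfiles = [
    ("/data/dir_a", 0.0),
    ("/data/dir_b", 0.1),
    ("/data/dir_a", 0.2),
]

MIN_ENTRIES = 2    # like MergeContent's Minimum Number of Entries
MAX_BIN_AGE = 1.0  # seconds, like MergeContent's Max Bin Age

bins = defaultdict(list)  # correlation value -> flowfiles in the bin
bin_started = {}          # correlation value -> age-clock start time

def offer(path, now):
    """Place a flowfile in its correlation bin; merge when a threshold is met."""
    if path not in bin_started:
        bin_started[path] = now  # bin age starts with the first entry
    bins[path].append(path)
    full = len(bins[path]) >= MIN_ENTRIES
    expired = now - bin_started[path] >= MAX_BIN_AGE
    if full or expired:
        del bin_started[path]
        return bins.pop(path)  # one merged flowfile per directory
    return None

merged_a = None
for path, t in flowfiles:
    result = offer(path, t)
    if result:
        merged_a = result

# dir_a reached MIN_ENTRIES and merged; dir_b is still waiting on bin age.
print(merged_a, list(bins))
```

As in the real processor, a bin is released either by filling up or by aging out, whichever happens first.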
04-01-2024
06:48 AM
1 Kudo
@ALWOSABY This looks related to the driver version you may be using in the processor. Trying a different driver version may resolve your issue; perhaps try ojdbc6 version 11.1.0.7.0?

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
03-29-2024
10:23 AM
1 Kudo
@jame1997 Not much to look at from the NiFi side. NiFi is writing to the network successfully, and some loss is then happening between NiFi and your syslog server. Resource usage affecting your NiFi would only slow down processing, not result in data loss within NiFi. So PutSyslog would successfully write all bytes to the network before passing the FlowFile to the "success" relationship. Using TCP, of course, would allow NiFi to confirm successful delivery, thus allowing NiFi to appropriately retry, or route to either the failure or success relationship.

You could look at the data rate NiFi is writing from PutSyslog by looking at the stats on the processor. Then maybe you could experiment with:
1. netstat -nsu to check for UDP packet loss.
2. Using a network monitoring tool.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
03-29-2024
07:13 AM
1 Kudo
@jame1997 My first question would be how you have your PutSyslog processor configured. Are you using TCP or UDP? If you are using UDP, there is not going to be any confirmed delivery; it is not a lossless protocol. TCP, by contrast, does have confirmed delivery, at the expense of speed.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
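The difference can be seen with plain sockets (a small sketch, not NiFi code; it probes for a local port with no listener so the demo is self-contained):

```python
import socket

# Find a local port with no listener by binding to port 0 and releasing it.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
port = probe.getsockname()[1]
probe.close()

# UDP: fire-and-forget. The send "succeeds" even though nothing is listening,
# which is exactly why PutSyslog over UDP cannot detect downstream loss.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = udp.sendto(b"<14>test syslog message", ("127.0.0.1", port))
udp.close()

# TCP: connection setup is acknowledged, so a missing listener surfaces as an
# error the sender can react to (retry, or route the FlowFile to failure).
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    tcp.connect(("127.0.0.1", port))
    delivered = True
except ConnectionRefusedError:
    delivered = False
finally:
    tcp.close()

print("UDP bytes handed to the network:", sent)
print("TCP connection confirmed:", delivered)
```

The UDP send reports success because the bytes left the local stack; only TCP's handshake tells the sender whether anyone was actually there.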
03-29-2024
07:06 AM
1 Kudo
@DeepakDonde https://issues.apache.org/jira/browse/NIFI-12513 does not mention the GetHTTP processor, so you could certainly try that processor to see if you experience the same issue. Downgrading would lose all improvements and bug fixes introduced in Apache NiFi 1.25. Otherwise, you could wait until 1.26, which contains the fix, is released.

The InvokeHTTP processor is part of the NiFi standard nar, which includes a lot of NiFi components. You could also try downloading just the 1.24.0 standard nar from the Maven central repository and adding it to the extensions folder of your 1.25.0 NiFi. This would make both the 1.24 and 1.25 versions of many components available in your NiFi. You could then use the 1.24 version of InvokeHTTP instead of the 1.25 version that has the issue. This would allow you to continue to use the 1.25 version of all other components. While I have added multiple versions of the same nar to my NiFi installations in the past, I have not done so with the standard nar. If you have issues, you can stop your NiFi, remove the added nar, and restart so things go back to the way they were.

https://mvnrepository.com/artifact/org.apache.nifi/nifi-standard-shared-nar/1.24.0
https://repo1.maven.org/maven2/org/apache/nifi/nifi-standard-shared-nar/1.24.0/nifi-standard-shared-nar-1.24.0.nar

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
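The download-and-drop-in step might look like this (a sketch under assumptions: /opt/nifi is a placeholder install location, and the URL is the one from the post above; adjust both for your environment):

```shell
# Fetch the 1.24.0 nar from Maven Central and place it in the extensions
# directory, which NiFi watches for additional nars.
# /opt/nifi is an assumed install path -- adjust for your environment.
cd /opt/nifi/extensions
curl -O https://repo1.maven.org/maven2/org/apache/nifi/nifi-standard-shared-nar/1.24.0/nifi-standard-shared-nar-1.24.0.nar
```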
03-29-2024
06:51 AM
@s198 NiFi has no ability to merge files remotely. NiFi would need to consume all the files (ListHDFS --> FetchHDFS), then merge the content of those FlowFiles (MergeContent or MergeRecord), then use UpdateAttribute to set the desired filename on the merged file, and finally write the merged file back to HDFS using the PutHDFS processor. If you are using a NiFi cluster, you would need to do all of this merging on one node of the cluster, since NiFi nodes can only execute against the FlowFiles present on that specific node.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
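The fetch, merge, rename, and write-back sequence amounts to something like this outside NiFi (a toy sketch where made-up local files stand in for the HDFS fetch and put steps):

```python
import os
import tempfile

# Stand-ins for the files FetchHDFS would pull; names and content are made up.
workdir = tempfile.mkdtemp()
parts = ["part-0000.txt", "part-0001.txt"]
for i, name in enumerate(parts):
    with open(os.path.join(workdir, name), "w") as f:
        f.write(f"record {i}\n")

# MergeContent step: concatenate the content of every fetched file, in order.
merged = ""
for name in parts:
    with open(os.path.join(workdir, name)) as f:
        merged += f.read()

# UpdateAttribute step: choose the desired filename for the merged result;
# the PutHDFS step would then write it back out (a plain local write here).
merged_name = "merged_output.txt"  # hypothetical target filename
with open(os.path.join(workdir, merged_name), "w") as f:
    f.write(merged)
print(merged)
```

Note the whole sequence runs in one place, which mirrors why a NiFi cluster must do the merge on a single node.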
03-28-2024
10:26 AM
3 Kudos
@C1082 That ERROR has nothing to do with the community question you asked about: the error while accessing the NiFi UI.
javax.net.ssl.SSLException: Connection reset

Fixing the ERROR logged by the DBCPConnectionPool controller service shared in your last post will not resolve your UI access issue. Are you still having issues accessing the NiFi UI? If not, try searching for the DBCPConnectionPool that is throwing this exception and verify its configuration and the driver it is configured to use. You can find this specific NiFi controller service by searching on its unique assigned ID: "8c23244e-6b42-38c5-aaf2-effc40ab1d4b". You'll want to make sure the driver still exists at the configured location and is owned by and accessible to the NiFi service user. Sharing the exact SQL DB version and the database driver currently in use would also help here. Was this controller service working before the AKS version upgrade?

Please help our community continue to thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
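A quick sanity check on the driver file itself can look like this (a sketch that creates a stand-in jar so it is self-contained; in practice substitute the path configured on the controller service and run it as the NiFi service user so the access check reflects that user's permissions):

```python
import os
import tempfile

# Stand-in jar so this sketch is runnable on its own; in practice, use the
# path set in the DBCPConnectionPool's driver location property instead.
fd, driver_path = tempfile.mkstemp(suffix=".jar")
os.close(fd)

exists = os.path.isfile(driver_path)
readable = os.access(driver_path, os.R_OK)  # reflects the invoking user's rights
print("exists:", exists, "readable:", readable)
```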