About MattWho

MattWho · ‎01-12-2023

@Absolute_Z You can use the NiFi rest-api to set the parameter context on a NiFi Process Group (PG) back to "No Parameter Context".

Example:

curl 'http://<nifi hostname>:<nifi port>/nifi-api/process-groups/<PG UUID>' -X 'PUT' -H 'Content-Type: application/json' --data-raw '{"revision":{"clientId":"a723bd7b-0185-1000-efb1-be534ab7a455","version":1},"disconnectedNodeAcknowledged":false,"component":{"id":"<PG UUID>","name":"<PG name>","comments":"","parameterContext":{"id":null},"flowfileConcurrency":"UNBOUNDED","flowfileOutboundPolicy":"STREAM_WHEN_AVAILABLE","defaultFlowFileExpiration":"0 sec","defaultBackPressureObjectThreshold":"10000","defaultBackPressureDataSizeThreshold":"1 GB"}}'

You'll notice that the data passed includes "parameterContext":{"id":null}, which clears the parameter context set on the specified Process Group.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
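The same request can be sketched in Python. This is only a sketch under assumptions: it assumes an unsecured NiFi reachable over plain HTTP (as in the curl example), it fetches the Process Group first so the current revision can be reused, and it sends a trimmed-down component body (id, name, parameterContext) rather than the full set of fields shown above, which may or may not be enough on every NiFi version. The function names are hypothetical.

```python
import json
import urllib.request


def build_clear_context_payload(pg_entity):
    """Build the PUT body that unsets the parameter context of a Process Group.

    pg_entity is the JSON document returned by
    GET /nifi-api/process-groups/<PG UUID>.
    """
    return {
        # Reuse the revision NiFi just reported so the update is not rejected as stale.
        "revision": pg_entity["revision"],
        "disconnectedNodeAcknowledged": False,
        "component": {
            "id": pg_entity["component"]["id"],
            "name": pg_entity["component"]["name"],
            # A null id maps to "No Parameter Context".
            "parameterContext": {"id": None},
        },
    }


def clear_parameter_context(base_url, pg_id):
    """Fetch the Process Group, then PUT it back with parameterContext.id = null."""
    url = f"{base_url}/nifi-api/process-groups/{pg_id}"
    with urllib.request.urlopen(url) as resp:
        entity = json.load(resp)
    body = json.dumps(build_clear_context_payload(entity)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling clear_parameter_context("http://<nifi hostname>:<nifi port>", "<PG UUID>") would issue the GET followed by the PUT. A secured NiFi would additionally need a client certificate or token, which this sketch does not cover.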

MattWho · ‎01-09-2023

@davehkd You are definitely having an issue with your embedded zookeeper. The shared exception indicates NiFi is not able to communicate with it. The KeeperLoss exception thrown by the ZK client in NiFi is most commonly seen when ZK does not have quorum, either because not enough nodes are part of the ZK cluster or because not all nodes are able to talk to one another.

A new NiFi with no flows to load should start very quickly (never 12+ hours). Yours is unable to connect to the zookeeper that is included with and started as part of the NiFi process startup.

1. I'd start with making sure all of your nodes can resolve the zookeeper hostnames (nificlient1, nificlient2, and nificlient3) to proper reachable IP addresses. If not, add these to the local hosts file on each server so this is possible.
2. I'd check to see if the zookeeper server is running on each host and listening on the configured ports. Make sure no other processes are using those ports and thus blocking zookeeper from being able to bind to them.
3. Make sure you don't have any firewalls blocking connections to the ZK ports.

Thank you, Matt
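Checks 1 through 3 can be roughed out with a short Python script run from each node. This is a sketch, not a diagnostic tool: the function name is hypothetical, and the port you pass in must be whatever client port your ZK is actually configured with (2181 is only the common default and is an assumption here).

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Report whether host resolves and whether a TCP connection to port succeeds."""
    result = {"host": host, "port": port, "resolved": None, "reachable": False}
    try:
        # Step 1: can this node resolve the hostname at all?
        result["resolved"] = socket.gethostbyname(host)
    except socket.gaierror:
        return result
    try:
        # Steps 2 and 3: is something listening, and is the path not blocked?
        with socket.create_connection((result["resolved"], port), timeout=timeout):
            result["reachable"] = True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        pass
    return result
```

For example, looping over ("nificlient1", "nificlient2", "nificlient3") and printing check_endpoint(name, 2181) on every server shows which hostnames fail to resolve and which ports cannot be reached. A refused connection points at ZK not listening; a timeout more often points at a firewall.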

MattWho · ‎01-05-2023

@RodolfoE Try using the absolute path to the certificate pem file rather than just the filename. If you are executing your curl command from within the directory where your pem file resides, try using "./<pem filename>". Otherwise curl may try looking up your pem filename as if it were an alias/nickname it expects to find in some NSS database.

Thank you, Matt

MattWho · ‎01-05-2023

@davehkd Those log lines look like expected output in the nifi-app.log during the startup process. Neither is an ERROR.

Are you using the embedded zookeeper or an external ZK (strongly recommended)?

The first indicates that ZK has not elected a cluster coordinator yet. This can happen if ZK has not finished coming up yet or does not yet have quorum. ZK should run on an odd number of hosts (3, 5, etc.) to achieve quorum, and without quorum it will not function. 3 is the recommended number of ZK hosts to support NiFi. Once ZK is up and has established quorum, I'd expect that WARN log message to go away.

The second INFO message simply means that this node was unaware of an elected cluster coordinator and has requested to be elected to that role. ZK responded that it had already elected some other node as the cluster coordinator. This node should receive the elected cluster coordinator from ZK, and you should then start seeing in the logs your nodes sending heartbeat messages to the elected cluster coordinator (even the elected cluster coordinator will send a heartbeat to itself). Only the cluster coordinator will log receiving and processing x number of received heartbeats.

My guess here is that you may not have given it enough time to fully launch. When you start NiFi via "../bin/nifi.sh start", it executes the bootstrap process, and the bootstrap process then kicks off the main child process for NiFi. You'll see that process through the nifi-app.log output as it progresses. NiFi is fully up once you see the log line that states the NiFi UI is available at the following URLs. Once the NiFi node is fully up, it attempts to communicate with ZK and establish itself as part of a cluster. Especially with embedded ZK in use, this can be delayed until all nodes are up so that ZK has quorum. So the first node to come up may log more lines like the above than the last node to finish startup.

NiFi handles election based on the configuration of these two properties in the nifi.properties file:

nifi.cluster.flow.election.max.wait.time (default is 5 mins)
nifi.cluster.flow.election.max.candidates (no default, but should be set to the number of NiFi instances in the cluster)

So basically, NiFi nodes will wait up to 5 minutes or until the configured number of candidates have connected with ZK before flow election happens and NiFi finishes coming up. Accessing the UI before this happens would show that flow election is still in progress.

Make sure that the "../conf/<nifi config files>" are all configured the same across all nodes, with the exception of node-specific properties like hostnames, keystores, truststores, etc.

I hope that after some additional time, your NiFi cluster did finally come up for you.

Thank you, Matt
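One way to check that nifi.properties matches across nodes is to diff the files while ignoring node-specific keys. The Python sketch below assumes you have copied each node's nifi.properties content into a string; the function names are hypothetical, and the ignore list is illustrative only, so extend it to match your own node-specific properties.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Illustrative set of properties that are expected to differ per node.
NODE_SPECIFIC_KEYS = {
    "nifi.web.http.host",
    "nifi.web.https.host",
    "nifi.cluster.node.address",
    "nifi.remote.input.host",
    "nifi.security.keystore",
    "nifi.security.keystorePasswd",
    "nifi.security.keyPasswd",
    "nifi.security.truststore",
    "nifi.security.truststorePasswd",
}


def diff_node_configs(a, b, ignore=NODE_SPECIFIC_KEYS):
    """Return {key: (value_on_a, value_on_b)} for every non-ignored mismatch."""
    keys = (set(a) | set(b)) - set(ignore)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}
```

A mismatch in nifi.cluster.flow.election.max.candidates or nifi.cluster.flow.election.max.wait.time would show up directly in the returned dictionary, while differing hostnames and keystore paths are left out.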

MattWho · ‎01-04-2023

@sarithe NiFi component processors are part of pluggable nars in NiFi. They are separate from the core NiFi code. Processors are designed to log their output based on their component processor class. It then becomes the responsibility of logback to route those log messages to the appropriate appender. There is nothing in the log output produced by a component processor that will inherently identify which parent process group it resides within. But if you were to use a consistent processor naming structure in each of your Process Groups (PG), you may be able to set up some creative filtering in logback based on that naming structure.

Bulletins, however, do include details about the parent Process Group in which the component generating the bulletin resides. You could build a dataflow in your NiFi to handle bulletin notification through the use of the SiteToSiteBulletinReportingTask, which is used to send bulletins to a remote input port on a target NiFi. A dataflow on the target NiFi could be built to parse the received bulletin records by the bulletinGroupName json path property so that all records from the same PG are kept together. These 'like' records could then be written out to the local filesystem or a remote system, used to send email notifications, etc.

Example of what a Bulletin sent using the SiteToSiteBulletinReportingTask looks like:

{
  "objectId" : "541dbd22-aa4b-4a1a-ad58-5d9a0b730e42",
  "platform" : "nifi",
  "bulletinId" : 2200,
  "bulletinCategory" : "Log Message",
  "bulletinGroupId" : "7e7ad459-0185-1000-ffff-ffff9e0b1503",
  "bulletinGroupName" : "PG2-Bulletin",
  "bulletinGroupPath" : "NiFi Flow / Matt's PG / PG2-Bulletin",
  "bulletinLevel" : "DEBUG",
  "bulletinMessage" : "UpdateAttribute[id=8c5b3806-9c3a-155b-ba15-260075ce9a6f] Updated attributes for StandardFlowFileRecord[uuid=1b0cb23a-75d8-4493-ba82-c6ea5c7d1ce3,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1672661850924-5, container=default, section=5], offset=969194, length=1024],offset=0,name=bulletin-${nextInt()).txt,size=1024]; transferring to 'success'",
  "bulletinNodeId" : "e75bf99f-095c-4672-be53-bb5510b3eb5c",
  "bulletinSourceId" : "8c5b3806-9c3a-155b-ba15-260075ce9a6f",
  "bulletinSourceName" : "PG1-UpdateAttribute",
  "bulletinSourceType" : "PROCESSOR",
  "bulletinTimestamp" : "2023-01-04T20:38:27.776Z"
}

Thank you, Matt
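Outside of NiFi, the "keep records from the same PG together" step amounts to grouping the bulletin records on bulletinGroupName. A minimal Python sketch of that idea, assuming the received content is a JSON array of bulletin objects shaped like the example above (the function name and the "<unknown>" fallback are hypothetical):

```python
import json
from collections import defaultdict


def group_by_process_group(bulletins_json):
    """Group bulletin records by the name of their parent Process Group."""
    groups = defaultdict(list)
    for bulletin in json.loads(bulletins_json):
        # Records without a group name are collected under a placeholder key.
        groups[bulletin.get("bulletinGroupName", "<unknown>")].append(bulletin)
    return dict(groups)
```

Inside NiFi itself the same grouping would normally be done with record-oriented processors rather than custom code; the sketch only illustrates what the grouping produces.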

MattWho · ‎01-03-2023

@davehkd The exception you have shared points at the following property being set to false in the nifi.properties file:

nifi.cluster.protocol.is.secure=false

NiFi nodes communicate with one another over HTTP when this is set to false. When set to true, NiFi nodes will communicate with one another over HTTPS. Since you have this set to false, it is complaining that you do not have your NiFi configured with an HTTP port in the following property in the nifi.properties file:

nifi.web.http.port

Out of the box, Apache NiFi is configured to start securely over https as a standalone NiFi instance using the Single-User authentication and single-user-authorizer providers:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#single_user_identity_provider

The intent of this provider is to provide a means for easily and quickly starting up a secure NiFi to become familiar with or evaluate NiFi. It gives that single user full access to everything and provides no mechanism for setting up and authorizing any additional users. When switching to a NiFi cluster, you'll need to set up proper authentication and authorization providers that support secure NiFi clusters.

In a secured NiFi cluster setup, the NiFi nodes will need to authenticate via their certificates over a mutual TLS handshake (unless set to be unsecure as you have it set up, which I strongly do not recommend). This in turn means that the NiFi cluster nodes will need to have authorizations set up for proxy, data access, and controller access, which the single-user-authorizer does not support.

Additionally, the single user identity-provider by default creates a random username and password on NiFi startup, which is going to be unique per node. This will not work in a cluster setup since actions performed on node 1 will be replicated to nodes 2 - x as the authenticated user of node 1. However, nodes 2 - x will not know anything about that user and will thus fail authorization. The single user authentication provider provides a mechanism for you to set a specific username and password, which you could make the same on all instances of NiFi:

./bin/nifi.sh set-single-user-credentials <username> <password>

My suggestion to you is to first set up a standalone NiFi securely using your own configuration for user authentication and user authorization.

For user authentication, follow this section of the admin guide:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication

The most commonly used method of user authentication is the ldap-provider:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#ldap_login_identity_provider

For NiFi authorizations, follow this section of the NiFi admin guide:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization

The most basic managed setup utilizes all of the following authorization providers, in the specific order below, in the authorizers.xml file:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#fileusergroupprovider
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#fileaccesspolicyprovider
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#standardmanagedauthorizer

These are actually in template format in the default authorizers.xml included with NiFi. They are likely commented out.

Once you have a secured standalone NiFi instance working, I would move on to setting up your NiFi cluster. You'll need to add your NiFi cluster nodes to the authorizers file-user-group-provider and file-access-policy-provider as part of that process, which would require you to remove the users.xml and authorizations.xml files generated by those providers so they get recreated to support the initial authorizations your cluster needs. These files are only generated by those providers if they do NOT already exist. Config changes in the providers will not trigger new or modified files.

I know there is a lot to take in here, but this will set you up in the best possible way for success.

Matt
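The relationship between the two properties named at the top of this reply can be expressed as a small check. This Python sketch only encodes the rule as described here (an unsecured cluster protocol needs nifi.web.http.port); it is not NiFi's own validation code, and the function name is hypothetical.

```python
def check_cluster_protocol(props):
    """Return a list of problems for the cluster protocol / HTTP port combination.

    props is a dict of nifi.properties keys to their string values.
    """
    problems = []
    secure = props.get("nifi.cluster.protocol.is.secure", "false").strip().lower() == "true"
    http_port = props.get("nifi.web.http.port", "").strip()
    if not secure and not http_port:
        # Unsecured node-to-node communication goes over HTTP, so a port is required.
        problems.append(
            "nifi.cluster.protocol.is.secure=false but nifi.web.http.port is not set"
        )
    return problems
```

An empty list means this particular combination is consistent; it says nothing about whether the rest of the cluster configuration is valid.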

MattWho · ‎12-21-2022

@samrathal

1. What is the purpose of the SplitJson in your dataflow?
2. If you have 1 FlowFile with 1000 records in it, why use SplitJson to split that into 1000 FlowFiles having 1 record each? Why not just merge the larger FlowFiles with multiple records in them? Or am I missing part of the use case here?

---

Can you share a template or flow definition of your dataflow?

1. It is not clear to me how you get "X-Total-Count" and how you are adding this FlowFile attribute to every FlowFile.
2. You have configured the "Release Signal Identifier" with a boolean NiFi Expression Language (NEL) statement that, using your example, will return "false" until the "fragment.count" FlowFile attribute value equals the FlowFile attribute "X-Total-Count" value.
2a. I assume you are writing "X-Total-Count" to every FlowFile coming out of the SplitJson? How are you incrementing the "fragment.count" across all FlowFiles in the complete 5600 record batch? Each FlowFile that splits into 1000 FlowFiles via SplitJson will have fragment.count set to 1 - 1000. So fragment.count would never reach 5600 unless you are handling this count somewhere else in your dataflow.
2b. For a FlowFile where the value of "fragment.count" actually equals the value of the "X-Total-Count" attribute, your "Release Signal Identifier" will resolve to "true". The "Release Signal Identifier" value (true or false) in your configuration is looked up in the configured distributed map cache server. So where in your dataflow do you write the release signal to the distributed map cache? (This is usually handled by a Notify processor.)

I am in no way implying that what you are trying to accomplish can't be done. However, coming up with an end-to-end workable solution requires knowing all the steps in the use case along the way.

I would recommend going through the example Wait/Notify linked in my original response to get a better understanding of how the Wait and Notify processors work together. Then maybe you can make some changes to your existing dataflow implementation. With more use case details (detailed process steps) I could suggest further changes if needed.

I really hope this helps you get some traction on your use case here. If you have a contract with Cloudera, you can reach out to your account owner, who could help arrange for professional services that can work with you to turn your use cases into workable NiFi dataflows.

Thank you, Matt

MattWho · ‎12-21-2022

@samrathal The "Wait" processor works in conjunction with the "Notify" processor in NiFi. See the example use case below:
https://pierrevillard.com/2018/06/27/nifi-workflow-monitoring-wait-notify-pattern-with-split-and-merge/

And simply waiting until you have received all of the 1000 record batches will not ensure that a downstream MergeContent or MergeRecord processor will merge them all together.

1. Is this a one-time execution flow?
2. If not, how do you differentiate between different complete batches (when does one merge bundle end and another begin)?
3. Are all 1000 records from each rest-api call going into a single NiFi FlowFile or 1 FlowFile per record?
4. Is there some correlation identifier as a result of the rest-api call that identifies all 1000 record batch pulls as part of the same complete bundle?

The details of your use case would make it easier for the community to provide suggestions.

Assuming you have some correlation attribute and you know that the max number of records would never exceed some upper limit, you may be able to simply use a well configured MergeRecord processor to accomplish the merging of all your records: min records set higher than you would ever expect, a correlation attribute, and a max bin age (which forces the bin to merge after x amount of time even if the min has not been satisfied). But keep in mind that the answers to the questions asked play a role in whether this is possible or needs some additional consideration put in place.

Thank you, Matt
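To make the interplay of min records, correlation attribute, and max bin age concrete, here is a toy Python model of the binning decision. It is a simplified illustration of the idea described above, not MergeRecord's actual implementation; the function name, the dictionary keys, and the use of plain numbers for time are all assumptions made for the sketch.

```python
from collections import defaultdict


def release_bins(flowfiles, min_records, max_bin_age, now):
    """Split bins into (merged, waiting), each mapping correlation value to record count.

    Each flowfile is a dict with 'correlation', 'records', and 'arrived' (seconds).
    A bin is released once it holds min_records, or once its oldest entry
    has been waiting for max_bin_age seconds.
    """
    bins = defaultdict(list)
    for flowfile in flowfiles:
        bins[flowfile["correlation"]].append(flowfile)

    merged, waiting = {}, {}
    for correlation, members in bins.items():
        total = sum(f["records"] for f in members)
        age = now - min(f["arrived"] for f in members)
        if total >= min_records or age >= max_bin_age:
            merged[correlation] = total
        else:
            waiting[correlation] = total
    return merged, waiting
```

With six FlowFiles of one batch (five with 1000 records, one with 600), a min records value far above 5600, and a max bin age of 300 seconds, the bin keeps waiting at first and is released with all 5600 records once its oldest FlowFile is 300 seconds old. The trade-off is latency: every batch waits the full max bin age before it is merged.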

MattWho · ‎12-21-2022

@anton123 I am still not completely clear on your use case, but correct me if the below is not accurate:

1. You fetch a single large file.
2. That file is unpacked into many smaller files.
3. Each of these smaller files is converted into SQL and inserted via the PutSQL processor.
4. You then have unrelated downstream processing you don't want to start until all files produced by the UnpackContent processor have been successfully processed by the PutSQL processor.

Correct? If so, the following example use case for the NiFi Wait and Notify processors is probably what you are looking to implement:
https://pierrevillard.com/2018/06/27/nifi-workflow-monitoring-wait-notify-pattern-with-split-and-merge/

Thank you, Matt

MattWho · ‎12-21-2022

@zIfo The TLS exception "unable to find valid certification path to requested target" is telling you that there is a lack of trust in the handshake. This means that the complete trust chain needed to establish trust is missing from the truststore. This is not an issue with the NiFi InvokeHTTP processor.

From the command line you could try using openssl to get the public certificates for the trust chain from the target URL (note that not all endpoints will return the complete trust chain):

openssl s_client -connect <FQDN>:<port> -showcerts

The server hello in response to this command will have one or more public certs. Each cert will have the format of the example below:

-----BEGIN CERTIFICATE-----
MIIFYjCCBEqgAwIBAgIQd70NbNs2+RrqIQ/E8FjTDTANBgkqhkiG9w0BAQsFADBX
MQswCQYDVQQGEwJCRTEZMBcGA1UEChMQR2xvYmFsU2lnbiBudi1zYTEQMA4GA1UE
CxMHUm9vdCBDQTEbMBkGA1UEAxMSR2xvYmFsU2lnbiBSb290IENBMB4XDTIwMDYx
OTAwMDA0MloXDTI4MDEyODAwMDA0MlowRzELMAkGA1UEBhMCVVMxIjAgBgNVBAoT
GUdvb2dsZSBUcnVzdCBTZXJ2aWNlcyBMTEMxFDASBgNVBAMTC0dUUyBSb290IFIx
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAthECix7joXebO9y/lD63
ladAPKH9gvl9MgaCcfb2jH/76Nu8ai6Xl6OMS/kr9rH5zoQdsfnFl97vufKj6bwS
iV6nqlKr+CMny6SxnGPb15l+8Ape62im9MZaRw1NEDPjTrETo8gYbEvs/AmQ351k
KSUjB6G00j0uYODP0gmHu81I8E3CwnqIiru6z1kZ1q+PsAewnjHxgsHA3y6mbWwZ
DrXYfiYaRQM9sHmklCitD38m5agI/pboPGiUU+6DOogrFZYJsuB6jC511pzrp1Zk
j5ZPaK49l8KEj8C8QMALXL32h7M1bKwYUH+E4EzNktMg6TO8UpmvMrUpsyUqtEj5
cuHKZPfmghCN6J3Cioj6OGaK/GP5Afl4/Xtcd/p2h/rs37EOeZVXtL0m79YB0esW
CruOC7XFxYpVq9Os6pFLKcwZpDIlTirxZUTQAs6qzkm06p98g7BAe+dDq6dso499
iYH6TKX/1Y7DzkvgtdizjkXPdsDtQCv9Uw+wp9U7DbGKogPeMa3Md+pvez7W35Ei
Eua++tgy/BBjFFFy3l3WFpO9KWgz7zpm7AeKJt8T11dleCfeXkkUAKIAf5qoIbap
sZWwpbkNFhHax2xIPEDgfg1azVY80ZcFuctL7TlLnMQ/0lUTbiSw1nH69MG6zO0b
9f6BQdgAmD06yK56mDcYBZUCAwEAAaOCATgwggE0MA4GA1UdDwEB/wQEAwIBhjAP
BgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBTkrysmcRorSCeFL1JmLO/wiRNxPjAf
BgNVHSMEGDAWgBRge2YaRQ2XyolQL30EzTSo//z9SzBgBggrBgEFBQcBAQRUMFIw
JQYIKwYBBQUHMAGGGWh0dHA6Ly9vY3NwLnBraS5nb29nL2dzcjEwKQYIKwYBBQUH
MAKGHWh0dHA6Ly9wa2kuZ29vZy9nc3IxL2dzcjEuY3J0MDIGA1UdHwQrMCkwJ6Al
oCOGIWh0dHA6Ly9jcmwucGtpLmdvb2cvZ3NyMS9nc3IxLmNybDA7BgNVHSAENDAy
MAgGBmeBDAECATAIBgZngQwBAgIwDQYLKwYBBAHWeQIFAwIwDQYLKwYBBAHWeQIF
AwMwDQYJKoZIhvcNAQELBQADggEBADSkHrEoo9C0dhemMXoh6dFSPsjbdBZBiLg9
NR3t5P+T4Vxfq7vqfM/b5A3Ri1fyJm9bvhdGaJQ3b2t6yMAYN/olUazsaL+yyEn9
WprKASOshIArAoyZl+tJaox118fessmXn1hIVw41oeQa1v1vg4Fv74zPl6/AhSrw
9U5pCZEt4Wi4wStz6dTZ/CLANx8LZh1J7QJVj2fhMtfTJr9w4z30Z209fOU0iOMy
+qduBmpvvYuR7hZL6Dupszfnw0Skfths18dG9ZKb59UhvmaSGZRVbNQpsg3BZlvi
d0lIKO2d1xozclOzgjXPYovJJIultzkMu34qQb9Sz/yilrbCgj8=
-----END CERTIFICATE-----

You can copy each one (including the begin and end certificate lines) and place it in its own <name>.pem file, then import each <name>.pem into your existing truststore.

A complete trust chain consists of all the public certs from the signer of the host's cert up to the self-signed root CA public cert. If that signer cert is self-signed (meaning owner and signer have the same DN), then it is considered the root CA. If they are not the same, then another public cert exists in the chain. A complete trust chain means you have all the public certs from the one that signed the target FQDN all the way to the root CA (owner and issuer have the same DN).

If the output of openssl does not contain all the public certs in the trust chain, you'll need to get the missing public certs from the source. That source could be the company hosting the server or it could be a public certificate authority (DigiCert for example). You would need to go to those sources to obtain the CA certs, which are often published on their websites (example: https://www.digicert.com/kb/digicert-root-certificates.htm).

Another option that may work for root CAs and some intermediate CAs is using Java's cacerts file, bundled with every Java release, which contains the public certs for many public authorities.
Thank you, Matt
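Copying each certificate out of the openssl output by hand gets tedious for longer chains. The Python sketch below assumes you saved the output of the openssl s_client command to a text file and read it into a string; it pulls out every BEGIN/END CERTIFICATE block and writes each one to its own .pem file. The function names and the "chain-<n>.pem" naming are hypothetical.

```python
import re
from pathlib import Path

# Matches one PEM certificate, including its BEGIN and END lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def extract_pem_blocks(text):
    """Return every PEM certificate block found in text, in order of appearance."""
    return PEM_BLOCK.findall(text)


def write_pem_files(blocks, out_dir, prefix="chain"):
    """Write each block to <out_dir>/<prefix>-<n>.pem and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, block in enumerate(blocks):
        path = out / f"{prefix}-{index}.pem"
        path.write_text(block + "\n")
        paths.append(path)
    return paths
```

Each resulting file can then be imported into the truststore, for example with keytool -importcert -alias <alias> -file <name>.pem -keystore <truststore>. The sketch does not check whether the chain is complete, so the owner and issuer DNs still need to be compared as described above.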


Member Since	‎07-30-2019 10:41 AM
Last Visited	‎07-14-2026 02:35 AM
Posts	3,472
Kudos received	1638
