About MattWho

MattWho · ‎06-14-2024

@Pratyush1 The more detail you can provide, the better here. In addition to what @cassie2698bratt suggested: Apache NiFi 1.14 was released 3 years ago and has had many important bug fixed, security fixes, and improvements since then, The latest 1.x release is Apache NiFi 1.26 as of writing this response. I strongly recommend upgrading to the latest release. What Java vendor and version are you using with your NiFi. NIFi supports Java 8 (and 11 in newer releases). Java 8 update 252 or newer is required minimum. All nodes consistently on same version? What OS is being used on server where NiFi is having issue? Is this a NiFi multi-node cluster? Does UI of every node in same cluster present same issue while loading the UI or is it specific to just one node in the cluster? Any custom code add-ons? Are the other systems using the same Java and NiFi versions? Any observations in the NiFi logs when accessing the UI? Thank you, Matt

MattWho · ‎06-14-2024

@helk You can use a single certificate to secure all your nodes, but i would not recommend doing so for security reasons. You risk compromising all your host if any one of them is compromised. Additionally NiFi nodes act as clients and not just servers. This means that all your hosts will identify themselves as the same client (based off DN). So tracking client initiated actions back to a specific node would be more challenging. And if auditing is needed, made very difficult. The SAN is meant to be used to differently. Let's assume you host an endpoint searchengine.com which is back by 100 servers to handle client requests. When a client tries to access searchengine.com that request may get routed to anyone of those 100 servers. The certificate issues to each of those 100 servers is unique to each server; however, every single one of them will have the searchengine.com as an additional SAN entry in addition to their unique hostname. This allows the host verification to still be successful since all 100 are also known as searchengine.com. Your specific issue based on shared output above is caused by the fact that your single certificate does not have "nifi01" in the list of Subject Alternative Names (SAN). It appears you only added nifi02 and nifi03 as SAN entries. The current hostname verification specs no longer use DN for hostname verification. Only the SAN entries are used for that. So all names(hostnames, common names, IPs) that may be used when connecting to a host must be included in the SAN list. NiFi cluster keystore requirements: 1. keystore can contain only ONE privateKeyEntry. 2. PrivateKey can not use wildcards in the DN. 3. PrivateKey must contain both clientAuth and serverAuth Extended Key Usage (EKU). 4. Privatekey must contain at least one SAN entry matching the hostname of server on which keystore will be used. The NiFi truststore must contain the complete trust chain for your cluster node's PrivateKeys. On truststore is typically copied to and used on all nodes. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-14-2024

@Dave0x1 Some general related information: 1. Java uses heap as needed, but for efficiency does not run Garbage Collection (GC) to free unused heap until typically over 80% allocated heap usage. So not unexpected to see heap utilization of 70% even once data is processed out of your dataflow(s). So there is nothing unexpected or alarming specific to that 70% heap utilization in itself. You probably want to look at the GC events (partial and full GC) to see how many and how often they are happening. What are your current XMS and XMX heap setting for your NiFi. Heap is requested during execution of NiFi components. NiFi does not manage the heap or its clean-up, that is a process handled by Java. 2. When a component is configured with "primary node" execution, it will only be scheduled on the currently elected primary node. The FlowFiles generated will then only exist on the primary node unless you design into your dataflow(s) redistribution (typically done via load balance configuration on downstream connection of the primary node execution processor component) of those FlowFiles across all your nodes for further downstream processing. Even with distribution, there will be some deviations in resource usage since you are still doing some additional work on just the primary node. 3. The primary node and cluster coordinator nodes are elected by Zookeeper (ZK) and can change. Commonly there is some event that triggers a change (current primary node stops heart-beating to ZK, current primary node disconnects from cluster, cluster or primary node is restarted, current primary node shutdown, etc.. You could look at the individual node events in the cluster UI to see when the primary node change to see if aligns with any of these event types. But even with a primary node change that would not shift heap usage to another node. While i see nothing of concern with what was shared in your post, the things you want to watch for is memory related logs of concern. Java out of memory (OOM) alerts indicate a problem that must be addressed. OOM can happen when your designed dataflow(s) try to consume more memory then is allocated to your JVM. Or is a sign that GC can keep up with the memory demand. Or heap usage exceeded 80% utilization and GC run was unable to free enough unused heap to get back below that 80% utilization. While not out of memory, this indicates your dataflow(s) use high active heap (common offenders are merge or split based processors with excessively high number of FlowFiles being merged in a single transaction or a single split producing an excessively large number of output split FlowFiles in a single transaction. The embedded documentation (usage docs) for the various components indicate if a component has the potential high heap or high CPU usage in the "System Resource Consideration" section. Here is example form MergeContent: Hope you find this information useful for your query. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-13-2024

@SAMSAL Thank you for the kind words. Likewise, the community thrives through members like yourself. Thank you for all your amazing contributions.

MattWho · ‎06-13-2024

@tcherian NiFi certificates must meet the following criteria: 1. No wildcards used in the subject DistinquishedName (DN) 2. Included both clientAuth and serverAuth in the ExtendedKeyUsage (EKU) 3. Contains one or more SubjectAlternativeName (SAN) entries. 4. Keystore can only contain 1 PrivateKey entry There are many resources on the web for generating your own self-signed certificates and adding them to a PKCS12 or JKS keystore. The "Keystore" and "truststore" are both just keystores. The NiFi "Keystore" contain the PrivateKey entry which Is used by NiFi to identify itself as the server (serverAuth) when connecting to it and as the client (clientAuth) when connecting outward as a client (such as talking to other NiFi's, NiFi-Registry, etc). The NiFi "truststore" contains one too many TrustedCert entries. It is common to use the default Java cacerts file (which is just a jks keystore) and add additional TrustedCert entries to it. The trustedCerts are the public certs that correspond to the PrivateKey that you should never share. The Trusted certs are the signers of the private keys. There are intermediate and root trusted cert keys. An intermediate trust is one where the owner and signer are not the same DN. A root trust is one where the owner and signer are the same DN. So you might create a PrivateKey that is signed by intermediate Certificate Authority (CA) and that intermediate CA would be signed by another intermediate CA or a root (CA). The chain of signers between intermediate and root is known as the trustchain. The Truststore needs to contain complete trust chains for your PrivateKey. There are even free services out there like Tinycert, but you can also use openssl and keystool to generate self-signed certificates and import them to a keystore. Just google how to create a certificate and how to import certificate into a keystore. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-11-2024

Yes, i believe this to be a legitimate NiFi bug.

MattWho · ‎06-11-2024

@tcherian I assume you are using the non-production ready NiFi out-of-the-box auto-generated keystore and truststore keystores files? If so, you should generate your own certificates that include the additional "host.docker.internal" and/or "nifi-container-name" SAN entries. Import that certificate into your own keystore and populated a truststore with the complete trust chain for your certificate. Something else you might want to try is to populate the the following property in the nifi.properties file: nifi.web.proxy.host=host.docker.internal,nifi-container-name But even if above works for you, i would still highly encourage you to get actual signed certificates instead. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-11-2024

@ranie I see a couple issues with your NiFi Expression Language (NEL) statement: I see some formatting issues in your java simple formatter string: 'yyyy-MM-dd\'T\'00:00:00\'Z\'. Your single and double quotes are not balanced. You are using the function "format ()" to change the timezone, but you could also use the "formatInstant()" function. You are missing the "toNumber()" function to convert the date string to a number before trying to apply a mathematically computation to it. The Now() function will return the date current system time as the NiFi service sees it. example: my NiFi server uses UTC timezone: The toNumber() function will provide the current date and time as a number of milliseconds since midnight Jan 1st, 1970 GMT. This number will always be a GMT value. The formatInstant() function will allow you to take a GMT time or a Java formatted date string and reformat it for a different timezone. Taking above feedback into consideration, the following NEL statement should work for you. ${now():toNumber():minus(86400000):formatInstant("yyyy-MM-dd'T'HH:mm:ss 'Z'", "CET")} Pay close attention to your use of single and double quotes. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-10-2024

@omeraran This use case sounds like a dataflow that would consist of the following: GenerateTableFetch --> ExecuteSQL --> <any processor you may want to modify, extract, etc content if needed) --> PutDatabaseRecord GenerateTableFetch will ingest rows from your source MySQL DB and maintain use NiFi state to record the maximum values for records so that it can continue to check for any ingest additional rows added. It generates FlowFiles that contain the SQL queries needed by the ExecuteSQL to fetch the rows. I don't know if your use case requires any manipulation, routing modifying, etc, but if so you would do that next. And finally use the PutDatabaseRecord to write your rows to the Oracle DB. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎06-10-2024

@udayAle Some NiFi Processors process FlowFiles one at a time and other may process batches of FlowFiles in a single thread execution. Then there are processors like the MergeContent and MergeRecord that allocate FlowFiles to bins and then only merges that bin once the min criteria is met to merge. With non merge type processors, a FlowFile that becomes results in a hung thread or long thread execution would block processing of FlowFiles next in queue. For Merge type processors, depending on data volumes and configuration 5 mins might be expected behavior (of your you could set a max bin age of 5 mins to force a bin to merge even if mins have not been satisfied). So i think there are two approaches to look at here. One monitors long running threads and the the other looks as failures. Runtime Monitoring Properties: When configured this background process checks for long running threads and produces log output and NiFi Bulletins when a thread exceeds a threshold. You could build an alerting dataflow around this using the SiteToSiteBulletinReportingTask, some Routing processors(to filter specific types of bulletins related to long running tasks) and then an email processor. The majority of processors that have potential for failures to occur will have a failure relationship. You can build a dataflow using that failure relationship to alert on those failures. Consider a failure relationship routed to an update attribute that use the advanced UI to increment a failure counter that then feeds a routeOnAttribute processor that handles routing base on number of failed attempts. After x number of failures it could send an email via putEmail. Apache NiFi does not have a background "Queued Duration" monitoring capability. Programmatically building one would be expensive resource wise. As you would need to monitor every single constantly changing connection and parse out and FlowFile with a "Queued Duration" in excess of X amount of time. Consider a Processor that is hung, the connection would continue to grow until backpressure kicks in and forces upstream processor to start queueing. You could end up with 10,000 FlowFiles alerting on queued duration. Hopefully this helps you maybe to look at the use case a little differently. Keep in mind that all monitoring including examples I provided will have impact on performance. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

Online	Offline
Last Visited	‎05-18-2026 11:55 PM

Member Since	‎07-30-2019 10:41 AM
Last Visited	‎05-18-2026 11:55 PM
Posts	3,470
Kudos received	1637

Cloudera Community

Re: How to invoke a url in nifi which is protected...

Re: Retry impacts scheduler

Re: 503 error while copying/versioning big process...

Re: FetchSMB not fetching all files

Re: Nifi: How to revoke the import and export Temp...

Re: Apache NiFi Canvas loading issue

Re: Error Securing NiFi Cluster with a Single Cert...

Re: NiFi high jvm heap utilization on primary node

Re: Apache Nifi Release 2.0 M1 & M2 High CPU Utili...

Re: How to access Nifi REST API 2.0.0 from a docke...

Re: Nifi: Flowfile stuck in front of a processor g...

Re: How to access Nifi REST API 2.0.0 from a docke...

Re: Date with now() in a specific format in a spec...

Re: Need Help About Apache NiFi

Re: Apache Nifi Queue monitoring and Alerting