Member since: 07-30-2019
Posts: 3421
Kudos Received: 1628
Solutions: 1010
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 106 | 01-13-2026 11:14 AM |
| | 226 | 01-09-2026 06:58 AM |
| | 523 | 12-17-2025 05:55 AM |
| | 584 | 12-15-2025 01:29 PM |
| | 565 | 12-15-2025 06:50 AM |
07-01-2024
09:49 AM
@kagesenshi Authentication and Authorization are two separate processes in NiFi. Authorization happens only after some method of authentication succeeds, resulting in an authenticated user identity being passed to the NiFi authorizer for authorization verification. Based on what you have shared, authentication in your setup supports TLS clientAuth based authentication and ldap-provider based authentication (you may have additional methods enabled as well). Note: NiFi authentication and authorization are case sensitive.

Your ldap-provider is configured with "USE_USERNAME", which tells this provider to use whatever user identity string was typed by the user in the login UI. Upon successful authentication of your LDAP user identity, the user identity entered by the user is evaluated against the identity.mapping.pattern.<xyz> Java regular expressions; if an expression matches, the associated identity.mapping.value.<xyz> and identity.mapping.transform.<xyz> properties are applied to that user identity. The resulting manipulated user identity is then passed to the NiFi authorizer.

Within your authorizers.xml configuration file, NiFi has a single authorizer and one or more user-group-providers. The user-group-providers are used so that the authorizer is aware of any groups the authenticated user identity is a member of. You are using the ldap-user-group-provider. Within that provider you configured the group membership "enforce case sensitivity" to false. This has nothing to do with authorization; it is used so that when user and group associations are being determined from the ldapsearch results returned by the user sync and group sync, those matches are handled in a case-insensitive fashion. After user to group associations are made, the user identity string comes from "sAMAccountName" and group identities come from "name" (this is not a common LDAP/AD group name field; "cn" and "sAMAccountName" are most common). The user identities returned by LDAP are also evaluated against the identity.mapping.<abc>.<xyz> properties just as was done during authentication. The group identities are evaluated against the group.mapping properties.

While you can't change the case sensitive nature of NiFi, you can use identity mappings (user and group) to normalize the user identities and group identities (a common choice is to transform everything to lowercase with "LOWER"). This allows a user to enter their username in whatever case they want during login and have NiFi convert it to all lowercase in the background. See: Identity Mapping Properties for more on these properties set in the nifi.properties file. Custom properties can be added such as below to convert user identities to all lowercase:

```
nifi.security.identity.mapping.pattern.username=^(.*?)$
nifi.security.identity.mapping.value.username=$1
nifi.security.identity.mapping.transform.username=LOWER
```
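Group identities can be normalized the same way. A minimal sketch, assuming the standard group mapping property names in nifi.properties (the ".anygroup" suffix is an arbitrary name chosen for this example):

```
# Hedged sketch: normalize every group identity returned by LDAP to lowercase
nifi.security.group.mapping.pattern.anygroup=^(.*)$
nifi.security.group.mapping.value.anygroup=$1
nifi.security.group.mapping.transform.anygroup=LOWER
```

With both mappings in place, a user typing "JohnDoe" at login and a group returned as "Data-Admins" would both resolve to all-lowercase identities before any authorization lookup.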
Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
06-18-2024
01:24 PM
@omeraran If your source is continuously being written to, you might consider using the GenerateTableFetch processor --> ExecuteSQLRecord processor (configured to use a JsonRecordSetWriter) --> PutDatabaseRecord processor. Working with multi-record FlowFiles by utilizing the record based processors is going to be a more efficient and performant dataflow (a rough sketch of this flow follows below). Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
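A rough outline of that three-processor flow, with illustrative property values (the table and column names are hypothetical; the property names are the standard ones for these processors):

```
GenerateTableFetch
  Table Name:            source_table    <- hypothetical
  Maximum-value Columns: updated_at      <- hypothetical incremental column
  Partition Size:        10000
        |
        v   (generated SQL queries)
ExecuteSQLRecord
  Record Writer: JsonRecordSetWriter
        |
        v   (multi-record JSON FlowFiles)
PutDatabaseRecord
  Record Reader:  JsonTreeReader
  Statement Type: INSERT
  Table Name:     target_table    <- hypothetical
```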
06-18-2024
01:17 PM
@MikeH Sounds like you are regularly ingesting a considerable number of files from your local filesystem. Is this a NiFi multi-node cluster or a single standalone instance of NiFi handling this use case? Both the GetFile and ListFile processors have a "Path Filter" property that takes a Java regular expression. You could add multiple processors, each with a different regex, so they each pull from a subset of user sub-directories. You might consider using the ListFile and FetchFile processors instead of the GetFile processor. The ListFile processor produces zero byte FlowFiles (1 FlowFile for each file listed); this processor is then connected to a FetchFile processor, which uses attributes set on that FlowFile to fetch the content and add it to the FlowFile. With a NiFi cluster this design approach allows you to redistribute the 0 byte FlowFiles across all nodes in the NiFi cluster so the heavy work of reading in the content and processing each FlowFile is spread across multiple servers (NiFi cluster nodes). With this approach you can also have many ListFile processors all feeding a single FetchFile. So perhaps you have a regex for all directories starting with A through C in one processor and another processor for D through F, etc. (a hedged example of such filters follows below). Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
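As a sketch of that partitioning idea, the "Path Filter" values might look like this (assuming the user directories sit directly under the configured Input Directory; the character ranges are just an example):

```
ListFile #1  ->  Path Filter: ^[a-cA-C].*
ListFile #2  ->  Path Filter: ^[d-fD-F].*
ListFile #3  ->  Path Filter: ^[g-iG-I].*
```

Each listing then feeds the same downstream FetchFile, optionally through a load-balanced connection in a cluster.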
06-18-2024
08:19 AM
@NeheikeQ The first thing that stands out to me is that the version of NiFi-Registry you are using is not going to be compatible with the version of NiFi you are using. NiFi introduced numerous new capabilities that get tracked in NiFi-Registry, but that old NiFi-Registry version is not going to handle them (even if version control works, the stored flow definitions are going to be missing elements).

Every time a change is made on the NiFi canvas the current flow.xml.gz and flow.json.gz files are archived and new versions created. So rolling back can be done by swapping in an archived flow.json.gz (a sketch follows below).

There are a few bugs addressed since 1.23 related to node disconnection and failure to rejoin. There is not enough detail here to pinpoint an exact cause for your issue:
1. Does your issue only happen with dataflow(s) imported from your old 0.7.0 version of NiFi-Registry?
2. Is there any particular flow design that always reproduces your issue?
3. Can you share the full error and stack traces from nifi-app.log?
4. Are there any other errors or warnings around the same time in either nifi-app.log or nifi-user.log?
5. What about the nifi-request.log: what request was made at the time the exception occurs?

I recommend upgrading to the latest 1.x release of Apache NiFi and to the latest NiFi-Registry version to see if your issue persists.

Note: The flow.xml.gz is deprecated. It has been replaced by the flow.json.gz. When NiFi is started it will load the flow.json.gz. If the flow.json.gz does not exist, NiFi will load from the flow.xml.gz file, at which time it will generate the flow.json.gz file from it. Apache NiFi 1.16+ will still write out both the flow.xml.gz and flow.json.gz files whenever a change is made in the UI. In Apache NiFi 2.x versions, only the flow.json.gz will exist.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
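A minimal rollback sketch, assuming a Linux install and the default archive location (nifi.flow.configuration.archive.dir=./conf/archive in nifi.properties); the archive filenames are timestamped, so substitute the one you actually want:

```
# Stop NiFi before touching the flow files
./bin/nifi.sh stop

# Keep a copy of the current flow, just in case
cp conf/flow.json.gz conf/flow.json.gz.bak

# Restore the chosen timestamped archive as the active flow
cp conf/archive/<timestamp>_flow.json.gz conf/flow.json.gz

./bin/nifi.sh start
```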
06-14-2024
07:46 AM
1 Kudo
@Alexy Without seeing your logs, I have no idea which NiFi classes are producing the majority of your logging, but logback is functioning exactly as you have it configured: each time the nifi-app.log reaches 500 MB within a single day it is compressed and rolled using an incrementing number. I would suggest changing the log level for the base class "org.apache.nifi" from INFO to WARN. The bulk of all NiFi classes begin with org.apache.nifi, and by changing this to WARN you will only see ERROR and WARN level log output from the bulk of the org.apache.nifi.<XYZ...> classes:

```
<logger name="org.apache.nifi" level="WARN"/>
```

Unless you have a lot of exceptions happening within the NiFi processor components used in your dataflow(s), this should significantly reduce the amount of nifi-app.log output being produced. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
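In context, that change lands in NiFi's conf/logback.xml. A hedged sketch (the second logger is a hypothetical example of keeping one package more verbose than the rest):

```
<!-- conf/logback.xml -->
<!-- Quiet the bulk of NiFi classes down to WARN and ERROR output only -->
<logger name="org.apache.nifi" level="WARN"/>

<!-- Hypothetical: retain routine INFO output from one package you still care about -->
<logger name="org.apache.nifi.processors.standard" level="INFO"/>
```

NiFi's default logback.xml enables periodic scanning of the file, so edits are typically picked up within seconds without a restart.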
06-14-2024
07:36 AM
1 Kudo
@SAMSAL Looking at the output provided, you appear to be running your Apache NiFi on Windows. It appears this issue was raised 2 days ago against M3 in the Apache Jira here: https://issues.apache.org/jira/browse/NIFI-13394 It is currently unresolved. You can certainly create an Apache Jira account and add additional comments to this Jira with your detailed findings. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
06-14-2024
07:26 AM
@SAMSAL This is some great detail. I believe you are hitting this bug that has been fixed for the next 2.0.0 milestone release (M4): https://issues.apache.org/jira/browse/NIFI-13329 There will eventually be a 2.0.0 RC release; that will be the first official Release Candidate for the new 2.x versions that will follow all these development milestone releases. You can create an Apache Jira account, which would give you the ability to raise the issues you find directly in the Apache NiFi project. This is the best way to bring your findings to the developer community. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
06-14-2024
07:16 AM
@Pratyush1 The more detail you can provide, the better here. In addition to what @cassie2698bratt suggested: Apache NiFi 1.14 was released 3 years ago and has had many important bug fixes, security fixes, and improvements since then. The latest 1.x release is Apache NiFi 1.26 as of writing this response. I strongly recommend upgrading to the latest release. Some questions (a quick command sketch for the version checks follows below):
1. What Java vendor and version are you using with your NiFi? NiFi supports Java 8 (and 11 in newer releases); Java 8 update 252 or newer is the required minimum. Are all nodes consistently on the same version?
2. What OS is being used on the server where NiFi is having the issue?
3. Is this a NiFi multi-node cluster? Does the UI of every node in the cluster present the same issue while loading, or is it specific to just one node?
4. Any custom code add-ons?
5. Are the other systems using the same Java and NiFi versions?
6. Any observations in the NiFi logs when accessing the UI?
Thank you, Matt
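A quick, hedged way to gather the Java and log details asked about above (paths assume a typical Linux install, run from the NiFi home directory):

```
java -version        # Java vendor and version the shell resolves
echo "$JAVA_HOME"    # the JDK NiFi's bootstrap will prefer, if set

# Recent warnings and errors around the time the UI misbehaves
grep -iE "ERROR|WARN" logs/nifi-app.log | tail -n 50
```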
06-14-2024
06:54 AM
2 Kudos
@helk You can use a single certificate to secure all your nodes, but I would not recommend doing so for security reasons. You risk compromising all your hosts if any one of them is compromised. Additionally, NiFi nodes act as clients and not just servers. This means that all your hosts will identify themselves as the same client (based off the DN), so tracking client initiated actions back to a specific node would be more challenging, and if auditing is needed, very difficult.

The SAN is meant to be used differently. Let's assume you host an endpoint searchengine.com which is backed by 100 servers to handle client requests. When a client tries to access searchengine.com, that request may get routed to any one of those 100 servers. The certificate issued to each of those 100 servers is unique to each server; however, every single one of them will have searchengine.com as an additional SAN entry in addition to their unique hostname. This allows hostname verification to still be successful since all 100 are also known as searchengine.com.

Your specific issue, based on the shared output above, is caused by the fact that your single certificate does not have "nifi01" in the list of Subject Alternative Names (SAN). It appears you only added nifi02 and nifi03 as SAN entries. The current hostname verification specs no longer use the DN for hostname verification; only the SAN entries are used for that. So all names (hostnames, common names, IPs) that may be used when connecting to a host must be included in the SAN list (a sketch for inspecting SAN entries follows below).

NiFi cluster keystore requirements:
1. The keystore can contain only ONE PrivateKeyEntry.
2. The PrivateKey can not use wildcards in the DN.
3. The PrivateKey must contain both clientAuth and serverAuth Extended Key Usage (EKU).
4. The PrivateKey must contain at least one SAN entry matching the hostname of the server on which the keystore will be used.

The NiFi truststore must contain the complete trust chain for your cluster nodes' PrivateKeys. One truststore is typically copied to and used on all nodes.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
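Two hedged ways to check which SAN entries a certificate actually carries (the keystore path, password, and host:port here are hypothetical; adjust them to your environment):

```
# Inspect the keystore directly
keytool -list -v -keystore conf/keystore.p12 -storetype PKCS12 -storepass changeit \
  | grep -A1 "SubjectAlternativeName"

# Or inspect what a running node presents on the wire
openssl s_client -connect nifi01:8443 </dev/null 2>/dev/null \
  | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"
```

Every hostname used to reach a node (including the node's own hostname) should appear in the SAN output for the keystore deployed on that host.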
06-14-2024
06:33 AM
1 Kudo
@Dave0x1 Some general related information:

1. Java uses heap as needed, but for efficiency does not run Garbage Collection (GC) to free unused heap until typically over 80% of allocated heap is in use. So it is not unexpected to see heap utilization of 70% even once data is processed out of your dataflow(s), and there is nothing unexpected or alarming in that 70% heap utilization by itself. You probably want to look at the GC events (partial and full GC) to see how many there are and how often they happen. What are your current Xms and Xmx heap settings for your NiFi? (A bootstrap.conf sketch follows this post.) Heap is requested during execution of NiFi components; NiFi does not manage the heap or its clean-up, that is a process handled by Java.

2. When a component is configured with "primary node" execution, it will only be scheduled on the currently elected primary node. The FlowFiles generated will then only exist on the primary node unless you design into your dataflow(s) a redistribution of those FlowFiles across all your nodes for further downstream processing (typically done via load balance configuration on the downstream connection of the primary node execution processor component). Even with distribution, there will be some deviation in resource usage since you are still doing some additional work on just the primary node.

3. The primary node and cluster coordinator nodes are elected by Zookeeper (ZK) and can change. Commonly there is some event that triggers a change (the current primary node stops heart-beating to ZK, disconnects from the cluster, is restarted, is shut down, etc.). You could look at the individual node events in the cluster UI to see when the primary node changed and whether that aligns with any of these event types. But even a primary node change would not shift heap usage to another node.

While I see nothing of concern in what was shared in your post, the things you want to watch for are memory related logs. Java out of memory (OOM) alerts indicate a problem that must be addressed: OOM can happen when your designed dataflow(s) try to consume more memory than is allocated to your JVM, or it can be a sign that GC can't keep up with the memory demand, or that heap usage exceeded 80% utilization and a GC run was unable to free enough unused heap to get back below that 80% utilization. Even short of out of memory errors, high active heap usage by your dataflow(s) is worth watching (common offenders are merge or split based processors with an excessively high number of FlowFiles being merged in a single transaction, or a single split producing an excessively large number of output split FlowFiles in a single transaction). The embedded documentation (usage docs) for the various components indicates in the "System Resource Considerations" section whether a component has the potential for high heap or high CPU usage; see the MergeContent usage docs for an example. Hope you find this information useful for your query. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
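For reference, the heap settings live in NiFi's conf/bootstrap.conf. A hedged sketch (the heap sizes and the java.arg number used for GC logging are illustrative, and the -Xlog option assumes Java 9 or newer):

```
# conf/bootstrap.conf
java.arg.2=-Xms4g    # initial heap size (illustrative)
java.arg.3=-Xmx4g    # maximum heap size (illustrative)

# Hypothetical additional argument: write GC events to a rolling log so
# partial/full GC frequency can be reviewed over time
java.arg.20=-Xlog:gc*:file=./logs/gc.log:time,uptime:filecount=5,filesize=10m
```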