Member since
07-30-2019
3436
Posts
1632
Kudos Received
1012
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 144 | 01-27-2026 12:46 PM |
| | 564 | 01-13-2026 11:14 AM |
| | 1236 | 01-09-2026 06:58 AM |
| | 1018 | 12-17-2025 05:55 AM |
| | 500 | 12-17-2025 05:34 AM |
04-12-2021
05:35 AM
@Masi The exception does not appear to be related to load-balanced connections in NiFi. LB connections utilize NiFi S2S (site-to-site) in the background, which does not use MySQL. Matt
04-05-2021
06:18 AM
@VidyaSargur Thank you for the recommendation. I already started a new thread, https://community.cloudera.com/t5/Support-Questions/trying-to-assign-JMS-Broker-URI-property-value-at-runtime/td-p/313989, which asks almost the same question as this thread. I just wanted to check whether @sy_robert found any workaround for this.
04-01-2021
02:42 PM
@nmargosian The swap file in question would contain FlowFiles that belong to a connection with the UUID 7cde3c5c-016b-1000-0000-00004c82c4b2. From your Flow Configuration History, found under the global menu icon in the upper right corner, can you search for that UUID to see if there is any history on it? Do you see it existing at some point in time? Do you see a "Remove" event on it? If you see it in history but there is no "Remove" action, yet it is now gone, then the flow.xml.gz loaded on restart did not have this connection in it. If this connection no longer exists on the canvas, NiFi cannot swap these FlowFiles back in.

Everything you see on the canvas resides in heap memory and is also written to disk within a flow.xml.gz file. When you stop and start or restart NiFi, NiFi loads the flow back into heap memory from the flow.xml.gz. Each node has a copy of this flow.xml.gz, and all nodes must have matching flow.xml.gz files or the nodes will not rejoin the cluster.

Things I suggest you verify:
1. Make sure that NiFi can successfully write to the directory where the flow.xml.gz file is located. Make a change on the canvas and verify the existing flow.xml.gz was moved to the archive directory and a new flow.xml.gz was created (see the sketch at the end of this post). If this process fails, any changes you made would be lost when NiFi is restarted. For example: the connection was created and data was queued on it, but NiFi failed to write a new flow.xml.gz because it could not archive the current flow.xml.gz (space issues, permission/ownership issues, etc.). This would block NiFi from creating a new flow.xml.gz, while the flow in memory would still contain your connection. All of these directories and files should be owned by, and readable/writable by, your NiFi service user.
2. Did your cluster nodes' flows mismatch at some point in history? For example, a change was made on the canvas of a node that was disconnected from the cluster at the time, and that node's flow was then copied to the other nodes to bring all nodes in sync.
3. Was an archived flow loaded back into NiFi at some point? This requires manual user action: copying a flow.xml.gz out of the archive and using it to replace the existing flow.xml.gz.

NiFi restarts will not simply remove connections from your dataflows. Some other condition occurred, and it may not even have been recent. If you have enough app.log history covering multiple restarts, do you see this same exact WARN log line with each of those restarts?
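For step 1, a minimal sketch of the kind of check I mean, assuming default-style install paths (CONF_DIR, ARCHIVE_DIR, and the service user name are illustrative assumptions; read the real values from your own nifi.properties):

```python
# Sanity-check that the NiFi service user owns flow.xml.gz and that the
# archive directory gains a new entry after a canvas change. Unix-only (pwd).
# All paths and the service user name below are assumptions, not guaranteed
# defaults; adjust them to match your nifi.properties.
import os
import pwd
from pathlib import Path

CONF_DIR = Path("/opt/nifi/conf")        # assumed location of flow.xml.gz
ARCHIVE_DIR = CONF_DIR / "archive"       # assumed nifi.flow.configuration.archive.dir
SERVICE_USER = "nifi"                    # assumed NiFi service user

flow = CONF_DIR / "flow.xml.gz"
owner = pwd.getpwuid(flow.stat().st_uid).pw_name
print(f"flow.xml.gz owner: {owner} (expected: {SERVICE_USER})")
print(f"conf dir writable by current user: {os.access(CONF_DIR, os.W_OK)}")

# After making a change on the canvas, the newest archive entry should be at
# least as recent as that change; if nothing new appears, archiving failed.
archives = sorted(ARCHIVE_DIR.glob("*flow.xml.gz"), key=lambda p: p.stat().st_mtime)
if archives:
    print(f"newest archive: {archives[-1].name}")
else:
    print("no archived flows found; check disk space and permissions")
```

Hope this helps, Matt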
03-29-2021
11:35 AM
@Garyy You are correct. Since NiFi does not use sessions, as mentioned in my last response, the client must authenticate every action performed. When you "login" to NiFi, the result is a bearer token being issued to the user, which your browser stores and reuses in all subsequent requests to the NiFi endpoints. At the same time, a server-side token for your user is also stored on the specific NiFi node you logged in to. The configuration in your NiFi login provider dictates how long those bearer tokens are good for. With your setting of 1 hour, you would be forced to log in again every hour.
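As a side note, the bearer token NiFi issues is a JWT, so you can decode its payload locally to confirm the expiry it carries; a minimal sketch (the token string is a placeholder to paste over, and no signature verification is done):

```python
# Decode the payload of a NiFi bearer token (a JWT) to read its expiry claim.
# This only base64-decodes the middle segment; it does not verify the signature.
import base64
import json
from datetime import datetime, timezone

token = "<paste your bearer token here>"  # placeholder

payload_b64 = token.split(".")[1]
payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
claims = json.loads(base64.urlsafe_b64decode(payload_b64))

# "exp" is the standard JWT expiry claim: seconds since the Unix epoch.
print("token expires:", datetime.fromtimestamp(claims["exp"], tz=timezone.utc))
```

Thanks, Matt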
03-29-2021
11:30 AM
@vi The more details you provide, the more likely you are to get responses in the community. Since I know you are dealing with GetFTP, and files consumed by that processor are eating away at your limited network bandwidth, I can offer the following feedback. I assume the ~60 GB of files consumed by your GetFTP every hour spans many files?

The GetSFTP processor is deprecated in favor of the ListSFTP --> FetchSFTP processor design. The SFTP protocol is not a cluster-friendly protocol for a NiFi cluster (and you should always have a NiFi cluster for redundancy and load handling). Running GetSFTP or ListSFTP on all nodes in the cluster would result in every node competing for the same files, so these processors should always be scheduled for "primary node" only (the primary node option does not exist in a standalone NiFi setup).

The ListSFTP processor does not return the content of the listed files from the SFTP server. It simply generates a list of files that need to be fetched from the target SFTP server, and each of those listed files becomes its own FlowFile in NiFi. The ListSFTP is then connected to a FetchSFTP processor, which fetches the content for each of the FlowFiles produced by ListSFTP. The connection between the ListSFTP and FetchSFTP processors would be configured to load-balance the FlowFiles to all nodes in your cluster. This spreads the workload of retrieving that content across all your cluster nodes.

While there is no configuration option in the GetSFTP or FetchSFTP processors to limit bandwidth (feel free to open an Apache NiFi Jira in the community for such an improvement), the ListSFTP-to-FetchSFTP design does give you some control. You can change the run schedule on the FetchSFTP from the default of 0 secs (which means run as often as possible) to some other value, which places a pause between each execution (between each FlowFile fetching its content). While the fetch of the content will still happen as fast as allowed, this places a break between each fetch, giving other operations time on your constrained network (see the back-of-envelope sketch at the end of this post).
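To make that trade-off concrete, here is a back-of-envelope sketch; every input is an illustrative assumption, so substitute your real file sizes, link speed, and node count:

```python
# Back-of-envelope: how a FetchSFTP run schedule spaces out transfers.
# All inputs are made-up assumptions; plug in your own numbers.
total_gb_per_hour = 60       # volume the flow must move each hour
avg_file_mb = 100            # assumed average file size
link_mbyte_per_sec = 50      # assumed usable network throughput per node
run_schedule_sec = 5         # pause between FetchSFTP executions
nodes = 3                    # cluster nodes sharing the load-balanced queue

files_per_hour = total_gb_per_hour * 1024 / avg_file_mb
fetch_sec = avg_file_mb / link_mbyte_per_sec      # time to move one file
per_file_sec = fetch_sec + run_schedule_sec       # fetch plus enforced pause
capacity = nodes * 3600 / per_file_sec            # files the cluster can fetch per hour

print(f"files to fetch per hour: {files_per_hour:.0f}")
print(f"cluster capacity at this schedule: {capacity:.0f} files/hour")
# If capacity falls below files_per_hour, the queue grows; shorten the run
# schedule or add nodes. A longer pause frees more bandwidth between fetches.
```

Hope this helps, Matt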
03-24-2021
06:09 AM
@dzbeda Have you tried specifying the index in your configured Query within the GetSplunk NiFi processor component? NiFi is not going to be able to provide you with a list of indexes from your Splunk to choose from; you would need to know what indexes exist on your Splunk server.
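If you need to discover which indexes exist, one option outside of NiFi is Splunk's REST management API; a rough sketch (host, port, and credentials are placeholders, and requests is a third-party dependency):

```python
# List Splunk indexes via the REST API so you know what to reference in the
# GetSplunk Query property. Host and credentials below are placeholders.
import requests  # third-party: pip install requests

SPLUNK = "https://splunk.example.com:8089"  # assumed management port
AUTH = ("admin", "changeme")                # placeholder credentials

resp = requests.get(
    f"{SPLUNK}/services/data/indexes",
    params={"output_mode": "json", "count": 0},  # count=0 returns all entries
    auth=AUTH,
    verify=False,  # only acceptable against a lab box with a self-signed cert
)
resp.raise_for_status()
for entry in resp.json()["entry"]:
    print(entry["name"])
```

Hope this helps, Matt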
03-17-2021
06:02 AM
@sambeth NiFi authorization lookups require an exact, case-sensitive match between the resulting authentication user string or associated group string (the NiFi user group providers configured in authorizers.xml are responsible for determining the associations between user strings and group strings within NiFi) and the user/group strings the policies are assigned to. So if the user identity string that results from the authentication process is "CN=John, OU=Doe", then that exact case-sensitive string must be what the policies are authorized against.

NiFi does provide the ability to use Java regular expressions post-authentication to manipulate the authentication string before it is passed on for authorization. These identity mapping pattern, value, and transform properties can be added to the nifi.properties file: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#identity-mapping-properties

For example:
nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?)$
nifi.security.identity.mapping.value.dn=$1
nifi.security.identity.mapping.transform.dn=LOWER

Now if authentication resulted in the string "CN=John, OU=Doe", the above regex would match and the resulting user client string would be "john" (capture group 1 is the value used, transformed to all lowercase).

You can create as many of these mapping pattern sets of properties as you like, as long as each property name is unique in its last field:
nifi.security.identity.mapping.pattern.dn2=
nifi.security.identity.mapping.pattern.kerb=
nifi.security.identity.mapping.pattern.kerb2=
nifi.security.identity.mapping.pattern.username=
etc. IMPORTANT note: these "patterns" are evaluated against every authenticated string (this includes mutual TLS authentications, such as those between NiFi nodes using the NiFi keystore) in alphanumeric order of the property names. The first Java regular expression to match has its value applied and transformed, so naming your properties so that the most specific regexes are evaluated before the most generic ones is very important.
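If you want to test a mapping before touching nifi.properties, here is a small Python sketch that reproduces the pattern/value/transform steps for the example above (it mimics the behavior for experimentation; it is not NiFi's own code):

```python
# Reproduce NiFi's identity-mapping steps (pattern -> value -> transform) for
# the example DN above, so regexes can be tested before editing nifi.properties.
import re

pattern = r"^CN=(.*?), OU=(.*?)$"  # nifi.security.identity.mapping.pattern.dn
value = r"\1"                      # NiFi's $1 becomes Python's \1 backreference
transform = str.lower              # nifi.security.identity.mapping.transform.dn=LOWER

authenticated = "CN=John, OU=Doe"

match = re.match(pattern, authenticated)
if match:
    print(transform(match.expand(value)))  # -> john
else:
    # No pattern matched: the string would pass through unmapped.
    print(authenticated)
```

Hope this helps you, Matt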
03-15-2021
04:04 PM
2 Kudos
@sambeth The hash (#) character is reserved as a delimiter to separate the URI of an object from a fragment identifier, and Registry uses a number of different fragment identifiers. The fragment identifier represents a part of, fragment of, or sub-function within an object. It follows the "/#/" in the URL and can represent fragments in text documents by line and character range, in graphics by coordinates, or in structured documents using ladders. An example is the "grid-list" of flows displayed when you access the NiFi UI. No, you cannot remove the # from the URL. Are you encountering an issue? Hope this helps, Matt
03-15-2021
03:35 PM
@alexwillmer NiFi does not support using wildcards in all scenarios. Access decisions include authorization against specific endpoints, and decisions that do not honor a wildcard may show up as buttons remaining greyed out. So if a NiFi Resource Identifier is not giving you the expected result with a wildcard, try setting the policy explicitly and see if the desired outcome is observed. The following article provides insight into the expected access provided by each NiFi Resource Identifier: https://community.cloudera.com/t5/Community-Articles/NiFi-Ranger-based-policy-descriptions/ta-p/246586 NiFi actually downloads the policy definitions from Ranger, and all authorizations are done against the last downloaded set of policies (NiFi runs a background thread to check for updated policy definitions from Ranger); NiFi does not send a request to Ranger itself to verify each authorization.
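For illustration only, the download-and-cache pattern described above looks roughly like this; a generic sketch with invented names, not NiFi's actual Ranger plugin code:

```python
# Generic sketch of poll-and-cache authorization: policies are fetched
# periodically in the background, and every access check runs locally against
# the cached copy, with no per-request call to the policy server.
import threading
import time

class CachedAuthorizer:
    def __init__(self, fetch_policies, poll_interval_sec=30.0):
        self._fetch = fetch_policies       # callable returning {resource: set(users)}
        self._policies = fetch_policies()  # initial download
        self._lock = threading.Lock()
        poller = threading.Thread(target=self._poll, args=(poll_interval_sec,), daemon=True)
        poller.start()

    def _poll(self, interval):
        while True:
            time.sleep(interval)
            fresh = self._fetch()          # background refresh from the server
            with self._lock:
                self._policies = fresh

    def is_authorized(self, user, resource):
        with self._lock:                   # purely local decision
            return user in self._policies.get(resource, set())
```

Hope this helps, Matt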
03-09-2021
10:01 AM
@nishantgupta101 There is no reason you could not write your own custom script that connects to an FTPS endpoint to retrieve a file, which can then be invoked via the ExecuteStreamCommand processor. There are also other script-based processors that you can use, such as ExecuteScript.
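As a starting point, a minimal sketch of such a script using Python's standard ftplib; the host, credentials, and remote path are placeholders, and ExecuteStreamCommand would capture stdout as the FlowFile content:

```python
# Fetch one file over FTPS (explicit TLS) and stream it to stdout so that
# ExecuteStreamCommand can capture it as FlowFile content.
# Host, credentials, and the remote path below are placeholders.
import sys
from ftplib import FTP_TLS

HOST = "ftps.example.com"
USER, PASSWD = "user", "secret"
REMOTE_FILE = "/inbound/data.csv"

ftps = FTP_TLS(HOST)
ftps.login(USER, PASSWD)
ftps.prot_p()  # upgrade the data channel to TLS as well
ftps.retrbinary(f"RETR {REMOTE_FILE}", sys.stdout.buffer.write)
ftps.quit()
```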