Member since: 07-30-2019
Posts: 3427
Kudos Received: 1632
Solutions: 1011
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 85 | 01-27-2026 12:46 PM |
| | 494 | 01-13-2026 11:14 AM |
| | 1032 | 01-09-2026 06:58 AM |
| | 921 | 12-17-2025 05:55 AM |
| | 982 | 12-15-2025 01:29 PM |
10-24-2017
03:38 PM
@Ravi Papisetti The specific mismatch is between the authorizations.xml on the disconnected node and the authorizations in use by the cluster coordinator. The primary node plays no role in this process.

You say that this happens often? Is the same reason given every time (is it always an authorizations mismatch)? The authorizations.xml file gets updated any time an access policy is added, removed, or modified. For some reason the authorizations.xml file is not being updated on this node.

Verify proper ownership and permissions on the users.xml, authorizations.xml, and flow.xml.gz files and their containing directories. The user that owns the NiFi process must be able to read and write these files.

If ownership is not an issue, you will want to check your nifi-user.log for any issues when replication requests are being made. Replication occurs when a change is made while logged in to any cluster node: the change must be replicated to all nodes, and an authentication/authorization issue may be preventing this node from updating.

Thanks, Matt
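A minimal sketch of that ownership/permission check, assuming a default install under /opt/nifi and a service user named "nifi" (the path and user name are assumptions, not from the original post; adjust to your environment):

```bash
# Confirm the NiFi service user can read and write the key config files
# and their containing directory:
ls -ld /opt/nifi/conf
ls -l /opt/nifi/conf/users.xml /opt/nifi/conf/authorizations.xml /opt/nifi/conf/flow.xml.gz

# If ownership is wrong, reassign the files to the NiFi service user:
sudo chown nifi:nifi /opt/nifi/conf/users.xml /opt/nifi/conf/authorizations.xml /opt/nifi/conf/flow.xml.gz
```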
10-24-2017
01:41 PM
@basant sinha HCC Tip: Don't respond to an answer via another answer. Add comments to the existing answer unless you are truly starting a new answer.

Is the directory from which ListFile is listing files mounted on all your nodes, or does it exist on just one node in your cluster? Keep in mind when using "primary node" only scheduling that the primary node can change at any time.

If the directory exists on only one node, you have a single point of failure in your cluster should that NiFi node go down. To avoid this single point of failure:
1. You could use NiFi to execute your script on the primary node every 15 minutes, assuming the directory the script writes to is not mounted across all nodes. Then you could have ListFile running on all nodes all the time. Of course, only the ListFile on one given node will have data to ingest at any given time, since your script will only be executed on the currently elected primary node.
2. Switch to using the ListSFTP and FetchSFTP processors. That way, no matter which node is the current primary node, it can still connect over SFTP and list data. The ListSFTP processor maintains cluster-wide state in ZooKeeper, so when the primary node changes, the listing does not start over from the beginning.

The RPG is very commonly used to redistribute FlowFiles within the same NiFi cluster. Check out this HCC article that covers how load-balancing occurs with a RPG: https://community.hortonworks.com/content/kbentry/109629/how-to-achieve-better-load-balancing-using-nifis-s.html

Thank you, Matt
10-23-2017
04:06 PM
2 Kudos
@Matt Burgess @Anders Boje NIFI-3867 is included in the HDF 3.0.0 baseline: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.0/bk_release-notes/content/ch_hdf_relnotes.html While HDF 3.0.0 is based off Apache NiFi 1.2.0, it included many additional fixes/enhancements that later went into Apache NiFi 1.3.0. The release notes above show all additional Apache Jiras that were included on top of Apache NiFi 1.2.0 to make up the HDF 3.0.0 release. Thanks, Matt
10-23-2017
02:41 PM
@basant sinha I am not sure I am completely following your use case. With a NiFi cluster configuration it does not matter which node's UI you access; the canvas you are looking at is what is running on all nodes in your cluster. The node(s) currently elected as cluster coordinator and/or primary node may change at any time.

A Remote Process Group (RPG) is used to send or retrieve FlowFiles from another NiFi instance/cluster. It is not used to list files on a system, so I am not following you there.

When it comes to configuring a RPG, not all fields are required:
- You must provide the URL of the target NiFi instance/cluster (with a cluster, this URL can be any one of the nodes).
- You must choose either "RAW" (default) or "HTTP" as your desired Transport Protocol. No matter which is chosen, the RPG will connect to the target URL over HTTP to retrieve Site-To-Site (S2S) details about the target instance/cluster (number of nodes if a cluster, available remote input/output ports, etc.).

When it comes to the actual data transmission:
- If configured for "RAW" (which uses a dedicated S2S port set via the nifi.remote.input.socket.port property in nifi.properties on the target NiFi), data will be sent over that port.
- If configured for "HTTP", data will be transmitted over the same port used in the URL for the target NiFi's UI. (This requires that the nifi.remote.input.http.enabled property in the nifi.properties file is set to "true".)

The other properties are optional:
- Configure the Proxy properties if an external proxy server sits between your NiFi and the target NiFi, preventing any direct connection over the ports configured above (default: unset). Since the NiFi RPG sends data directly to each target node (if the target is a cluster), none of the NiFi nodes themselves act as a proxy in this process.
- Configure the Local Network Interface if your NiFi nodes have multiple network interfaces and you want to force your RPG to use a specific one (default: unset).

Additional documentation resources from Apache:
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#site_to_site_properties

If you find this answer addresses your question, please take a moment to click "Accept" below the answer.

Thanks, Matt
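For reference, a quick way to check those two site-to-site properties on the target NiFi; the property names come from the answer above, while the install path and example port value are assumptions:

```bash
# Inspect the site-to-site settings in the target NiFi's nifi.properties:
grep -E 'nifi\.remote\.input\.(socket\.port|http\.enabled)' /opt/nifi/conf/nifi.properties
# Expected entries look like (the port value is only an example):
#   nifi.remote.input.socket.port=10443   # used by RAW transport
#   nifi.remote.input.http.enabled=true   # required for HTTP transport
```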
10-23-2017
12:28 PM
@Bilel Boubakri NiFi only reads the nifi.properties file on startup, so edits to this file currently cannot be applied during runtime; a restart is going to be required. The only configuration file that NiFi regularly re-reads during runtime is the logback.xml file. Thanks, Matt
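As a small illustration (the install path is an assumption), applying a nifi.properties change means bouncing the service:

```bash
# nifi.properties is read only at startup, so restart after editing it:
/opt/nifi/bin/nifi.sh restart
# logback.xml, by contrast, is re-read while NiFi is running; no restart needed.
```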
10-20-2017
07:53 PM
@Shu @dhieru singh By default there is no guaranteed order in which FlowFiles are pulled from the queue feeding any given processor. This is because NiFi favors performance over order. If you want to enforce some sort of order in which FlowFiles are pulled from an inbound queue, you must add a "Prioritizer" to the inbound connection. By default, no prioritizers are added. To apply a prioritizer, simply drag the desired prioritizer(s) to the "Selected Prioritizers" box. Regardless of the strategy used in your DistributeLoad processor (round robin or next available), there will not be a continuous order to the FlowFiles queued to either MergeContent processor. Thanks, Matt
10-20-2017
07:39 PM
@dhieru singh It may be helpful to understand your entire use case here. There is no guaranteed order in which FlowFiles are merged, regardless of whether one MergeContent or multiple MergeContent processors are used. With your setup, the DistributeLoad processor will round-robin FlowFiles from its incoming queue to its two outbound connections feeding your individual MergeContent processors. Each of those MergeContent processors will generate its own resulting merged FlowFile. One MergeContent processor with 2 concurrent tasks will perform the same as 2 MergeContent processors with 1 concurrent task each. If your goal here is to control heap usage by your MergeContent processors, you may want to use two MergeContent processors in series rather than in parallel. Thank you, Matt
10-20-2017
04:34 PM
2 Kudos
@Bilel Boubakri The same concept applies for sending from NiFi to MiNiFi. The RPG can be used to push FlowFiles (as shown in the above screenshots), but it can also be used to pull FlowFiles from a Remote Output Port. Thanks, Matt
10-20-2017
12:10 PM
@Gerd Koenig I edited my response to be more clear. While Ranger is supported, the use of Ranger Groups is not. Thanks, Matt
10-19-2017
09:58 PM
@dhieru singh FlowFiles generated by ListenUDP are placed on the outbound connection. One of the easiest ways to see the sizes of those FlowFiles is to right-click on that connection (while it has queued data) and select "list queue" from the context menu that is displayed. A new UI will open that lists all FlowFiles queued on that connection along with their details. Matt