Member since
07-30-2019
3471
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 148 | 06-03-2026 06:06 PM |
| | 459 | 05-06-2026 09:16 AM |
| | 826 | 05-04-2026 05:20 AM |
| | 495 | 05-01-2026 10:15 AM |
| | 621 | 03-23-2026 05:44 AM |
02-07-2018
02:52 PM
@Felix Albani Thank you for your feedback... I have made the correction.
02-07-2018
02:51 PM
1 Kudo
@Arun A K Once NiFi is secured, access to any resource in NiFi requires authentication and authorization. Securing Ranger itself is not required, but it is highly recommended.

When you add a new policy in Ranger, you need to provide the exact "NiFi Resource Identifier". If the Ranger user is able to communicate with NiFi securely, it can retrieve the list of resources and the user can simply select from that list. Otherwise, no list will be returned and you must know the exact string to enter.

So you do not need to SSL-enable Ranger to accomplish this. NiFi can still communicate with a non-secured Ranger to retrieve authorizations. But you can configure Ranger to talk to a secured NiFi to obtain the resource listing. The keystore and truststore configured here in Ranger will be used to connect to the NiFi resources REST API endpoint to retrieve that resource listing. The keystore and truststore must be owned by the ranger user. The PrivateKeyEntry in the keystore must be trusted by the target NiFi truststore, and the truststore used here must be able to trust the server certificate returned by the target NiFi instance.

Thanks, Matt
02-06-2018
01:35 PM
@John T First, let's make sure we are on the same page with terminology.

FlowFile --> A FlowFile is something very specific to NiFi. Each "FlowFile" consists of two parts: the FlowFile attributes/metadata and the FlowFile content. In most cases, processors do not send "FlowFiles"; they only send the FlowFile's content.

The PostHTTP processor has an optional property ("Send as FlowFile") which allows it to send a complete FlowFile to a ListenHTTP processor running on another NiFi instance. NiFi's Site-To-Site capability (which uses Remote Process Groups and remote input/output ports) also sends "FlowFiles" between NiFi instances. The only other method to send a "FlowFile" between NiFi instances is to use the MergeContent processor configured with a "Merge Format" that packages up the complete FlowFile (FlowFile Stream or FlowFile tar) in the content. This packaged FlowFile can then be sent to another NiFi, where the UnpackContent processor can be used to separate the original FlowFile content from the FlowFile attributes (it loads the original content into the content repo and the original attributes into the FlowFile repo).

Simply increasing the batch size to a large value here will not guarantee that the original content is re-created in the destination FlowFile. Since content is streaming into the socket buffer and NiFi's ListenTCP processor simply calls the channel reader to ingest from the buffer line by line, you may end up with lines from one source's content in the same FlowFile as lines of content from a different source file.

For a little more info on the NiFi ListenTCP based processors: https://community.hortonworks.com/articles/30424/optimizing-performance-of-apache-nifis-network-lis.html

If your end goal is to send complete FlowFile content, consider using a different transfer method from PutTCP and ListenTCP. The ListenTCP processor is designed in such a way that it assumes each line being read from the socket buffer is its own incoming message.

Thank you, Matt
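To illustrate why a large batch size does not restore the original content, here is a toy Python model (not NiFi code). It assumes two senders whose lines happen to arrive in the shared buffer in round-robin order; real arrival order depends on timing. The function names `interleave` and `batch` are made up for this sketch.

```python
from itertools import zip_longest
from typing import List


def interleave(*sources: List[str]) -> List[str]:
    """Toy socket buffer: take one line from each sender in turn (assumed order)."""
    buffer = []
    for group in zip_longest(*sources):
        buffer.extend(line for line in group if line is not None)
    return buffer


def batch(lines: List[str], batch_size: int) -> List[List[str]]:
    """Group buffered lines into 'FlowFiles' of at most batch_size lines each."""
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


file_a = ["A1", "A2", "A3"]  # lines of content from one source file
file_b = ["B1", "B2", "B3"]  # lines of content from a different source file

# Even with a very large batch size, the single resulting "FlowFile"
# contains lines from both sources mixed together.
flowfiles = batch(interleave(file_a, file_b), batch_size=1000)
print(flowfiles)
```

In this model the one output "FlowFile" holds lines from both senders, so neither original file is reproduced on the receiving side.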
02-05-2018
05:06 PM
1 Kudo
@John T The GenerateFlowFile processor is capable of creating multi-line content in a FlowFile. Even after passing that content through the Base64EncodeContent processor, the content will remain multi-line. The PutTCP processor then pushes the multi-line content into the internal socket buffer of the ListenTCP processor. The ListenTCP processor reads those lines and places them in new FlowFiles based on the configured batch size. In your case you have a batch size of only 1, so each created FlowFile will contain only a single line. This would explain why you see so many little FlowFiles being generated on your receiving side.

If you are just trying to send a FlowFile from one NiFi to another, why not use NiFi Site-To-Site or the PostHTTP --> ListenHTTP processors? (These two methods can send actual "FlowFiles" between NiFi instances.)

Thank you, Matt

If you found this answer addressed your question, please take a moment to click the "accept" link below.
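As a rough illustration of the splitting described above, here is a small Python model (not NiFi code). It assumes the encoder wraps its output into lines the way Python's `base64.encodebytes` does (76 characters per line), and `listen_tcp_batches` is a hypothetical helper that treats each line as one message.

```python
import base64
from typing import List


def listen_tcp_batches(payload: bytes, batch_size: int) -> List[bytes]:
    """Toy model: treat each newline-delimited line as a message and
    group up to batch_size messages into one 'FlowFile'."""
    lines = [line for line in payload.split(b"\n") if line]
    return [
        b"\n".join(lines[i:i + batch_size])
        for i in range(0, len(lines), batch_size)
    ]


# 200 bytes of content, Base64-encoded with line wrapping.
encoded = base64.encodebytes(bytes(200))

# Batch size 1: every encoded line becomes its own small "FlowFile".
flowfiles = listen_tcp_batches(encoded, batch_size=1)
print(len(flowfiles))
```

With a batch size of 1 the model produces one "FlowFile" per encoded line, which matches the many small FlowFiles seen on the receiving side.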
01-23-2018
02:28 PM
@Shashwat Gaur The overall throughput of NiFi is not limited in any way at the NiFi software level. In most cases throughput is limited by CPU, disk I/O, memory, and/or network performance. I would check whether any of these are saturated.

It is important that installation best practices are followed to maximize your throughput. At a minimum, having the following located on separate physical disks (disks should be set up as RAIDs to protect your data) will help:
- Content repository(s)
- FlowFile repository
- Provenance repository(s)
- NiFi logging directory

When it comes to controlling throughput in your dataflow, look for bottlenecks in your dataflow and check that you have optimized your processor components for concurrent tasks and run schedules.

If your CPU is not saturated, consider increasing the number of configured threads you are allowing NiFi to hand out to its processor components in the "controller settings" (found under the hamburger menu in the upper right corner of the NiFi UI). Change the value for "Max Timer Driven Thread Count". A good starting place is 2 - 4 times the number of cores on a single NiFi instance (all settings are per node in a cluster). There is also a setting for "Max Event Driven Thread Count", which should be left unchanged. These event driven threads are experimental and not used by any NiFi components by default.

If you find a lot of garbage collection is going on or you are hitting OutOfMemory (heap) exceptions, you may need to increase your heap allocation in the NiFi bootstrap.conf file. You may also need to make dataflow design changes to reduce the heap footprint of your flow.

Thank you, Matt
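The "2 - 4 times the number of cores" starting point can be worked out with a few lines of Python. This is only a sketch of the arithmetic; the function name `timer_driven_thread_range` is made up here, and the result still has to be entered by hand in the NiFi controller settings on a node whose core count matches.

```python
import os
from typing import Tuple


def timer_driven_thread_range(cores: int) -> Tuple[int, int]:
    """Return the (low, high) starting range: 2x to 4x the core count."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return 2 * cores, 4 * cores


# os.cpu_count() reports the logical cores of the machine running this script.
cores = os.cpu_count() or 1
low, high = timer_driven_thread_range(cores)
print(f"{cores} cores: try a Max Timer Driven Thread Count between {low} and {high}")
```

For example, a node with 8 cores gives a starting range of 16 to 32 threads.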
01-19-2018
02:35 PM
2 Kudos
@Andrew Twigg Make sure that your keystore and certs meet the following:
- The keystore file used on each server contains only a single PrivateKeyEntry.
- The certificate in the keystore has an extended key usage that includes both client auth and server auth.

Thank you, Matt
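One way to check both points is to look at the verbose listing of the keystore (for example from `keytool -list -v`). The Python sketch below scans such a listing as plain text. The sample listing is hand-written for illustration, the parsing assumes the usual "Entry type:" and "ExtendedKeyUsages [ ... ]" layout, and `check_keystore_listing` is a hypothetical helper, so treat it as a starting point only.

```python
import re
from typing import Dict


def check_keystore_listing(listing: str) -> Dict[str, bool]:
    """Check a verbose keystore listing against the two requirements."""
    private_keys = listing.count("Entry type: PrivateKeyEntry")
    # Only the first ExtendedKeyUsages block is inspected (assumed to be
    # the server's own certificate, listed first in the chain).
    eku_blocks = re.findall(r"ExtendedKeyUsages \[(.*?)\]", listing, flags=re.DOTALL)
    usages = set(eku_blocks[0].split()) if eku_blocks else set()
    return {
        "single_private_key": private_keys == 1,
        "client_and_server_auth": {"clientAuth", "serverAuth"} <= usages,
    }


# Hand-written sample (alias and owner are placeholders, not real output).
SAMPLE = """\
Alias name: nifi-key
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=nifi1.example.com, OU=NIFI
#1: ObjectId: 2.5.29.37 Criticality=false
ExtendedKeyUsages [
  clientAuth
  serverAuth
]
"""

result = check_keystore_listing(SAMPLE)
print(result)
```

If either value comes back `False`, the keystore does not meet the requirements listed above.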
01-15-2018
09:34 PM
1 Kudo
@dhieru singh NiFi offers a "Summary" UI which has a Connections tab you can select. Once selected, you can sort on the "Queue / Size threshold" column by clicking on the word "Queue". This will move the connections with the largest threshold percentage to the top of the list.

100% for Queue threshold indicates that object threshold back pressure is being applied on that connection. 100% for Size threshold indicates that queue size back pressure is being applied on that connection.

Clicking on the arrow to the far right of the row will take you directly to that connection on your canvas, no matter which process group it exists within.

Thank you, Matt

If you found this answer to be helpful in addressing your question, please take a moment to click the accept link below.
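To make the two percentages concrete, here is a small Python sketch of how such values could be derived and sorted. It is not taken from NiFi's code; the `Connection` type, the connection names, and the threshold values used (10,000 objects and 1 GB) are assumptions for the example.

```python
from typing import List, NamedTuple, Tuple


class Connection(NamedTuple):
    name: str
    queued_count: int
    queued_bytes: int
    object_threshold: int = 10_000   # assumed back pressure object threshold
    size_threshold: int = 1024 ** 3  # assumed back pressure size threshold (1 GB)


def percent_full(conn: Connection) -> Tuple[int, int]:
    """Return (queue %, size %) of the configured thresholds, capped at 100."""
    queue_pct = min(100, conn.queued_count * 100 // conn.object_threshold)
    size_pct = min(100, conn.queued_bytes * 100 // conn.size_threshold)
    return queue_pct, size_pct


def sort_by_back_pressure(conns: List[Connection]) -> List[Connection]:
    """Put the connections closest to a threshold at the top of the list."""
    return sorted(conns, key=lambda c: max(percent_full(c)), reverse=True)


connections = [
    Connection("quiet", queued_count=100, queued_bytes=0),
    Connection("big-files", queued_count=2_500, queued_bytes=512 * 1024 ** 2),
    Connection("many-small", queued_count=10_000, queued_bytes=5 * 1024 ** 2),
]

for conn in sort_by_back_pressure(connections):
    print(conn.name, percent_full(conn))
```

A connection showing 100 in the first position has hit its object threshold, and 100 in the second position means it has hit its size threshold.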
11-22-2017
04:10 PM
@Pratik Ghatak The NiFi instance/cluster with the Remote Process Group (RPG) is acting as the client, and the target NiFi instance/cluster is acting as the server in this Site-To-Site (S2S) connection. The target NiFi configuration you shared shows that you have S2S set up to support both the RAW and HTTP transport protocols. It also appears from your screenshots that the initial connection between your two NiFis is working correctly.

The error you are seeing indicates that you have not added any remote input ports on the target NiFi's canvas to receive FlowFiles from the source NiFi. On the target NiFi, you will need to add one or more "input ports". Input ports added at the root/top level process group are considered "remote input ports" and can be used to receive data over S2S.

After you add these "remote input ports" to your target NiFi's canvas, you can right click on your source NiFi RPG and select "refresh" from the context menu that appears (or you can just wait for the next auto-refresh at 30 seconds). Now you should be able to see those remote input ports when you drag a connection to the RPG.

Thank you, Matt

If you find this information has addressed your question/issue, please take a moment to click "Accept" beneath the answer.
10-24-2017
01:41 PM
@basant sinha * HCC Tip: Don't respond to an answer via another answer. Add comments to the existing answer unless you are truly starting a new answer.

Is the directory from which ListFile is listing files mounted on all your nodes, or does it exist on just one node in your cluster? Keep in mind when using "primary node" only scheduling that the primary node can change at any time.
--------
If it exists on only one node, you have a single point of failure in your cluster should that NiFi node go down. To avoid this single point of failure:
1. You could use NiFi to execute your script on "primary node" every 15 minutes, assuming the directory the script is writing to is not mounted across all nodes. Then you could have ListFile running on all nodes all the time. Of course, only the ListFile on one given node will have data to ingest at any given time, since your script will only be executed on the currently elected primary node.
2. Switch to using the ListSFTP and FetchSFTP processors. That way, no matter which node is the current primary node, it can still connect over SFTP and list data. The ListSFTP processor maintains cluster-wide state in ZooKeeper, so when the primary node changes it does not start over from the beginning.
-------
The RPG is very commonly used to redistribute FlowFiles within the same NiFi cluster. Check out this HCC article that covers how load-balancing occurs with an RPG: https://community.hortonworks.com/content/kbentry/109629/how-to-achieve-better-load-balancing-using-nifis-s.html

Thank you, Matt
10-23-2017
02:41 PM
@basant sinha I am not sure I am completely following your use case.

With a NiFi cluster configuration it does not matter which node's UI you access; the canvas you are looking at is what is running on all nodes in your cluster. The node(s) that are the currently elected cluster coordinator and/or primary node may change at any time.

A Remote Process Group (RPG) is used to send FlowFiles to or retrieve FlowFiles from another NiFi instance/cluster. It is not used to list files on a system, so I am not following you there.

When it comes to configuring an RPG, not all fields are required.
- You must provide the URL of the target NiFi instance/cluster (with a cluster, this URL can be any one of the nodes).
- You must choose either "RAW" (default) or "HTTP" as your desired Transport Protocol. No matter which is chosen, the RPG will connect to the target URL over HTTP to retrieve Site-To-Site (S2S) details about the target instance/cluster (number of nodes if a cluster, available remote input/output ports, etc.). When it comes to the actual data transmission:
----- If configured for "RAW" (which uses a dedicated S2S port configured in the nifi.properties property nifi.remote.input.socket.port on the target NiFi), data will be sent over that port.
----- If configured for "HTTP", data will be transmitted over the same port used in the URL for the target NiFi's UI. (This requires that the nifi.remote.input.http.enabled property in the nifi.properties file is set to "true".)

The other properties are optional:
- Configure the Proxy properties if an external proxy server is sitting between your NiFi and the target NiFi, preventing any direct connection between your NiFi instances over the ports described above (defaults: unset). Since the NiFi RPG will be sending data directly to each target node (if the target is a cluster), none of the NiFi nodes themselves are acting as a proxy in this process.
- Configure the Local Network Interface if your NiFi nodes have multiple network ports and you want to force your RPG to use only a specific interface (default: unset).

Additional documentation resources from Apache:
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#site_to_site_properties

If you find this answer addresses your question, please take a moment to click "Accept" below the answer.

Thanks, Matt
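If it helps, the two nifi.properties settings mentioned above can be checked with a short Python sketch like the one below. It is not part of NiFi; the helper name `s2s_transports`, the sample property values, and the port number 10443 are made up for illustration.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def s2s_transports(properties_text: str) -> List[str]:
    """List the S2S transport protocols the target appears to support."""
    props = parse_properties(properties_text)
    transports = []
    if props.get("nifi.remote.input.socket.port"):
        transports.append("RAW")   # dedicated S2S port is set
    if props.get("nifi.remote.input.http.enabled", "").lower() == "true":
        transports.append("HTTP")  # data shares the UI port
    return transports


# Hypothetical excerpt from the target NiFi's nifi.properties file.
SAMPLE = """\
# Site to Site properties
nifi.remote.input.socket.port=10443
nifi.remote.input.http.enabled=true
"""

print(s2s_transports(SAMPLE))
```

An empty result would suggest that neither transport is configured on the target, in which case the RPG would have nothing to send data over.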