Member since
07-30-2019
3471
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 148 | 06-03-2026 06:06 PM |
| | 459 | 05-06-2026 09:16 AM |
| | 826 | 05-04-2026 05:20 AM |
| | 495 | 05-01-2026 10:15 AM |
| | 621 | 03-23-2026 05:44 AM |
02-20-2018
08:37 PM
Hi @Matt Clarke Thanks for taking the time to answer my question and confirm that the size of the queued content doesn't include any archived content. I've taken a closer look at the status history of the connection (queue) over the past 24 hours and I can see that the nature of the data flowing in varies depending on the time of day. It seems that earlier in the day a large number of small flow files pass into the queue, but they can be processed rapidly. As the day goes on we start to see flow files of a much larger size. I think this is the explanation as to why the number of flow files decreases but the size of the data increases. This is something I hadn't expected until I looked at the status history closely. Thanks again for helping me get to the bottom of this! Richard
01-23-2018
02:28 PM
@Shashwat Gaur The overall throughput of NiFi is not limited in any way at the NiFi software level. In most cases throughput is limited by CPU, disk I/O, memory, and/or network performance, so I would check whether any of these are saturated.

It is important that installation best practices are followed to maximize your throughput. At a minimum, having the following located on separate physical disks (disks should be set up as RAIDs to protect your data) will help:
- Content repository(s)
- FlowFile repository
- Provenance repository(s)
- NiFi logging directory

When it comes to controlling throughput in your dataflow, look for bottlenecks and check that you have optimized your processor components for concurrent tasks and run schedules.

If your CPU is not saturated, consider increasing the number of configured threads you are allowing NiFi to hand out to its processor components in the "controller settings" (found under the hamburger menu in the upper right corner of the NiFi UI). Change the value for "Max Timer Driven Thread Count". A good starting place is 2 - 4 times the number of cores on a single NiFi instance (all settings are per node in a cluster). There is also a setting for "Max Event Driven Thread Count", which should be left unchanged. These event driven threads are experimental and not used by any NiFi components by default.

If you find a lot of garbage collection is going on, or you are hitting OutOfMemory (heap) exceptions, you may need to increase your heap allocation in the NiFi bootstrap.conf file. You may also need to make dataflow design changes to reduce the heap footprint of your flow.

Thank you,
Matt
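As a rough illustration of the "2 - 4 times number of cores" starting point mentioned above, the range can be worked out with a few lines of Python run on the NiFi host. This is only a sketch: the function name `suggested_timer_driven_threads` is hypothetical and not part of NiFi, and the result is a starting value to enter by hand in the controller settings, not something NiFi reads automatically.

```python
import os


def suggested_timer_driven_threads(cores, low=2, high=4):
    """Return a (min, max) starting range of 2x to 4x the core count.

    Hypothetical helper for illustration only; NiFi does not provide it.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return cores * low, cores * high


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 core.
    cores = os.cpu_count() or 1
    minimum, maximum = suggested_timer_driven_threads(cores)
    print(f"{cores} cores: try a Max Timer Driven Thread Count "
          f"between {minimum} and {maximum}")
```

Because the setting is per node, the calculation would be repeated for each node if the nodes in a cluster have different core counts.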
01-15-2018
09:40 PM
@Matt Clarke Thanks a lot, I appreciate your help.
09-03-2018
01:57 PM
After you create multiple input ports in NiFi, when you link your processor to your RPG, it will prompt you to choose which input port to use from the "to input" selection list.
10-25-2017
05:56 AM
@Matt Clarke Thanks for the reply, I appreciate it. In my case the directory from which files will be listed exists on only one node. For now I am trying to implement this with the ListSFTP and FetchSFTP processors and hope it works fine. Thank you once again for your valuable suggestions. Thanks, Basant
09-23-2018
01:47 PM
Matt, in the evaluation I'm doing of NiFi versus StreamSets as a user, I prefer NiFi due to its richer pipeline management and the clustering option in the open source version, among a lot of other things. The question posted by dhieru singh makes enormous sense, given that a lot of critical systems (or even non-critical ones) do this. I mean, if you accidentally stop all process groups at once and then, as soon as you notice, restart them all again (imagine a DevOps guy waking up at 3:00AM 🙂 ), you may also start process groups that you didn't want running. That would be the second accident. It happened to me during the evaluation, where I had built several process groups and some of them were intentionally suspended (OK, good practice says if you don't use it, remove it, but I'm surely not the only one who does this when testing). Anyway, maybe it is a good idea to notify the user when he/she is about to stop the whole thing or, as you mention, add the ability to "lock" the current running state of a given process group. Thank you! Julio
06-06-2019
12:53 AM
Hi @Matt Clarke, Does the table you shared above hold true today as well? The Apache Nifi Crash Course video at https://www.youtube.com/watch?v=fblkgr1PJ0o says the same thing: a cluster should preferably have a single-digit number of nodes, but if really needed you can instead have 2 separate clusters with 10 nodes each and establish a sync between them. All I am trying to understand is whether this still holds true with the latest version, and whether 10 nodes are still enough to handle hundreds of thousands of events per second. Thanks in advance!
09-01-2017
12:22 PM
@Kiem Nguyen I highly recommend starting a new question in Hortonworks community connection for this. Diagnosing what caused your node to disconnect and how to resolve it is a different topic from how to stop a processor with a disconnected node. It would also be helpful to explain what you mean by "overloaded queue" and what makes you feel the size of your queue triggered your node to disconnect. What error did you see in the nifi-app.log on the node that disconnected? Thanks, Matt
08-01-2017
03:08 PM
@Foivos A The banner is a NiFi core feature and is not tied in any way to the dataflows you select or build on your canvas. You are correct that the best approach for identifying which dataflows on a single canvas are designated dev, test, or production is through the use of "labels". In a secure NiFi setup, you can use NiFi's granular multi-tenancy user authorization to control which components a user can interact with and view. If you use labels, you should set a policy allowing all users to view each label component, so that even if they are not authorized to access the labeled components, they will be able to see why via the label text. Thanks, Matt