About MattWho

MattWho · ‎05-20-2021

@Kilynn The following property in the nifi.properties file controls when a swap file is created per connection:

nifi.queue.swap.threshold=20000

This applies per connection, not to all FlowFiles across all connections. A FlowFile swap file always consists of 10000 FlowFiles. So if a connection reaches 20000 queued FlowFiles, a swap file will be created for 10000 of those. If a connection queue reaches 40000, you would have 3 swap files for that connection.

You can control individual connection queues by setting the "Back pressure Object Threshold" on a connection.

Note: Threshold settings are soft limits, and the default object threshold is 10000.

So with these settings there should be very little to no swapping of FlowFiles to disk happening at all. Swap files would only be created if the source processor feeding that connection outputs enough FlowFiles to the connection at one time to trigger a swap file. For example:

- Connection has 9000 queued FlowFiles, so back pressure is not being applied.
- Source processor is thus allowed to execute.
- Source processor upon execution produces 12000 FlowFiles.
- Now the downstream connection has 21000 queued FlowFiles. One swap file is produced and back pressure is enabled until the queue drops back below 10000 queued FlowFiles.

FlowFiles consist of two parts (FlowFile attributes/metadata and FlowFile content). The only portion of a FlowFile held in heap memory is the FlowFile attributes/metadata. FlowFile content is never held in memory (some processors may load content into memory, but only in order to perform their function). FlowFile attributes/metadata are persisted to the FlowFile repository and FlowFile content is written to the content repository. This is important to avoid data loss if NiFi dies or is restarted while data still exists in connection queues.

If you found this helped with your query, please take a moment to log in and click accept on the solutions that helped.

Thank you,
Matt
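
The swap arithmetic and the back pressure example above can be sketched in a few lines of Python. This is only a simplified model of the behavior as described in this post, not NiFi's actual implementation; the function names `swap_files_for` and `back_pressure_applied` are made up for illustration.

```python
SWAP_FILE_SIZE = 10_000  # FlowFiles per swap file, as stated in the post


def swap_files_for(queued, swap_threshold=20_000):
    """Estimate swap files for ONE connection under the post's simplified model.

    Each time the in-memory part of the queue reaches swap_threshold,
    SWAP_FILE_SIZE FlowFiles are moved out into a new swap file.
    """
    swap_files = 0
    in_memory = queued
    while in_memory >= swap_threshold:
        in_memory -= SWAP_FILE_SIZE
        swap_files += 1
    return swap_files


def back_pressure_applied(queued, object_threshold=10_000):
    """Soft limit: only consulted before the source processor is scheduled."""
    return queued >= object_threshold


# Walk through the example from the post.
queued = 9_000
print(back_pressure_applied(queued))  # False, so the source may run
queued += 12_000                      # one execution emits 12000 FlowFiles
print(swap_files_for(queued))         # 1
print(back_pressure_applied(queued))  # True
```

Because the object threshold is only checked before the source processor runs, a single execution can push the queue well past it, which is the only way the swap threshold is reached with these settings.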

leandrolinof · ‎05-17-2021

@MattWho I was able to use the following structure in the flow. Thanks a lot for the help. 😃

MattWho · ‎05-17-2021

@rahul_ars You can use the "UpdateCounter" [1] processor to create counters and update the count on them (up and down). The "Counter Name" property accepts NiFi EL, so you can use attributes on FlowFiles to dynamically update a counter that is unique per FlowFile, even when the FlowFiles traverse the same dataflow path. This will allow you to consolidate within your dataflow(s) to reduce the number of processors needed. Fewer processors can lead to better performance through reduced resource utilization.

The "Counters" UI is found under the NiFi Global menu in the upper right corner of the NiFi UI.

[1] http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.13.2/org.apache.nifi.processors.standard.UpdateCounter/index.html

If you found this helped with your query, please take a moment to log in and click accept on this solution.

Thanks,
Matt
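
The consolidation idea can be illustrated outside NiFi with a small Python sketch. This is not NiFi code: the attribute name `source.system`, the sample FlowFiles, and the helper `update_counter` are all hypothetical. The point is only that a single counting step whose counter name is derived from a FlowFile attribute (as an EL expression such as `${source.system}` would do in the "Counter Name" property) can stand in for one hard-coded counter step per path.

```python
from collections import Counter

# One shared set of counters, keyed by the dynamically resolved counter name.
counters = Counter()


def update_counter(attributes, name_attribute="source.system", delta=1):
    """Resolve the counter name from a FlowFile attribute and apply delta."""
    name = attributes.get(name_attribute, "unknown")
    counters[name] += delta
    return name


# Hypothetical FlowFiles that all pass through the same counting step.
for flowfile in [
    {"source.system": "crm"},
    {"source.system": "erp"},
    {"source.system": "crm"},
]:
    update_counter(flowfile)

# A negative delta counts down, like a negative "Delta" on the processor.
update_counter({"source.system": "erp"}, delta=-1)

print(counters["crm"], counters["erp"])  # 2 0
```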

MattWho · ‎05-17-2021

@techNerd I don't know anything about the Wildfly endpoint service, so I can only assume your URL is correct. Using the IP as you have should not be an issue as long as the NiFi host can reach that network address.

What you are seeing is a timeout, which indicates the endpoint you are trying to reach did not respond to your POST request. There could be numerous reasons for this, such as network issues, incorrect or missing headers on the POST request, a bad endpoint URL, too many concurrent connections to the endpoint at the time of this request, failed authentication, using http for an https endpoint, bad or no client authentication, etc.

I would suggest you monitor the logs on Wildfly when the POST request is made by NiFi to see if:

1. Wildfly acknowledges receiving the request.
2. Wildfly does not throw an exception about the request.

Since the endpoint URL you shared had multipart in it, I assumed that perhaps it only accepts multi-part form data and thus may be expecting the proper header for this type of data. And if it is multi-part form data, you are going to have better success using the PostHTTP processor instead of the InvokeHTTP processor.

Hope this helps you,
Matt
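
For reference, the "proper header" for multi-part form data is a `Content-Type` of `multipart/form-data` that carries the boundary string used in the body. The Python sketch below only builds such a request (it sends nothing) so the header and body layout can be compared against what Wildfly logs. Whether this particular endpoint expects this format is an assumption, and the helper name, field names, and file name are made up for illustration.

```python
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    file_type="application/octet-stream"):
    """Build headers and body for a multipart/form-data POST."""
    boundary = uuid.uuid4().hex
    lines = []
    # Plain form fields: one part each.
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")
        lines.append(str(value).encode())
    # The file part carries its own Content-Type.
    lines.append(f"--{boundary}".encode())
    lines.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'.encode()
    )
    lines.append(f"Content-Type: {file_type}".encode())
    lines.append(b"")
    lines.append(file_bytes)
    # Closing boundary, followed by a final CRLF.
    lines.append(f"--{boundary}--".encode())
    lines.append(b"")
    body = b"\r\n".join(lines)
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_multipart({"description": "test"}, "file", "data.txt", b"hello")
print(headers["Content-Type"])
```

If the request NiFi sends lacks this header, or the boundary in the header does not match the one in the body, a server that expects multi-part data may reject the request or never answer it.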

sangee · ‎05-13-2021

Thanks a lot

MattWho · ‎05-13-2021

@ThangND-2210 Thank you for filing an Apache Jira detailing your observations here: https://issues.apache.org/jira/browse/NIFI-8541

I see that Apache NiFi community committers are looking into your Jira.

Thank you,
Matt

kkau · ‎05-12-2021

Thanks @MattWho. Sure, I will consider your suggestion not to run multiple NiFi instances on the same machine. I tried the variable registry approach as well, but the problem is the same there: we cannot use the EL parameter.

MattWho · ‎05-11-2021

@dieden9 NiFi provides a number of Kafka processors based on the Kafka client they use. The original ConsumeKafka processor (no number) used the old Kafka 0.8 client. The 0.8 client processor does not offer the ability to specify a regex for the topic names. You should be using the Kafka client version processors that match the Kafka server version you are consuming from.

From ConsumeKafka_0_10 [1] on, you have the ability to configure the processor to use "names" or "pattern" for the topic name(s). The pattern is a Java regular expression.

[1] http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-1-0-nar/1.13.2/org.apache.nifi.processors.kafka.pubsub.ConsumeKafka_1_0/index.html

If you found this helped with your query, please take a moment to log in and click accept on this solution.

Thank you,
Matt
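
A quick way to sanity-check a candidate pattern before putting it in the processor is to run it against a list of topic names. The sketch below uses Python's `re` module purely for illustration; the processor itself takes a Java regular expression, and while simple patterns like this one are written the same way in both, more exotic syntax can differ. The topic names and the helper `matching_topics` are hypothetical, and the sketch assumes the pattern has to match the whole topic name.

```python
import re


def matching_topics(pattern, topics):
    """Return the topics whose entire name matches the pattern."""
    compiled = re.compile(pattern)
    return [topic for topic in topics if compiled.fullmatch(topic)]


topics = ["orders.eu", "orders.us", "payments.eu", "old.orders.eu"]
print(matching_topics(r"orders\..*", topics))  # ['orders.eu', 'orders.us']
```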

Allen123 · ‎05-10-2021

Thanks, it helped me a lot.

MattWho · ‎05-10-2021

@Nickanor It would be interesting to see a verbose listing of your NiFi logs directory once it has well exceeded 50 GB of archived log files.

What you have configured will retain 30 hours of log data. For each of those 30 hours you may have one or more incremental log files (each 100 MB, except for the last one in each hour).

On NiFi restart, do you see the following setting cleaning up the archive directory of files older than 30 hours?

<cleanHistoryOnStart>true</cleanHistoryOnStart>

I would inspect the nifi-app.log to see if you encounter any exceptions around logback, or any OutOfMemory (OOM) or "no more files" (file limit) exceptions, that may explain the behavior.

Hope this helps,
Matt
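
One way to produce such a listing in a form that is easy to compare against the 30 hour retention is a short Python script like the sketch below. The function name `list_logs` and the example path are assumptions for illustration; the age is taken from each file's modification time, which is only an approximation of the hour logback associates with an archived file.

```python
import os
import time


def list_logs(log_dir, max_age_hours=30, now=None):
    """List regular files in log_dir with size, age, and an 'expired' flag.

    Returns (rows, total_bytes); rows are sorted oldest first.
    """
    now = time.time() if now is None else now
    rows = []
    for entry in os.scandir(log_dir):
        if not entry.is_file():
            continue
        info = entry.stat()
        age_hours = (now - info.st_mtime) / 3600
        rows.append({
            "name": entry.name,
            "size_bytes": info.st_size,
            "age_hours": age_hours,
            # Older than the configured retention, so cleanup should have removed it.
            "expired": age_hours > max_age_hours,
        })
    rows.sort(key=lambda row: row["age_hours"], reverse=True)
    total_bytes = sum(row["size_bytes"] for row in rows)
    return rows, total_bytes


if __name__ == "__main__":
    # Hypothetical location; point this at your actual NiFi logs directory.
    rows, total_bytes = list_logs("/opt/nifi/logs")
    for row in rows:
        flag = "EXPIRED" if row["expired"] else ""
        print(f'{row["size_bytes"]:>12} {row["age_hours"]:8.1f}h {row["name"]} {flag}')
    print(f"total: {total_bytes / 1024**3:.2f} GiB")
```

Files flagged as expired that survive a restart would suggest the cleanup is not running as configured.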

Last Visited	‎02-05-2026 02:14 PM

Member Since	‎07-30-2019 10:41 AM
Posts	3,436
Kudos received	1628

Cloudera Community

Re: Setting TTL per key when writing to redis

Re: Best Practice for configuring registry flows

Re: Nifi 2.7.2 Start Problem

Re: Error importing NiFi workflow template from ve...

Re: nifi 2.6 registry security scan results

Re: NIFI 1.13.2 Cluster with Randomly Restarting N...

Re: Error Routing to Failure

Re: Nifi Workflow to store count of success and fa...

Re: How can I use NiFi to Invoke a HTTP and send a...

Re: if the flow file contains "HdF5 file is missin...

Re: NiFi open too many library files on centos 7

Re: PrometheusReportingTask with same port and mul...

Re: Consume Kafka topics using wildcard

Re: How use ListSFTP to find particular file when...

Re: NiFi 1.13.2 logs config not working