Unable to identify bottlenecking issue between Nifi versions

Explorer

There has been a NiFi issue since late June, about a month after the client upgraded their NiFi version from 1.12.1 to 1.16.0. The process is essentially a lot of Windows file handling: checking whether a file exists, moving it between folders, unzipping files, then sorting files by a certain prefix into SQL stored procedure calls.
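For anyone unfamiliar with this kind of flow: steps like these are typically implemented with ListFile/FetchFile and PutFile for the file checks and moves, UnpackContent for the unzipping, and RouteOnAttribute for the prefix-based sorting, with each route feeding a PutSQL/ExecuteSQL processor that calls the matching stored procedure. This is only an illustrative sketch, not necessarily how this particular flow is built. The routing step is usually a dynamic property holding an Expression Language rule; the property name and the "INV_" prefix below are made up:

    invoices = ${filename:startsWith('INV_')}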

 

This process begins at midnight daily and is supposed to complete by 7am, but the client's complaint was that many files were going past this deadline and only completing much later. Analysing the available data shows that the number of files processed per hour dropped from about 3000 on version 1.12.1 to about 2000 on 1.16.0. The client ruled out the NiFi version upgrade as a cause because it happened about a month before they raised the complaint. However, we recently upgraded again from 1.16.0 to 1.17.0 and the issue was suddenly resolved, although we have yet to identify what was actually causing the file-processing bottleneck.

 

Among the solutions attempted before the version upgrade were increasing the concurrent task count on some processors from 5 to 10 threads and setting both the minimum and maximum JVM heap to 16GB, but neither worked. Any suggestion on what we can explore further is much appreciated.
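For reference, the heap change was made in conf/bootstrap.conf; a minimal sketch of the relevant entries (the java.arg numbering can differ between installs) looks like this:

    # conf/bootstrap.conf -- JVM heap settings
    java.arg.2=-Xms16g
    java.arg.3=-Xmx16g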

1 REPLY

Super Mentor

@niclyx 

 

You need to be careful with increasing concurrent tasks on components. Doing so can actually decrease performance if not done carefully. All components request threads from the same configured thread pools, so over-subscribing one processor can adversely affect others.
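One quick check is to compare the size of the timer-driven thread pool (global menu > Controller Settings > Maximum Timer Driven Thread Count) against how many threads are actually active. A minimal sketch, assuming an unsecured node on localhost:8080 (a secured install needs HTTPS and a token):

    curl -s http://localhost:8080/nifi-api/flow/status
    # returns JSON such as {"controllerStatus":{"activeThreadCount":12, ...}}

If activeThreadCount is regularly pinned at the pool maximum, the pool itself is the bottleneck rather than any single processor.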

Did you check disk and network I/O during the timeframe of decreased performance?
Did you inspect the NiFi app log for any WARNs or ERRORs related to writing new content to the content_repository?
Did you monitor JVM garbage collection stats (how often collections ran and how long they took)? A rough monitoring sketch follows this list.
Was performance throughput tied to specific processors within the dataflow(s)?
Were the bottleneck points all dealing with interfacing with an external service?
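For the I/O and GC questions above, a minimal sketch of what to watch, assuming a Linux host and a known NiFi JVM pid (use Windows Performance Monitor and JMX/VisualVM equivalents on Windows):

    # GC activity sampled every 5 seconds; frequent or long full GCs (FGC/FGCT columns) are a red flag
    jstat -gcutil <nifi-pid> 5000

    # Disk utilisation sampled every 5 seconds; sustained high %util on the
    # content/flowfile/provenance repository disks points to an I/O bottleneck
    iostat -x 5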

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,

Matt