Created 06-11-2024 05:01 AM
Hi,
I need Cloudera's/the community's help with this because I have been struggling with it for some time. We got new VMs that are going to be dedicated to NiFi. We have been using NiFi 2.0.0 M1 on old physical servers with custom-developed Python extensions and did not see any issues. By the time we moved to the new VM servers (see spec below), M2 had been released and we decided to go with the latest. However, once it was installed and working, we started noticing CPU utilization peaking at almost 99-100%! When I check the processes in Task Manager, I see this utilization is taken by the Java process that is running NiFi:
This happens even without running anything on NiFi. I reverted to the M1 release, thinking it might be an M2 issue since M1 did not have this problem on the old physical servers; however, the problem persisted even when using the same Java and Python versions.
After some troubleshooting and going through a process of elimination, I found that if I don't enable the Python extensions, or don't deploy any of those extensions (while the feature is enabled), the issue is gone! It seems there is something in the Python extension support that is causing the CPU to peak. One thing I noticed when I check Task Manager is that there are too many Python processes even though nothing is running at that point. They don't seem to use much CPU, but having that many of them seems strange:
The more Python extensions I add, the more those processes seem to double in number! On the old server (M1 release) I counted over 35 Python processes running, and I only have about 8 extensions! Nothing was running in NiFi at that time.
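To put numbers on the process counts described above rather than eyeballing Task Manager, a small script along these lines could help. This is only a minimal sketch: it assumes a Windows host where the built-in `tasklist` command is available and that the processes show up under the image names `python.exe` and `java.exe`. The function names `count_by_image` and `snapshot` are made up for illustration and are not part of NiFi.

```python
import csv
import io
import subprocess
from collections import Counter


def count_by_image(tasklist_csv: str) -> Counter:
    """Count processes per image name from `tasklist /FO CSV /NH` output."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if row:  # skip blank lines
            # First CSV column is the image name; lowercase it so
            # "Python.exe" and "python.exe" are counted together.
            counts[row[0].lower()] += 1
    return counts


def snapshot() -> Counter:
    """Run tasklist once and return per-image process counts (Windows only)."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_by_image(out)


if __name__ == "__main__":
    counts = snapshot()
    # Image names below are assumptions; adjust to what Task Manager shows.
    for name in ("python.exe", "java.exe"):
        print(name, counts.get(name, 0))
```

Running it once before deploying an extension and again afterwards gives a before/after count that can be handed to IT.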
I'm going to test this with the M3 release, since there have been some changes and bug fixes related to the Python extensions, to see if the issue persists.
Has anyone run into this issue before? If so, can you please advise? If not, is there at least a way to get to the bottom of this and find out exactly what is happening or what is causing it? Anything that gives our IT team something to work with would be highly appreciated.
The server spec I'm running NiFi on:
Thanks
Created on 06-12-2024 06:43 AM - edited 06-12-2024 06:45 AM
Thanks to @MattWho and @pvillard, I was able to find an internal discussion and a Jira that suggest this is resolved in M3.
https://issues.apache.org/jira/browse/NIFI-12757
@SAMSAL Can you test M3 and report your results?
Created 06-12-2024 07:28 AM
Dear @steven-matison,
Let me say first that you are AWESOME for going as far as you did to help me figure this out. I can't thank you enough.
I think my lesson learned here is that I should not rely only on the release notes to see if a major issue like this has been addressed, and should probably always review the bug fixes in Jira to help me decide whether an upgrade is worth it or not. Having said that, I still believe something as critical as a memory leak and high CPU utilization should have been mentioned as part of the highlights: https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version2.0.0-M3
I really wish I had known this earlier. It would have saved me days of troubleshooting and the stress of being under the watch of IT, because they get a notification every time CPU utilization crosses a certain threshold for some amount of time.
I started testing M3 yesterday and I can confirm that so far there are no issues with CPU utilization. I was about to announce to the team that an upgrade to M3 is needed. One thing I'm not sure of is that I'm still seeing more Python processes in taskmgr than the number of extensions I have deployed:
Not sure if this is normal, or if @MattWho or @pvillard have anything to say about this.
Finally, you also saved me the time of trying to figure out why M3 is working and whether this is a known issue, so that I could post my results to the community in case someone else runs into it. God knows how much time I would have spent on this. Hopefully this post is enough to help others avoid the headache I had to go through.
Thanks
Created 06-12-2024 07:49 AM
Appreciate the positive feedback, but I defer to @MattWho. He had already reached out on your behalf. This is the beauty of the community here: we are all friends working toward common goals. :high five:
To be clear, 2.0 is not officially ready for prime time, so you need to be careful with expectations and vigilant with your own evaluation and testing. You also need to be ready with a process to upgrade to the next version(s) as they come out.
Created 06-12-2024 09:03 AM
Big shout out to @MattWho, who has been incredibly helpful in this community. I have learned a lot from his answers and posts, whether in direct replies to issues I posted or in replies to others. I don't think anyone can match the knowledge and the level of detail he possesses when writing about NiFi.
Created 06-13-2024 07:25 AM
@SAMSAL
Thank you for the kind words. Likewise, the community thrives through members like yourself. Thank you for all your amazing contributions.