Created 07-10-2023 11:00 AM
Hello Experts,
I am using the ConsumeAzureEventHub processor to consume messages in NiFi 1.16.
The 'Concurrent Tasks' field of this processor is disabled, so I am not able to speed up consumption.
What is the reason for this limitation, and is there any workaround for it?
(We usually run Kafka consumers with concurrency matching the number of partitions.)
Thanks,
Mahendra
Created 07-11-2023 05:50 AM
@hegdemahendra
The ConsumeAzureEventHub processor utilizes the Azure SDK library. The Azure SDK manages an internal thread pool for consuming events, outside of the control of the NiFi processor. When the processor is started, its single concurrent task thread is used to give the Azure SDK a handle to the NiFi session factory. From that point on, all thread control pertaining to the consumption of events is outside the bounds of the NiFi framework.
There are unfortunate side effects caused by using the Azure SDK in this processor. Since thread handling is moved out of NiFi framework control, backpressure on the outbound connection(s) of this processor is not honored. The NiFi framework handles backpressure as a soft limit: once the threshold on a connection is met or exceeded, the framework stops scheduling the source component. But the ConsumeAzureEventHub processor only gets "scheduled" once at startup (it hands off the thread to the Azure SDK, which manages the thread pool and consumption), so the processor never needs to be scheduled again; it is effectively always on. Only stopping the processor will stop the underlying Azure SDK from consuming more events. The risk is that if a downstream issue happens, the connection(s) outbound from the ConsumeAzureEventHub processor will continue to grow until you run out of disk space on the FlowFile or content repository disks.
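For illustration only, here is a minimal standalone sketch (not the processor's actual source) of the Azure SDK EventProcessorClient pattern described above. The connection strings, Event Hub name, and checkpoint container are placeholders; the point is that a single start() call hands all consumer threading to the SDK, and only stop() ends consumption:

```java
// Hypothetical standalone sketch of the Azure SDK consume pattern the processor wraps.
// Connection strings, Event Hub name, and container name below are placeholders.
import com.azure.messaging.eventhubs.EventHubClientBuilder;
import com.azure.messaging.eventhubs.EventProcessorClient;
import com.azure.messaging.eventhubs.EventProcessorClientBuilder;
import com.azure.messaging.eventhubs.checkpointstore.blob.BlobCheckpointStore;
import com.azure.storage.blob.BlobContainerAsyncClient;
import com.azure.storage.blob.BlobContainerClientBuilder;

public class EventHubConsumerSketch {
    public static void main(String[] args) throws InterruptedException {
        BlobContainerAsyncClient checkpointContainer = new BlobContainerClientBuilder()
                .connectionString("<storage-connection-string>")   // placeholder
                .containerName("checkpoints")                      // placeholder
                .buildAsyncClient();

        EventProcessorClient processor = new EventProcessorClientBuilder()
                .connectionString("<event-hub-connection-string>", "<event-hub-name>") // placeholders
                .consumerGroup(EventHubClientBuilder.DEFAULT_CONSUMER_GROUP_NAME)
                .checkpointStore(new BlobCheckpointStore(checkpointContainer))
                // These callbacks run on SDK-managed threads, not on the caller's thread.
                .processEvent(eventContext -> {
                    System.out.printf("Partition %s: %s%n",
                            eventContext.getPartitionContext().getPartitionId(),
                            eventContext.getEventData().getBodyAsString());
                    eventContext.updateCheckpoint();
                })
                .processError(errorContext ->
                        System.err.println("Error: " + errorContext.getThrowable()))
                .buildEventProcessorClient();

        // One call to start(): after this, the SDK owns all consumer threading.
        processor.start();
        Thread.sleep(30_000);   // consume for a while
        // Stopping the client is the only way to halt consumption.
        processor.stop();
    }
}
```

Nothing in the processEvent callback is aware of NiFi connection backpressure, which is why a full downstream queue cannot throttle the SDK's consumption.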
The following Jira covers this limitation:
https://issues.apache.org/jira/browse/NIFI-10353
If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.
Thank you,
Matt
Created 07-16-2023 09:37 PM
@MattWho It's very clear now, thank you so much!