Member since
06-28-2024
3
Posts
3
Kudos Received
0
Solutions
07-02-2024
11:54 PM
1 Kudo
Yes, almost the same behavior is observed with the retry strategy set to "Penalize" — only the penalty duration gets added to the total time. For example, with the default penalty duration of 30 secs, 10 incoming flow files, and 1 retry: the 10 flow files are grouped together and the first retry happens at 50 secs. The grouped flow files are then penalized for 30 secs, and after another 50 secs they go into the failure relationship. So in total, with the penalize retry policy, the PublishKafka processor takes (numberOfRetries + 1) * 5 secs * numberOfIncomingFlowFiles + penalty duration to route the files to the failure relationship. If retry is not checked, then behavior similar to yield is observed: 5 * numberOfIncomingFlowFiles secs to route to the failure relationship, as shown in the photos. Penalty and yield settings are at their defaults. The target Kafka version is 3.4.0 and the number of partitions is 1. There are 3 NiFi nodes; the number of Concurrent Tasks on PublishKafkaRecord is 1, but execution is on all nodes, which I think means 1 thread on each of the 3 nodes.
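The timing described above can be modeled with a small sketch (my own illustration, not NiFi code; the 5 s per flow file and 30 s penalty are the defaults observed in the post):

```python
# Rough model of the observed routing delay for PublishKafka with the
# "Penalize" retry strategy: each attempt takes ~5 s per flow file in the
# grouped batch, plus one penalty period between the retry and the final
# attempt that routes the batch to the failure relationship.
def time_to_failure_penalize(num_flowfiles, num_retries,
                             attempt_secs=5, penalty_secs=30):
    """Seconds before a grouped batch reaches the failure relationship."""
    return (num_retries + 1) * attempt_secs * num_flowfiles + penalty_secs

# Example from the post: 10 flow files, 1 retry, default 30 s penalty
# -> 50 s first attempt + 30 s penalty + 50 s retry = 130 s total.
print(time_to_failure_penalize(10, 1))  # 130
```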
07-01-2024
09:32 PM
1 Kudo
@MattWho thanks for the reply 🙂. Yes, I have two different Kafka clusters (i.e. the apk Kafka cluster and the dap Kafka cluster). I handle errors from the apk Kafka cluster, in case it goes down, by sending the failed flow files to a topic on the dap Kafka cluster. While testing the failure case I tried a few other things and found a pattern. If the apk Kafka cluster goes down, NiFi tries to reconnect to it at 5000 ms intervals, as shown in the picture below. One more thing I observed: if I have "x" incoming flow files to the PublishKafka processor, it somehow groups the flow files and then takes 5 * x seconds for them to go to the failure queue. Here is a photo supporting that: on the right-hand side, the PublishKafka processor shows 50 incoming files and Tasks/Time = 1/00:04:10, i.e. 4 mins 10 secs = 250 secs. With a huge number of incoming flow files, the maximum grouped batch size is 500, and the rest of the flow files wait in the incoming queue of the PublishKafka processor. The image below shows Tasks/Time of 41 mins 40 secs = 500 * 5 = 2500 secs for the first batch of 500 flow files; similarly, the second group of 500 flow files takes the same time (2500 secs) to go into failure. Also, if I set the number of retries greater than 0, the interval remains the same (i.e. 5 secs) and the grouping of flow files still happens. If the number of retries is "n", it takes (n + 1) * 5 * x seconds to go into the failure queue (x being the number of incoming flow files).
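The batching pattern above can be sketched as follows (my own illustration of the observed numbers, assuming ~5 s per flow file and a maximum grouped batch of 500, as measured in the screenshots):

```python
# Model of the observed failure-routing time per batch when PublishKafka
# cannot reach the broker: flow files are grouped into batches of at most
# 500, and each batch takes ~5 s per flow file to reach the failure queue.
def yield_batches_timing(total_flowfiles, per_file_secs=5, batch_size=500):
    """Return the routing time in seconds for each successive batch."""
    times = []
    remaining = total_flowfiles
    while remaining > 0:
        batch = min(remaining, batch_size)
        times.append(batch * per_file_secs)
        remaining -= batch
    return times

print(yield_batches_timing(50))    # [250]        -> 4 min 10 s, as observed
print(yield_batches_timing(1000))  # [2500, 2500] -> 41 min 40 s per batch
```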
06-28-2024
04:35 AM
1 Kudo
I want to implement error handling for the PublishKafka processor in Apache NiFi, for the case where Kafka goes down, using a retry mechanism. The retry mechanism needs to be implemented in a separate flow (to remove load from the main flow). NiFi version = 1.18.0. So, in case of failure, the PublishKafka processor should pass the flow file to the failure relationship, which works properly when the count of incoming flow files is low. Example: the initial image shows a few flow files generated for the failure case, and the final image, taken some time later, shows the flow files successfully directed to the failure relationship. As the number of incoming flow files increases, however, PublishKafka is no longer able to route them to the failure relationship (which is the real-life scenario: if Kafka goes down, the PublishKafka processor will apply back pressure to the incoming flow files). Here are my PublishKafka processor configs. I also tried the built-in retry mechanism of the NiFi processor, but the issue is still not resolved: the PublishKafka processor does not forward flow files to the failure relationship in case of error.
Labels:
- Apache Kafka
- Apache NiFi