
How to process failed records in CDC?

Explorer

@Matt Burgess

I have a NiFi job to capture CDC from a MySQL DB. I am facing one issue with my NiFi job in case of any failure.

[Attached image: 76482-screen-shot-2018-05-30-at-124053-pm.png]

For example, suppose there is a failure while inserting a record into the DB using the "PutSQL" processor (see the attached image) because of an issue in the previous step, the "UpdateAttribute" processor. How do I reprocess the failed records into the DB once I have resolved the issue in "UpdateAttribute"? Is there any way to reprocess the records in sequence? To give you the details of the problem I am facing: suppose an argument's sql.args type should have been 12 (a string), but while converting it in "UpdateAttribute" I gave it some other data type, so "PutSQL" throws an error and does not insert the record. Now, if I have corrected the data type in "UpdateAttribute", how can I reprocess the failed records using the same job?
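For context, "PutSQL" reads each statement argument from paired FlowFile attributes: sql.args.N.value holds the value and sql.args.N.type holds a java.sql.Types code (12 is VARCHAR, i.e. string). The values below are illustrative only:

```
sql.args.1.value = hello        # argument value (made up for illustration)
sql.args.1.type  = 12           # java.sql.Types.VARCHAR (string)
sql.args.2.value = 42
sql.args.2.type  = 4            # java.sql.Types.INTEGER
```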

1 ACCEPTED SOLUTION

Master Mentor
@Siddharth Sa

From your image, it appears that you are auto-terminating the failure relationship on your PutSQL processor?

-

Assuming the misconfiguration in your UpdateAttribute processor resulted in the failure of every FlowFile passed to PutSQL, those FlowFiles would all have been routed to the failure relationship of the PutSQL processor. It is rare that a user would auto-terminate a "failure" relationship, as it means data is being deleted. A more typical design is to route "failure" relationships to a dumb processor that is not enabled (like an UpdateAttribute processor, or even a funnel). That would have allowed you to redirect the connection containing the failure relationship back to your fixed UpdateAttribute processor, resulting in all the failed data being reprocessed, as in the sketch below.
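Roughly, the holding pattern looks like this (a sketch only; the disabled processor is just a parking spot for failed FlowFiles):

```
... --> UpdateAttribute --> PutSQL --(success)--> ...
                               |
                           (failure)
                               v
                [disabled processor or funnel]

After the fix: drag the failure connection's destination
from the disabled processor back onto UpdateAttribute, and
the queued FlowFiles re-enter the flow.
```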

-

NiFi does, if archiving is enabled, archive FlowFiles based on configured thresholds. It is possible to perform a provenance search for FlowFiles with a "DROP" event recorded by the PutSQL processor. The drop event would occur for each FlowFile routed to failure and deleted by PutSQL. While not elegant, you may be able to select each failed FlowFile one by one, open its lineage, and replay the FlowFile at the "UpdateAttribute" point in the lineage history. You would control the sequence of processing by the order in which you replay each FlowFile. There is no bulk replay capability.
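That said, the same one-at-a-time replay is exposed over NiFi's REST API, so the sequence can be scripted. This is a rough Python sketch only: the provenance payload shapes vary across NiFi versions (newer releases wrap each search term in an object), and the host and processor UUIDs below are placeholders.

```python
# Rough sketch: script the one-at-a-time replay over NiFi's REST API.
# Payload shapes differ across NiFi versions; host/UUIDs are placeholders.
import time
import requests

NIFI = "http://localhost:8080/nifi-api"                  # placeholder host
PUTSQL_ID = "00000000-0000-0000-0000-000000000001"       # PutSQL UUID (placeholder)
UPDATE_ATTR_ID = "00000000-0000-0000-0000-000000000002"  # UpdateAttribute UUID (placeholder)


def run_query(search_terms):
    """Submit a provenance query, poll until finished, return its events."""
    body = {"provenance": {"request": {"maxResults": 1000,
                                       "searchTerms": search_terms}}}
    prov = requests.post(f"{NIFI}/provenance", json=body).json()
    uri = prov["provenance"]["uri"]
    while not prov["provenance"]["finished"]:
        time.sleep(1)
        prov = requests.get(uri).json()
    requests.delete(uri)  # clean up the finished query
    return prov["provenance"]["results"]["provenanceEvents"]


# 1. DROP events at PutSQL identify the FlowFiles that failed and were deleted.
dropped = run_query({"EventType": "DROP", "ProcessorID": PUTSQL_ID})

# 2. For each failed FlowFile, replay at the UpdateAttribute point in its
#    lineage (replaying the DROP itself would re-enter PutSQL with the old,
#    broken attributes). Oldest first preserves the original sequence.
for drop in sorted(dropped, key=lambda e: e["eventId"]):
    uuid = drop["flowFileUuid"]
    for event in run_query({"FlowFileUUID": uuid, "ProcessorID": UPDATE_ATTR_ID}):
        requests.post(f"{NIFI}/provenance-events/replays",
                      json={"eventId": event["eventId"],
                            "clusterNodeId": event.get("clusterNodeId")})
```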

-

Thank you,

Matt


Explorer

Thanks @Matt Burgess