Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Monitoring NiFi - flowfile failures/success

Rising Star

I am building a monitoring solution for my NiFi workflow (a real-time data lake fed by GoldenGate and Kafka). Currently I store every GoldenGate record (flowfile) received from Kafka in an HDFS directory; at the end of the workflow, if the record is ingested into HBase successfully, the flowfile is deleted from that directory. So I know that the JSON flowfiles left in the HDFS directory are the ones that failed. The problem with finding the reason for a failure is that NiFi's bulletin board shows only the last 5 or 10 minutes of messages (I'm not sure of the exact duration). I've tried storing the bulletin board messages in HDFS using the REST API, but the JSON it returns is very detailed and would need a lot of work before I could use it for monitoring.

Has anyone else built this kind of monitoring? I would also like to know the throughput of the workflow, including the number of records that failed and the number ingested successfully. I know I can get the last-5-minute statistics from the status history, but if anyone has worked on a similar monitoring task, kindly let me know.

1 ACCEPTED SOLUTION

Expert Contributor

We usually use the NiFi content and provenance repositories to troubleshoot failed flowfiles. We keep both at 7 days of retention. You can also replay content to debug.

If you want reporting without calling the API, you can track row counts as flowfile attributes: total rows, successful rows, failed rows, and so on. Depending on the format of your source file, counting can be as simple as running wc -l. Later, convert these attributes to JSON and use MergeContent, a schema registry, and QueryRecord to create an email report.
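
As a rough illustration of the row-count idea, here is a minimal Python sketch of the aggregation step only. It is not NiFi code. The attribute names (rows.total, rows.success, rows.failed) and the sample values are hypothetical, and it assumes the per-flowfile attributes have already been converted to JSON and merged into one list. In a real flow, MergeContent and QueryRecord would do this work.

```python
import json


def count_rows(content: str) -> int:
    """Count newline characters, which is what `wc -l` reports."""
    return content.count("\n")


def summarize(records):
    """Sum the per-flowfile counters into one report dictionary."""
    keys = ("rows.total", "rows.success", "rows.failed")
    summary = {"flowfiles": 0}
    summary.update({key: 0 for key in keys})
    for attrs in records:
        summary["flowfiles"] += 1
        for key in keys:
            # Flowfile attributes are strings, so convert before adding.
            summary[key] += int(attrs.get(key, 0))
    total = summary["rows.total"]
    summary["failure.rate"] = round(summary["rows.failed"] / total, 4) if total else 0.0
    return summary


# Hypothetical attribute sets, one per flowfile, as they might look after merging.
merged = [
    {"filename": "a.json", "rows.total": "100", "rows.success": "98", "rows.failed": "2"},
    {"filename": "b.json", "rows.total": "50", "rows.success": "50", "rows.failed": "0"},
]

report = summarize(merged)
print(json.dumps(report, indent=2))
```

The printed JSON could serve as the body of the email report. Keeping the counters as plain attributes means the report needs no REST API calls.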


2 REPLIES

Rising Star

Thanks. Could you kindly let me know how I can change the retention period of these repositories? In the nifi.properties file I can see these two properties, whose values are lengths of time:

nifi.flow.configuration.archive.max.time=30 days

nifi.content.repository.archive.max.retention.period=12 hours