Created 07-21-2020 04:08 AM
Hi,
Being pretty new to NiFi, I am struggling to extract interface monitoring data.
Assuming a very simple ingest flow like:
HttpListener -> KafkaPublisher
How can I get throughput information, i.e. number of records put (e.g.) every minute to a topic?
There is NiFi Summary -> Processors -> Status History, which might be useful. Can these statistics be accessed programmatically, and if so, how?
Best regards
Jaro
Created on 07-21-2020 09:19 AM - edited 07-21-2020 09:24 AM
The easiest way to grab monitoring data is via the NiFi REST API. Everything in the NiFi UI is done through REST calls, so you can make the same calls programmatically. Please read the NiFi docs; they are linked directly from your running NiFi application and are also available on the web. They are very thorough and have all the information you could want: https://nifi.apache.org/docs/nifi-docs/. If you are not running NiFi 1.11.4, I recommend that you upgrade. This is supported by Cloudera on multiple platforms.
NiFi REST API
https://nifi.apache.org/docs/nifi-docs/rest-api/
There's also an awesome Python wrapper for that REST API: https://pypi.org/project/nipyapi/
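As a rough sketch of the REST route, the snippet below reads the status of one processor and turns its FlowFile count into a per-minute rate. It assumes an unsecured NiFi whose API is reachable at a base URL such as http://localhost:8080/nifi-api, that the processor status endpoint is /flow/processors/{id}/status, and that the response carries processorStatus.aggregateSnapshot.flowFilesIn for the rolling 5-minute window shown in the UI. Please check these names against the REST API docs for your version. The function names are made up for illustration, and the numbers are FlowFiles, not records.

```python
import json
import urllib.request


def fetch_processor_status(base_url, processor_id):
    """Fetch the status entity of one processor as a dict.

    Assumes an unsecured NiFi; a secured instance would also need a
    token or client certificate, which is not handled here.
    """
    url = f"{base_url}/flow/processors/{processor_id}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def flowfiles_per_minute(status_entity, window_minutes=5):
    """Average FlowFiles received per minute over the stats window.

    Assumes the snapshot field 'flowFilesIn' covers the rolling
    5-minute window that the NiFi UI displays.
    """
    snapshot = status_entity["processorStatus"]["aggregateSnapshot"]
    return snapshot["flowFilesIn"] / window_minutes
```

You would call fetch_processor_status with your own base URL and the id of the publishing processor (visible in the processor's configuration dialog), pass the result to flowfiles_per_minute, and poll on whatever schedule suits you.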
Also, in NiFi flow programming, every time you produce data to Kafka you get metadata back in FlowFile attributes. You can push those attributes directly to a Kafka topic if you want.
So after your PublishKafkaRecord_2_0 (1.11.4) processor, take the success relationship, read the attributes holding the number of records and other data, convert them with AttributesToJSON, and push the result to another topic. You may want a MergeRecord in there to aggregate a few of those together.
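If you go the attribute route, the consumer side could look roughly like this. It is only a sketch: it assumes each message on the monitoring topic is the JSON object written by AttributesToJSON, that the record count sits in an attribute named msg.count (which, as far as I know, is what the PublishKafkaRecord processors write on success; verify this in the processor documentation), and that you pair each message with a timestamp in milliseconds, for example the Kafka message timestamp. Reading from Kafka itself is left out, and the function name is hypothetical.

```python
import json
from collections import Counter


def records_per_minute(messages):
    """Sum record counts per minute.

    messages: iterable of (timestamp_ms, json_text) pairs, where
    json_text is the attribute JSON of one published FlowFile.
    Returns a dict mapping minute index (timestamp_ms // 60000)
    to the total number of records published in that minute.
    """
    totals = Counter()
    for timestamp_ms, json_text in messages:
        attrs = json.loads(json_text)
        minute = timestamp_ms // 60_000
        # Attribute values arrive as strings; a missing attribute counts as 0.
        totals[minute] += int(attrs.get("msg.count", 0))
    return dict(totals)
```

Bucketing on the consumer side keeps the NiFi flow simple; the MergeRecord suggestion above would just reduce the number of messages you have to read.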
If you are interested in Kafka metrics, record counts, or monitoring, then you should use Cloudera Streams Messaging Manager. It provides a full web UI, monitoring tool, alerts, and REST API, and everything you need for monitoring every producer, consumer, broker, cluster, topic, message, offset, and Kafka component.
The best way to get NiFi stats is to use the NiFi Reporting Tasks. I like the SQL Reporting Task.
SQL Reporting Tasks are very powerful and use standard SQL queries in the style of SELECT * FROM JVM_METRICS. See my article:
https://www.datainmotion.dev/2020/04/sql-reporting-task-for-cloudera-flow.html
Monitoring Articles
https://www.datainmotion.dev/2019/04/monitoring-number-of-of-flow-files.html
https://www.datainmotion.dev/2019/03/apache-nifi-operations-and-monitoring.html
Other Resources
https://www.datainmotion.dev/2019/10/migrating-apache-flume-flows-to-apache_9.html
https://www.datainmotion.dev/2019/08/using-cloudera-streams-messaging.html
https://dev.to/tspannhw/apache-nifi-and-nifi-registry-administration-3c92
https://dev.to/tspannhw/using-nifi-cli-to-restore-nifi-flows-from-backups-18p9
https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html
https://www.datainmotion.dev/p/links.html
https://www.tutorialspoint.com/apache_nifi/apache_nifi_monitoring.htm