Simple Apache NiFi Operations Dashboard

This is an evolving work in progress. Please get involved; everything is open source. @milind pandit and I are working on a project to build something useful for teams: a way to analyze their flows and current cluster state, start and stop flows, and see everything in a rich, at-a-glance dashboard.

There's a lot of data provided by Apache NiFi and related tools to aggregate, sort, categorize, search and eventually do machine learning analytics on.

There are a lot of tools that come out of the box that solve parts of these problems. Ambari Metrics, Grafana and Log Search provide a ton of data and analysis abilities. You can find all your errors easily in Log Search and see nice graphs of what is going on in Ambari Metrics and Grafana.

92917-logsearchoverview.png

What is cool with Apache NiFi is that it has Site-to-Site reporting tasks for sending all the provenance, analytics, metrics and operational data you need to wherever you want it. That includes sending it back to Apache NiFi! This is Monitoring Driven Development (MDD).


Monitoring Driven Development (MDD)

MDD - https://pierrevillard.com/2018/08/29/monitoring-driven-development-with-nifi-1-7/

92908-mdd.png

92909-mddtasks.png

92910-avrowriter.png

In this little proof of concept, we grab some of these feeds, process them in Apache NiFi, and then store them in Apache Hive 3 tables for analytics. We should probably also push the data to HBase for aggregates and Druid for time series. We will see as this expands.

There are also other data access options including the NiFi REST API and the NiFi Python APIs.

Bootstrap Notifier

Reporting Tasks

  • AmbariReportingTask (global, per process group)
  • MonitorDiskUsage (FlowFile, content, provenance repositories)
  • MonitorMemory

Monitor Disk Usage

MonitorActivity

See:

https://nipyapi.readthedocs.io/en/latest/readme.html

https://community.hortonworks.com/articles/177301/big-data-devops-apache-nifi-flow-versioning-and-au...

These are especially useful for doing things like purging connections.

Purge it!

  • nipyapi.canvas.purge_connection(con_id)
  • nipyapi.canvas.purge_process_group(process_group, stop=False)
  • nipyapi.canvas.delete_process_group(process_group, force=True, refresh=True)
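
As a rough illustration of how the purge calls above might be combined, here is a minimal sketch that purges only connections holding a lot of queued FlowFiles. The helper name `purge_backed_up_connections` and the `min_queued` cutoff are made up for this example. The `canvas` argument is assumed to be the `nipyapi.canvas` module; it is passed in so the helper can be tried against a stub without a running NiFi.

```python
def purge_backed_up_connections(canvas, min_queued=10000, pg_id="root"):
    """Purge every connection under pg_id that holds at least min_queued FlowFiles.

    `canvas` is expected to be the nipyapi.canvas module (or a stand-in
    exposing list_all_connections and purge_connection).
    Returns the ids of the connections that were purged.
    """
    purged = []
    for conn in canvas.list_all_connections(pg_id=pg_id, descendants=True):
        # Queue depth as reported in the connection's aggregate status snapshot.
        queued = conn.status.aggregate_snapshot.flow_files_queued
        if queued >= min_queued:
            canvas.purge_connection(conn.id)
            purged.append(conn.id)
    return purged


# Possible usage against a live instance (the URL is an assumption):
# import nipyapi
# nipyapi.config.nifi_config.host = "http://localhost:8080/nifi-api"
# purge_backed_up_connections(nipyapi.canvas, min_queued=50000)
```

Purging drops queued data for good, so a dry run that only prints the candidate connection ids is probably worth doing first.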

 

92913-nifiopsdashfeeds.png

92914-nificli.png

 

Use Cases

92911-findmemoryhigh.png

92915-mddbackpressure.png

92912-disableaprocessor.png

Example Metrics Data

[ {
  "appid" : "nifi",
  "instanceid" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
  "hostname" : "#.#.hortonworks.com",
  "timestamp" : 1539411679652,
  "loadAverage1min" : 0.93,
  "availableCores" : 16,
  "FlowFilesReceivedLast5Minutes" : 14,
  "BytesReceivedLast5Minutes" : 343779,
  "FlowFilesSentLast5Minutes" : 0,
  "BytesSentLast5Minutes" : 0,
  "FlowFilesQueued" : 59952,
  "BytesQueued" : 294693938,
  "BytesReadLast5Minutes" : 241681,
  "BytesWrittenLast5Minutes" : 398753,
  "ActiveThreads" : 2,
  "TotalTaskDurationSeconds" : 273,
  "TotalTaskDurationNanoSeconds" : 273242860763,
  "jvmuptime" : 224997,
  "jvmheap_used" : 5.15272616E8,
  "jvmheap_usage" : 0.9597700387239456,
  "jvmnon_heap_usage" : -5.1572632E8,
  "jvmthread_statesrunnable" : 11,
  "jvmthread_statesblocked" : 2,
  "jvmthread_statestimed_waiting" : 26,
  "jvmthread_statesterminated" : 0,
  "jvmthread_count" : 242,
  "jvmdaemon_thread_count" : 125,
  "jvmfile_descriptor_usage" : 0.0709,
  "jvmgcruns" : null,
  "jvmgctime" : null
} ]
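
To show how a dashboard might use a metrics record like this one for the "find memory high" use case, here is a small sketch that flags nodes whose `jvmheap_usage` is at or above a cutoff. The function name and the 0.9 default threshold are assumptions for illustration, not part of the project.

```python
import json


def high_heap_nodes(metrics_json, threshold=0.9):
    """Return (hostname, jvmheap_usage) pairs at or above threshold.

    metrics_json is a JSON array of metrics records shaped like the
    example above; records without a heap reading are skipped.
    """
    flagged = []
    for rec in json.loads(metrics_json):
        usage = rec.get("jvmheap_usage")
        if usage is not None and usage >= threshold:
            flagged.append((rec.get("hostname"), usage))
    return flagged


# Trimmed copy of the example record above.
sample = '[{"hostname": "#.#.hortonworks.com", "jvmheap_usage": 0.9597700387239456}]'
print(high_heap_nodes(sample))
```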

Example Status Data

{
  "statusId" : "a63818fe-dbd2-44b8-af53-eaa27fd9ef05",
  "timestampMillis" : "2018-10-18T20:54:38.218Z",
  "timestamp" : "2018-10-18T20:54:38.218Z",
  "actorHostname" : "#.#.hortonworks.com",
  "componentType" : "RootProcessGroup",
  "componentName" : "NiFi Flow",
  "parentId" : null,
  "platform" : "nifi",
  "application" : "NiFi Flow",
  "componentId" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
  "activeThreadCount" : 1,
  "flowFilesReceived" : 1,
  "flowFilesSent" : 0,
  "bytesReceived" : 1661,
  "bytesSent" : 0,
  "queuedCount" : 18,
  "bytesRead" : 0,
  "bytesWritten" : 1661,
  "bytesTransferred" : 16610,
  "flowFilesTransferred" : 10,
  "inputContentSize" : 0,
  "outputContentSize" : 0,
  "queuedContentSize" : 623564,
  "activeRemotePortCount" : null,
  "inactiveRemotePortCount" : null,
  "receivedContentSize" : null,
  "receivedCount" : null,
  "sentContentSize" : null,
  "sentCount" : null,
  "averageLineageDuration" : null,
  "inputBytes" : null,
  "inputCount" : 0,
  "outputBytes" : null,
  "outputCount" : 0,
  "sourceId" : null,
  "sourceName" : null,
  "destinationId" : null,
  "destinationName" : null,
  "maxQueuedBytes" : null,
  "maxQueuedCount" : null,
  "queuedBytes" : null,
  "backPressureBytesThreshold" : null,
  "backPressureObjectThreshold" : null,
  "isBackPressureEnabled" : null,
  "processorType" : null,
  "averageLineageDurationMS" : null,
  "flowFilesRemoved" : null,
  "invocations" : null,
  "processingNanos" : null
}
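
For the back pressure use case, one possible approach is to compare the queued fields against the threshold fields in a status record. The sketch below assumes that connection-level records populate `queuedCount`, `queuedBytes`, `backPressureObjectThreshold` and `backPressureBytesThreshold`; in the root process group example above the thresholds are null, so the function returns `None` for it. The function name is hypothetical.

```python
def back_pressure_ratio(status):
    """Return how full a queue is relative to its back pressure thresholds.

    Takes one status record (a dict) and returns the larger of the
    count-based and byte-based fill ratios, or None when no threshold
    is available (null or zero) in the record.
    """
    pairs = (
        ("queuedCount", "backPressureObjectThreshold"),
        ("queuedBytes", "backPressureBytesThreshold"),
    )
    ratios = []
    for queued_key, threshold_key in pairs:
        queued = status.get(queued_key)
        threshold = status.get(threshold_key)
        if queued is not None and threshold:
            ratios.append(queued / threshold)
    return max(ratios) if ratios else None


# Hypothetical connection record, not taken from the data above.
connection = {
    "queuedCount": 8000,
    "backPressureObjectThreshold": 10000,
    "queuedBytes": 100,
    "backPressureBytesThreshold": 1000,
}
print(back_pressure_ratio(connection))
```

A ratio close to 1.0 would mean the connection is about to apply back pressure.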

 

Example Failure Data

[ {
  "objectId" : "34c3249c-4a42-41ce-b94e-3563409ad55b",
  "platform" : "nifi",
  "project" : null,
  "bulletinId" : 28321,
  "bulletinCategory" : "Log Message",
  "bulletinGroupId" : "0b69ea51-7afb-32dd-a7f4-d82b936b37f9",
  "bulletinGroupName" : "Monitoring",
  "bulletinLevel" : "ERROR",
  "bulletinMessage" : "QueryRecord[id=d0258284-69ae-34f6-97df-fa5c82402ef3] Unable to query StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40: org.apache.nifi.processor.exception.ProcessException: Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40",
  "bulletinNodeAddress" : null,
  "bulletinNodeId" : "91ab706b-5d92-454e-bc7a-6911d155fdca",
  "bulletinSourceId" : "d0258284-69ae-34f6-97df-fa5c82402ef3",
  "bulletinSourceName" : "QueryRecord",
  "bulletinSourceType" : "PROCESSOR",
  "bulletinTimestamp" : "2018-10-18T20:54:39.179Z"
} ]
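
Bulletin records like the one above lend themselves to simple aggregates. As a sketch, the following counts bulletins of a given level per source component using the `bulletinLevel` and `bulletinSourceName` fields; the function name is an assumption for this example.

```python
from collections import Counter


def error_counts_by_source(bulletins, level="ERROR"):
    """Count bulletins of the given level per bulletinSourceName."""
    return Counter(
        b.get("bulletinSourceName")
        for b in bulletins
        if b.get("bulletinLevel") == level
    )


# Trimmed records; only the first mirrors the example above.
bulletins = [
    {"bulletinLevel": "ERROR", "bulletinSourceName": "QueryRecord"},
    {"bulletinLevel": "ERROR", "bulletinSourceName": "QueryRecord"},
    {"bulletinLevel": "WARNING", "bulletinSourceName": "QueryRecord"},
]
print(error_counts_by_source(bulletins))
```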

 

Apache Hive 3 Tables

CREATE EXTERNAL TABLE IF NOT EXISTS failure (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
   LOCATION '/failure';

CREATE EXTERNAL TABLE IF NOT EXISTS bulletin (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
LOCATION '/error';


CREATE EXTERNAL TABLE IF NOT EXISTS memory (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
LOCATION '/memory';


-- backpressure
CREATE EXTERNAL TABLE IF NOT EXISTS status (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
   LOCATION '/status';
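
One thing to watch: the example status record shows `timestampMillis` as an ISO-8601 string, while the tables above declare it as BIGINT. If the column is meant to hold epoch milliseconds, a conversion along these lines could be applied before the data is written. This is a minimal sketch; the helper name is hypothetical, and it assumes timestamps always end in `Z` and include fractional seconds, as in the example.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value):
    """Convert an ISO-8601 UTC timestamp string to epoch milliseconds.

    Numbers are passed through as integers and None stays None, so the
    helper is safe to apply to records that are already numeric.
    """
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return int(value)
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (parsed - _EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2018-10-18T20:54:38.218Z"))
```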








 

