
5.jpg

 

 

Kubernetes has greatly accelerated application deployment. However, true universal log capture with multi-endpoint (downstream) support is still lacking. Apache NiFi Stateless offers a way to bridge the gap between rapid application deployment and InfoSec's desire to continue capturing and monitoring application behavior.

What is NiFi Stateless? 

NiFi Stateless (introduced as NiFi-Fn) is a library for running NiFi flows as stateless functions. It provides delivery guarantees similar to NiFi's, without the need for an on-disk repository, by waiting to confirm receipt of incoming data until it has been written to the destination (source: NIFI-5922).

Try it out

Prerequisites

  • K8s (local or cluster).  In this demonstration, Azure Kubernetes Service is used.
  • Some familiarity with K8s & NiFi

Assets Used

Laying the groundwork

NiFi Stateless will pull an existing flow from NiFi Registry.  The following is a simple NiFi flow designed in NiFi:

2.jpg

 

The TailFile processor tails the application log file /var/log/app.txt. The deployed application writes its log entries to this file:

3.jpg

 

The flow is checked into NiFi Registry. NiFi Stateless uses the NiFi Registry URL, bucket identifier, and flow identifier at run time. More about this soon.

 

4.jpg

 

Time to deploy

With the flow registered in NiFi Registry, the application pod can be deployed. A NiFi Stateless container is deployed in the same Pod as the application (as a sidecar) to capture the log data the application generates. The application itself is a simple dummy application that writes a timestamped log entry to a log file (/var/log/app.txt) every 5 seconds. NiFi Stateless tails this file and ships the events. The events can be shipped virtually anywhere thanks to NiFi's inherent universal log forward compatibility (Kafka, Splunk, ElasticSearch, Mongo, Kinesis, EventHub, S3, ADLS, etc.). All NiFi processors are listed at https://nifi.apache.org/docs.html. For this demonstration, the log events are shipped to a NiFi cluster over Site2Site.
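To make the application side concrete, here is a minimal Python sketch of a dummy log writer of the kind described above. It is only a stand-in under stated assumptions: the function name write_log_entries and the "heartbeat" message text are hypothetical, and the application in the actual deployment may be implemented differently.

```python
import time
from datetime import datetime, timezone
from pathlib import Path


def write_log_entries(log_path, count, interval_seconds=5.0):
    """Append `count` timestamped lines to `log_path`, pausing between writes.

    Hypothetical stand-in for the demo application; not the article's code.
    """
    path = Path(log_path)
    # Create the log directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        stamp = datetime.now(timezone.utc).isoformat()
        # Open in append mode and close after every write so each line is
        # flushed and visible to whatever process is tailing the file.
        with path.open("a", encoding="utf-8") as log_file:
            log_file.write(f"{stamp} heartbeat {i}\n")
        if i < count - 1:
            time.sleep(interval_seconds)
    return count
```

Calling write_log_entries("/var/log/app.txt", count=10) would append ten entries, one every 5 seconds, which is the kind of steady stream a tailing processor can pick up line by line.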

 

Here is the K8s YAML to deploy the Pod (application with NiFi Stateless sidecar): https://github.com/sunileman/AKS-YAMLS/blob/master/nifi-stateless-sidecar.yml
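The linked manifest is not reproduced here. As a rough sketch of the sidecar pattern it describes, the following Python builds a two-container Pod definition in which both containers mount the same volume at /var/log. Several things are assumptions rather than details taken from the linked file: the shared emptyDir volume (a common way to let a sidecar read another container's files), the Pod, container, and volume names, and the image names passed in by the caller. kubectl accepts JSON manifests as well as YAML, so the dumped output could be applied directly.

```python
import json

LOG_DIR = "/var/log"  # directory that holds app.txt in the article


def build_sidecar_pod(app_image, stateless_image, stateless_args):
    """Return a Pod manifest (as a dict) with an app container and a sidecar.

    All names here (app-logs, app-with-nifi-stateless, ...) are hypothetical.
    """
    shared_mount = {"name": "app-logs", "mountPath": LOG_DIR}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-with-nifi-stateless"},
        "spec": {
            "containers": [
                {
                    # The application that writes /var/log/app.txt.
                    "name": "app",
                    "image": app_image,
                    "volumeMounts": [dict(shared_mount)],
                },
                {
                    # The NiFi Stateless sidecar that tails the same file.
                    "name": "nifi-stateless",
                    "image": stateless_image,
                    "args": list(stateless_args),
                    "volumeMounts": [dict(shared_mount)],
                },
            ],
            # Assumed: an emptyDir volume shared by both containers.
            "volumes": [{"name": "app-logs", "emptyDir": {}}],
        },
    }


def to_manifest_json(pod):
    """Serialize the Pod dict as indented JSON."""
    return json.dumps(pod, indent=2)
```

Because both containers mount the same volume at the same path, whatever the application appends to /var/log/app.txt is immediately readable by the sidecar, with no network hop or log agent on the node.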

 

In the linked YAML file, the NiFi Registry URL, bucketId, and flowId need to be updated. These values come from NiFi Registry. NiFi Stateless binds itself at runtime to a specific flow to execute.

 

 

 

    args: ["RunFromRegistry", "Continuous", "--json", "{\"registryUrl\":\"http://nifiregistry-service\",\"bucketId\":\"71efc3ea-fe1d-4307-97ce-589f78be05fb\",\"flowId\":\"c9092508-4deb-45d2-b6a4-b6a4da71db47\"}"]
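The backslash-escaped JSON in that args line is easy to get wrong when editing by hand. One way to avoid manual escaping, sketched here in Python, is to build the flow reference as a dictionary and let a JSON serializer produce both the inner string and the surrounding list. The helper name build_stateless_args is hypothetical; the argument values are the ones shown above.

```python
import json


def build_stateless_args(registry_url, bucket_id, flow_id):
    """Return the container args list that points NiFi Stateless at one flow."""
    flow_ref = {
        "registryUrl": registry_url,
        "bucketId": bucket_id,
        "flowId": flow_id,
    }
    # Compact separators keep the embedded JSON free of extra spaces.
    flow_json = json.dumps(flow_ref, separators=(",", ":"))
    return ["RunFromRegistry", "Continuous", "--json", flow_json]


def to_yaml_args_line(args):
    """Render the list as a JSON array, which is also a valid YAML flow sequence."""
    return "args: " + json.dumps(args)


example_args = build_stateless_args(
    "http://nifiregistry-service",
    "71efc3ea-fe1d-4307-97ce-589f78be05fb",
    "c9092508-4deb-45d2-b6a4-b6a4da71db47",
)
print(to_yaml_args_line(example_args))
```

Since a JSON array is also valid YAML, the printed line can be pasted into the container spec as-is, and swapping in a different bucketId or flowId only requires changing the function arguments.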

 

 

To deploy the Pod, run the following:

 

 

kubectl apply -f nifi-stateless-sidecar.yml

 

 

Once the pod is up and running, the NiFi Stateless container immediately begins capturing application log events and shipping them downstream.

Wrapping Up

Fluentd and similar offerings are great for getting started with capturing application log data. However, enterprises require much richer connectivity (universal log forward compatibility) to enable InfoSec to perform its vital role. NiFi Stateless bridges that gap.

Last update: 04-08-2020 11:34 PM