I need to parse Kerberos KDC log files (including the file currently being written) to find which users are connecting and from which hosts.

Using Grok in NiFi, we can parse out many of the fields in these files and use them for filtering and alerting with ease.

This is what many of the lines in the log file look like:

Jan 01 03:31:01 somenewserver-310 krb5kdc[28593](info): AS_REQ (4 etypes {18 17 16 23}) ISSUE: authtime 1546278185, etypes {rep=18 tkt=16 ses=18}, nn/ for nn/
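The authtime value in a line like this appears to be seconds since the Unix epoch. That is an assumption on my part, not something the log line states. A minimal Python sketch for pulling it out and making it readable, outside of NiFi:

```python
import re
from datetime import datetime, timezone

# The sample line from above, including its anonymized "nn/" principals.
line = (
    "Jan 01 03:31:01 somenewserver-310 krb5kdc[28593](info): "
    "AS_REQ (4 etypes {18 17 16 23}) ISSUE: authtime 1546278185, "
    "etypes {rep=18 tkt=16 ses=18}, nn/ for nn/"
)

# Assumption: authtime is a Unix epoch timestamp in seconds.
match = re.search(r"authtime (\d+)", line)
authtime = (
    datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    if match
    else None
)
```

Comparing authtime with the syslog timestamp at the start of the line is one way to spot tickets that were first issued well before the logged request.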

State of the Tail Processor


Tail a File


We also have the option of using the GrokReader, covered in an article listed at the end, to immediately convert matching records to output formats like JSON or Avro and then partition them into groups. We'll do that in a later article.

In this one, we can get lines from the file via Tail, read a list of files and fetch them one at a time, or generate a flow file for testing. Once we have some data, we'll start parsing it into different message types. These messages can then be used for alerting, routing, and permanent storage in Hive/Impala/HBase/Kudu/Druid/S3/Object Storage/etc.
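The split into message types can be sketched outside NiFi with plain substring checks. This is only an illustration: the function name `classify_kdc_line`, the type labels, and the sample lines (including the EXAMPLE.COM realm) are made up, and in a real flow this decision would live in the NiFi processors rather than in a script.

```python
def classify_kdc_line(line: str) -> str:
    """Return a coarse message type for one KDC log line."""
    # The marker strings are the literals used by the Grok expressions
    # in this article; the labels returned here are hypothetical.
    if "NEEDED_PREAUTH:" in line:
        return "preauth"
    if "ISSUE:" in line:
        return "issue"
    if "failure" in line:
        return "failure"
    return "other"


# Made-up sample lines shaped like the anonymized examples in this article.
samples = [
    (
        "Jan 07 02:25:15 KDCHOST1 krb5kdc[21546](info): AS_REQ (2 etypes {23 16}) "
        "APP_HOST1: NEEDED_PREAUTH: user1@EXAMPLE.COM for "
        "krbtgt/EXAMPLE.COM@EXAMPLE.COM, Additional pre-authentication required"
    ),
    (
        "Jan 01 03:20:09 KDCHOST1 krb5kdc[24546](info): AS_REQ (2 etypes {23 18}) "
        "APP_HOST1: ISSUE: authtime 1546330809, etypes {rep=23 tkt=18 ses=23}, "
        "user1@EXAMPLE.COM for krbtgt/EXAMPLE.COM@EXAMPLE.COM"
    ),
    (
        "Jan 01 03:40:00 KDCHOST1 krb5kdc[24546](info): "
        "preauth (encrypted_timestamp) verify failure: Decrypt integrity check failed"
    ),
    "Jan 01 03:41:00 KDCHOST1 krb5kdc[24546](info): closing down fd 12",
]

types = [classify_kdc_line(s) for s in samples]
```

Checking the two specific markers before the generic "failure" substring keeps a PREAUTH or ISSUE line from being swallowed by the broader failure category.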

In the next step we will do some routing and alerting. We'll follow that up with some natural language processing (NLP) and machine learning, and then we'll use various tools to search, aggregate, query, catalog, report on, and build dashboards from this type of log and others.

Example Output JSON Formatted

{
  "date" : "Jan 07 02:25:15",
  "etypes" : "2 etypes {23 16}",
  "MONTH" : "Jan",
  "HOUR" : "02",
  "emailhost" : "",
  "TIME" : "02:25:15",
  "pid" : "21546",
  "loghost" : "KDCHOST1",
  "kuser" : "krbtgt",
  "message" : "Additional pre-authentication required",
  "emailuser" : "user1",
  "MINUTE" : "25",
  "SECOND" : "15",
  "LOGLEVEL" : "info",
  "MONTHDAY" : "01",
  "apphost" : "APP_HOST1",
  "kuserhost" : ""
}

{
  "date" : "Jan 01 03:20:09",
  "etypes" : "2 etypes {23 18}",
  "MONTH" : "Jan",
  "HOUR" : "03",
  "BASE10NUM" : "1546330809",
  "emailhost" : "",
  "TIME" : "03:20:09",
  "pid" : "24546",
  "loghost" : "KDCHOST1",
  "kuser" : "krbtgt",
  "message" : "",
  "emailuser" : "user1",
  "authtime" : "1546330809",
  "MINUTE" : "20",
  "SECOND" : "09",
  "etypes2" : "rep=23 tkt=18 ses=23",
  "LOGLEVEL" : "info",
  "MONTHDAY" : "01",
  "apphost" : "APP_HOST1",
  "kuserhost" : ""
}

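Since the goal is to see which users connect from which hosts, parsed records like the two above can be reduced to user and host pairs. Here is a minimal sketch, assuming each record has already been loaded into a Python dict. `user_host_counts` is a hypothetical helper, and only a few of the fields are reproduced.

```python
from collections import Counter

# A subset of the fields from the two example records in this article.
records = [
    {
        "emailuser": "user1",
        "apphost": "APP_HOST1",
        "message": "Additional pre-authentication required",
    },
    {
        "emailuser": "user1",
        "apphost": "APP_HOST1",
        "message": "",
    },
]


def user_host_counts(parsed):
    """Count how often each (user, host) pair appears in parsed records."""
    return Counter(
        (rec.get("emailuser", ""), rec.get("apphost", "")) for rec in parsed
    )


counts = user_host_counts(records)
```

The resulting counter maps each (user, host) pair to the number of log records seen for it, which is a convenient shape for alerting on unexpected hosts.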
Grok Expressions

For Parsing Failure Records

%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): %{GREEDYDATA:premessage}failure%{GREEDYDATA:postmessage}
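To see how this expression splits a line, here is a rough Python approximation. The regex only loosely mirrors the Grok patterns (GREEDYDATA is `.*`; the timestamp, host, and log level parts are simplified stand-ins, not the real Grok definitions), and the sample line is made up.

```python
import re

# Simplified stand-in for the failure Grok expression above.
FAILURE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<loghost>\S+) krb5kdc\[(?P<pid>\d+)\]\((?P<loglevel>\w+)\): "
    r"(?P<premessage>.*)failure(?P<postmessage>.*)$"
)

# Made-up failure line for illustration.
line = (
    "Jan 01 03:40:00 KDCHOST1 krb5kdc[24546](info): "
    "preauth (encrypted_timestamp) verify failure: Decrypt integrity check failed"
)

match = FAILURE.match(line)
fields = match.groupdict() if match else {}
```

Because the leading `.*` is greedy, premessage runs up to the last occurrence of "failure" on the line, and postmessage holds whatever follows it.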

For Parsing PREAUTH Records

%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): AS_REQ \(%{GREEDYDATA:etypes}\) %{GREEDYDATA:apphost}: NEEDED_PREAUTH: %{USERNAME:emailuser}@%{HOSTNAME:emailhost} for %{GREEDYDATA:kuser}/%{GREEDYDATA:kuserhost}, %{GREEDYDATA:message}

For Parsing ISSUE Records

%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): AS_REQ \(%{GREEDYDATA:etypes}\) %{GREEDYDATA:apphost}: ISSUE: authtime %{NUMBER:authtime}, etypes \{%{GREEDYDATA:etypes2}\}, %{USERNAME:emailuser}@%{HOSTNAME:emailhost} for %{GREEDYDATA:kuser}/%{GREEDYDATA:kuserhost}%{GREEDYDATA:message}
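For checking this expression away from NiFi, it can be approximated as a Python regex with named groups. This is a sketch rather than an exact translation. The character classes are simplified stand-ins for the Grok patterns, `kuser` uses `[^/]+` instead of GREEDYDATA, the trailing `message` capture is dropped, and the sample line with its EXAMPLE.COM realm is made up.

```python
import re

# Simplified stand-in for the ISSUE Grok expression above.
ISSUE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<loghost>\S+) krb5kdc\[(?P<pid>\d+)\]\((?P<loglevel>\w+)\): "
    r"AS_REQ \((?P<etypes>.*)\) (?P<apphost>.*): ISSUE: "
    r"authtime (?P<authtime>\d+), etypes \{(?P<etypes2>.*)\}, "
    r"(?P<emailuser>[A-Za-z0-9._-]+)@(?P<emailhost>[A-Za-z0-9.-]+) "
    r"for (?P<kuser>[^/]+)/(?P<kuserhost>.*)$"
)

# Made-up ISSUE line shaped like the second example record.
line = (
    "Jan 01 03:20:09 KDCHOST1 krb5kdc[24546](info): AS_REQ (2 etypes {23 18}) "
    "APP_HOST1: ISSUE: authtime 1546330809, etypes {rep=23 tkt=18 ses=23}, "
    "user1@EXAMPLE.COM for krbtgt/EXAMPLE.COM@EXAMPLE.COM"
)

match = ISSUE.match(line)
fields = match.groupdict() if match else {}
```

The named groups line up with the keys in the example JSON output, so a quick comparison of `fields` against an expected record is an easy regression check when the expression changes.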


For Testing Grok Against Your Files

A Great Article on Using GrokReader for Record Oriented Processing

More About Grok