How to extract specific lines from log files in NiFi

New Contributor

Hi all,

I am new to NiFi and need guidance on achieving the desired result.

Scenario:

1. Multiple log files

2. Each log file contains many lines

Requirement:

1. Read each log file and extract only the ERROR lines.

Contributor

Hello @Anurag007,

Your description is a little bit 'dry'. Anyway, you can probably do what you want with the following processors:

- GetFile (or better, ListFile + FetchFile) to get the content of your files

- RouteOnContent, which lets you define routing rules based on file content using regular expressions

You can easily find many examples of how to use these processors, for instance by using the search feature of this site.
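
To make the regular-expression idea concrete outside of NiFi, here is a minimal Python sketch of a line-level filter. The pattern and the sample log lines are assumptions made up for illustration; they are not taken from your files. Keep in mind that RouteOnContent routes a whole FlowFile, so a per-line filter like this one assumes the content has already been split into lines (for example with SplitText).

```python
import re

# Hypothetical pattern: a line counts as an error line if it contains
# the whole word ERROR. Adjust it to your actual log layout.
ERROR_LINE = re.compile(r"\bERROR\b")


def extract_error_lines(text):
    """Return only the lines of `text` that match ERROR_LINE."""
    return [line for line in text.splitlines() if ERROR_LINE.search(line)]


# Made-up sample content standing in for one log file.
sample = "\n".join([
    "2020-12-23 16:25:01,120 INFO [main] o.a.nifi.Example Started",
    "2020-12-23 16:25:02,431 ERROR [main] o.a.nifi.Example Failed to connect",
    "2020-12-23 16:25:03,007 WARN [main] o.a.nifi.Example Retrying",
])

error_lines = extract_error_lines(sample)
for line in error_lines:
    print(line)
```

The word-boundary anchors (`\b`) keep the pattern from matching longer tokens that merely contain the letters ERROR.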

Master Mentor

@Anurag007

You did not share how your logs are getting into NiFi.

But once they are ingested, you could use a PartitionRecord processor with one of the following Record Readers to parse your log files:
- GrokReader
- SyslogReader
- Syslog5424Reader

You can then use the Record Writer of your choice to write out the partitioned log entries.
You would then add one custom property that groups like log entries by log level.
This custom property becomes a new FlowFile attribute on the output FlowFiles.

You can then use a RouteOnAttribute processor to keep only the FlowFiles whose log_level is set to ERROR.


Here is a simple flow I created that tails NiFi's app log, partitions the log entries by log_level, and then routes the WARN and ERROR entries.

[Screenshot of the flow: Screen Shot 2020-12-23 at 4.25.26 PM.png]

I use the GrokReader with the following Grok Expression:

%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:thread}\] %{DATA:class} %{GREEDYDATA:message}
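
For readers who want to see what this expression captures, here is a rough Python approximation. It is only a sketch: the regular expression simplifies the real Grok sub-patterns (TIMESTAMP_ISO8601, LOGLEVEL, DATA, GREEDYDATA), and the sample log line is made up for illustration rather than copied from an actual NiFi log.

```python
import re

# Simplified stand-in for the Grok expression above. The real Grok
# sub-patterns accept more variants than these assumed regexes do.
LOG_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?) "
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL) "
    r"\[(?P<thread>.*?)\] "
    r"(?P<class>.*?) "
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical log line in the general shape the expression expects.
sample = (
    "2020-12-23 16:25:26,512 ERROR [Timer-Driven Process Thread-4] "
    "o.a.n.p.standard.Example Something failed"
)

record = parse_line(sample)
print(record)
```

Each named group corresponds to one field of the record, so the `level` field here is what the `/level` record path refers to.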

I then chose to use the JsonRecordSetWriter.
The dynamic property I added in the PartitionRecord processor is:

Property = log_level
Value = /level


In my RouteOnAttribute processor, I can route based on the new "log_level" attribute that will exist on each partitioned FlowFile, using two dynamic properties, each of which becomes a new relationship:

property = ERROR
value = ${log_level:equals('ERROR')}

property = WARN
value = ${log_level:equals('WARN')}
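
The partition-then-route logic can be sketched in plain Python as follows. This is only an illustration of the idea, not how NiFi implements it: the function names, the record dicts, and the sample data are all hypothetical, with each group standing in for one partitioned FlowFile and the "unmatched" key standing in for RouteOnAttribute's unmatched relationship.

```python
from collections import defaultdict


def partition_by_level(records):
    """Group records by their 'level' field, like PartitionRecord on /level."""
    groups = defaultdict(list)
    for record in records:
        groups[record["level"]].append(record)
    return dict(groups)


def route(partitions, wanted=("ERROR", "WARN")):
    """Send wanted levels to their own key; everything else goes to 'unmatched'."""
    routed = {"unmatched": []}
    for level, group in partitions.items():
        if level in wanted:
            routed[level] = group
        else:
            routed["unmatched"].extend(group)
    return routed


# Made-up records, shaped like the output of the Grok parsing step.
records = [
    {"level": "INFO", "message": "Started"},
    {"level": "ERROR", "message": "Failed to connect"},
    {"level": "WARN", "message": "Retrying"},
    {"level": "ERROR", "message": "Giving up"},
]

partitions = partition_by_level(records)
routed = route(partitions)
print(routed["ERROR"])
```

Grouping first and routing second mirrors the flow: the routing decision is made once per group of like entries, not once per log line.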

 

Hope this helps,

Matt