How to ingest and group multiline log files with NiFi?
- Labels: Apache NiFi
Created ‎05-24-2016 02:42 PM
For example, I have these 9 lines as input:
24/05/2016 13:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'
fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement
fr.data.exception.SLDNotFoundException: Mini site inconnu
at fr.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89)
at fr.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60)
at fr.ressources.MailRessource.envoyerMailSignalementContenuInaproprie(MailRessource.java:41)
24/05/2016 15:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'
fr.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement
fr.data.exception.SLDNotFoundException: Mini site inconnu
With an ExtractText processor, I'd like to get one property whose value begins with "24/05/2016 13:40:18,739 ERROR..." and ends just before the next timestamp, "24/05/2016 15:40:18,739...", that is, the first 6 input lines,
and another property that begins at the second timestamp and runs to the end of the input, that is, the last 3 input lines.
Is it possible to do this with NiFi?
Thanks
Created on ‎05-24-2016 03:41 PM - edited ‎08-18-2019 04:21 AM
You can do this by using ReplaceText to replace ^(\d{2}\/\d{2}\/\d{4}) with the match prefixed by some delimiter that never occurs in the data (e.g. ~$1), i.e. prepend a magic character to the beginning of each real line (a line that starts a new log entry).
You can then use SplitContent to split on the byte you chose to prepend. This gives you one flowfile per log entry.
However, this approach can be a little heavy. Make sure you're running the latest version of NiFi, and if you're working with large log files, you may need to consider increasing file handle limits.
The flow (template here: split-multi-line-example.xml) works for prepending and splitting. In the example, 2 flowfiles came out of the 5-line log file sample I put in.
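To make the idea concrete outside NiFi, here is a minimal Python sketch of the same two steps: prepend a marker in front of every line that starts with a date, then split on that marker. It only mirrors the logic of ReplaceText followed by SplitContent and is not how NiFi implements them; the names `MARKER`, `TIMESTAMP` and `split_entries` are made up for this illustration, and the marker is assumed never to occur in the log data.

```python
import re

# Hypothetical marker: assumed to never appear anywhere in the log data.
MARKER = "~"

# Same date pattern as in the answer; MULTILINE makes ^ match at every line start.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{4})", re.MULTILINE)


def split_entries(text: str) -> list:
    """Return one string per log entry, each starting at a timestamp line."""
    # Step 1 (the ReplaceText role): put the marker in front of each timestamp.
    marked = TIMESTAMP.sub(MARKER + r"\1", text)
    # Step 2 (the SplitContent role): split on the marker, dropping empty pieces.
    return [part for part in marked.split(MARKER) if part]
```

Called on the 9-line sample from the question, this should return two strings, one per timestamp, with the stack-trace lines kept together with the entry they belong to.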
Created ‎05-25-2016 06:46 AM
The test failed.
Before "replace text":
25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu at fr.pe.empl.service.da016.recruteur.minisite.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89) at fr.pe.empl.service.da016.recruteur.minisite.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60) at fr.pe.empl.service.da016.recruteur.minisite.ressources.MailRessource$Proxy$_$_WeldSubclass.envoyerMailSignalementContenuInaproprie(MailRessource$Proxy$_$_WeldSubclass.java) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu
After "replace text", using the magic marker "£|£|£|":
£|£|£| 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu at fr.pe.empl.service.da016.recruteur.minisite.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89) at fr.pe.empl.service.da016.recruteur.minisite.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60) at fr.pe.empl.service.da016.recruteur.minisite.ressources.MailRessource$Proxy$_$_WeldSubclass.envoyerMailSignalementContenuInaproprie(MailRessource$Proxy$_$_WeldSubclass.java) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) £|£|£| 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu
But after "split text" the output claim is unchanged. It does not split. Do you have any idea why? My split properties are below:
Byte Sequence Format: Text
Byte Sequence: £|£|£|
Keep Byte Sequence: false
Byte Sequence Location: Trailing
Thanks
Created ‎05-26-2016 10:51 AM
Hi @Thierry Vernhet, I've added a template and a screenshot of a worked example, which should make it clearer. I suspect the problem you're seeing is with the relationship used for the output of the SplitContent processor. If you route the original relationship, or worse, both relationships, to the next processor, you will just get the original content back.
Note also that I've used the "Leading" location in my template, since the marker is inserted at the front of a line, and I've used Line-by-Line evaluation in the ReplaceText that inserts the marker, for better memory usage.
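As a rough analogy for why line-by-line evaluation is lighter on memory, the sketch below groups a log into entries while looking at one line at a time, so only the current entry is held in memory. It is an illustration in plain Python, not NiFi's implementation; `group_entries` and `TIMESTAMP` are hypothetical names, and a line is assumed to start a new entry exactly when it begins with a dd/mm/yyyy date.

```python
import re
from typing import Iterable, Iterator

# Assumption: an entry starts on any line that begins with a dd/mm/yyyy date.
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{4}")


def group_entries(lines: Iterable[str]) -> Iterator[str]:
    """Yield one multiline log entry at a time from an iterable of lines."""
    current = []
    for line in lines:
        # A timestamp line closes the entry collected so far and opens a new one.
        if TIMESTAMP.match(line) and current:
            yield "".join(current)
            current = []
        current.append(line)
    # Emit whatever is left once the input is exhausted.
    if current:
        yield "".join(current)
```

Because the function accepts any iterable of lines, it can be fed an open file object directly, for example `for entry in group_entries(open(path)): ...`, where `path` is whatever log file you are reading.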
Created ‎05-26-2016 02:09 PM
Thanks a lot for your answer. I understand now. But I cannot ignore the "original" relationship, because without it NiFi doesn't validate my processor. How can you use the "splits" relationship without the "original" one?
I hope this is my last question on this.
Created ‎05-26-2016 02:23 PM
The way to deal with this is to mark the original relationship as auto-terminated in the Settings tab of the SplitContent processor.
Created ‎05-26-2016 02:34 PM
Wonderful.
It works now, Simon.
Created ‎05-25-2016 06:07 AM
Thanks, I'm going to test your solution.
