Created 05-24-2016 02:42 PM
For example, I have these 9 lines of input:
24/05/2016 13:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'
fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement
fr.data.exception.SLDNotFoundException: Mini site inconnu
at fr.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89)
at fr.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60)
at fr.ressources.MailRessource.envoyerMailSignalementContenuInaproprie(MailRessource.java:41)
24/05/2016 15:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'
fr.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement
fr.data.exception.SLDNotFoundException: Mini site inconnu
With an "ExtractText" processor, I'd like to get one property whose value begins with "24/05/2016 13:40:18,739 ERROR..." and ends just before the next timestamp "24/05/2016 15:40:18,739...", i.e. the first 6 input lines,
and another property that begins at the second timestamp and runs to the end of the input, i.e. the last 3 input lines.
Is it possible to do this with NiFi?
Thanks
Created on 05-24-2016 03:41 PM - edited 08-18-2019 04:21 AM
You can do this by using ReplaceText to replace ^(\d{2}\/\d{2}\/\d{4}) with the captured date preceded by some delimiter that does not occur in the data (e.g. ~$1), i.e. prepend a magic character to the beginning of each real log line (each line that starts with a timestamp).
You can then use SplitContent to split on the byte you chose to prepend. This gives you one flow file per log entry.
However, this can be a little heavy. Make sure you're running the latest version of NiFi, and if you're working with large log files, you may need to consider increasing file handle limits.
The flow (template here: split-multi-line-example.xml) works for prepending and splitting. You can see here that 2 flowfiles have come out of the 5 line log file sample I put in.
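To make the prepend-and-split idea concrete outside of NiFi, here is a minimal Python sketch of the same two steps. It is only an illustration, not what the processors do internally: the function name `split_log_entries`, the `~` marker, and the shortened sample text are assumptions made for this example.

```python
import re
from typing import List

MARKER = "~"  # hypothetical delimiter; it must not occur anywhere in the log text

# Same idea as the ReplaceText pattern: a dd/mm/yyyy date at the start of a line.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{4})", re.MULTILINE)


def split_log_entries(text: str, marker: str = MARKER) -> List[str]:
    """Prepend `marker` to every line that starts with a date, then split on it."""
    if marker in text:
        raise ValueError("marker already occurs in the input; pick another one")
    # Step 1 (ReplaceText analogue): put the marker in front of each timestamp.
    marked = TIMESTAMP.sub(lambda m: marker + m.group(1), text)
    # Step 2 (SplitContent analogue): split on the marker and drop empty pieces.
    return [chunk for chunk in marked.split(marker) if chunk.strip()]


if __name__ == "__main__":
    # Shortened stand-in for the log in the question (two entries).
    sample = (
        "24/05/2016 13:40:18,739 ERROR [ACTIVE] ExecuteThread: '6'\n"
        "fr.data.exception.SLDNotFoundException: Mini site inconnu\n"
        "at fr.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89)\n"
        "24/05/2016 15:40:18,739 ERROR [ACTIVE] ExecuteThread: '6'\n"
        "fr.data.exception.SLDNotFoundException: Mini site inconnu\n"
    )
    for entry in split_log_entries(sample):
        print(repr(entry))
```

The check for the marker in the input reflects the one real constraint of this approach: the delimiter must be something that can never appear in the log itself.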
Created 05-25-2016 06:46 AM
The test failed.
Before "replace text"
25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu at fr.pe.empl.service.da016.recruteur.minisite.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89) at fr.pe.empl.service.da016.recruteur.minisite.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60) at fr.pe.empl.service.da016.recruteur.minisite.ressources.MailRessource$Proxy$_$_WeldSubclass.envoyerMailSignalementContenuInaproprie(MailRessource$Proxy$_$_WeldSubclass.java) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu
After "replace text" with the magic character sequence "£|£|£|"
£|£|£| 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu at fr.pe.empl.service.da016.recruteur.minisite.services.impl.MiniSiteServiceImpl.lire(MiniSiteServiceImpl.java:89) at fr.pe.empl.service.da016.recruteur.minisite.services.impl.EnvoiMailSignalementDs3ServiceImpl.envoyerUnMail(EnvoiMailSignalementDs3ServiceImpl.java:60) at fr.pe.empl.service.da016.recruteur.minisite.ressources.MailRessource$Proxy$_$_WeldSubclass.envoyerMailSignalementContenuInaproprie(MailRessource$Proxy$_$_WeldSubclass.java) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) £|£|£| 25/05/2016 08:40:18,739 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' fr.pe.sldng.integration.rest.GestionnaireExceptionSollicitationRest Exception lors du traitement fr.pe.empl.service.da016.recruteur.minisite.data.exception.SLDNotFoundException: Mini site inconnu
But after "split text", the output content claim is unchanged. It does not split. Do you have any idea why? Below are the split properties:
Byte Sequence Format: Text
Byte Sequence: £|£|£|
Keep Byte Sequence: false
Byte Sequence Location: Trailing
Thanks
Created 05-26-2016 10:51 AM
Hi @Thierry Vernhet, I've added a template and screenshot of a worked example, which should make it clearer. I suspect the problem you're seeing is with the relationship being used for the output of the SplitContent processor. If you connect the original relationship, or worse, both relationships, you will just get the original content back.
Note also that I've used the "Leading" location in my template, since the marker is inserted at the front of a line, and that I've used Line-by-Line evaluation in the ReplaceText that inserts the marker, for better memory usage.
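As a rough illustration of why line-by-line handling is lighter on memory, the sketch below marks and groups lines as a stream, holding only the current log entry at any time. It is a plain Python analogy under stated assumptions, not NiFi's implementation: `mark_lines`, `split_on_leading_marker`, and the `~` marker are hypothetical names chosen for this example.

```python
import re
from typing import Iterable, Iterator, List

MARKER = "~"  # hypothetical marker; it must not occur in the log text
STARTS_WITH_DATE = re.compile(r"\d{2}/\d{2}/\d{4}")


def mark_lines(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Evaluate one line at a time: prefix lines that begin with a date."""
    for line in lines:
        yield marker + line if STARTS_WITH_DATE.match(line) else line


def split_on_leading_marker(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Group lines into entries; a marker at the front of a line opens a new entry."""
    entry: List[str] = []
    for line in lines:
        if line.startswith(marker):
            if entry:
                yield "".join(entry)       # emit the entry that just ended
            entry = [line[len(marker):]]   # start a new entry without the marker
        else:
            entry.append(line)             # continuation line (e.g. stack trace)
    if entry:
        yield "".join(entry)               # emit the final entry


if __name__ == "__main__":
    lines = [
        "24/05/2016 13:40:18,739 ERROR [ACTIVE] ExecuteThread: '6'\n",
        "fr.data.exception.SLDNotFoundException: Mini site inconnu\n",
        "24/05/2016 15:40:18,739 ERROR [ACTIVE] ExecuteThread: '6'\n",
        "fr.data.exception.SLDNotFoundException: Mini site inconnu\n",
    ]
    for entry in split_on_leading_marker(mark_lines(lines)):
        print(repr(entry))
```

Because both functions are generators, the input can be an open file object, so a large log never has to be loaded in full.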
Created 05-26-2016 02:09 PM
Thanks a lot for your answer. I understand now. But I cannot ignore the "original" relationship, because without it NiFi doesn't validate my processor. How can you use the "splits" relationship without the "original" one?
I hope this is my last question on this.
Created 05-26-2016 02:23 PM
The way to deal with this is to mark the original relation as auto-terminated in the SplitContent settings tab.
Created 05-26-2016 02:34 PM
Wonderful
It's OK now, Simon.
Created 05-25-2016 06:07 AM
Thanks, I'm going to test your solution.