Created 10-05-2017 01:11 PM
I have a text file containing lines like this:
[16 Aug 2017 12:13:50,665] :INFO :UDPListener : UDP Listener ::: Receiver Node [ 0.0.0.0/3333 ] , Sender Node [ 20f:feb:1:0:0:0:0:10e ] , Message [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] .
I want to split this line into Date : [16 Aug 2017 12:13:50,665] , Sender: [ 20f:feb:1:0:0:0:0:10e ] , Receve : [ 0.0.0.0/3333 ], and Message: [<30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] . I also want to split the message part into sub-fields. Is it possible to do this with a regular expression, or do I have to create a custom processor? Please help me with how to do it. I also want to save the different fields to different text files for later data analysis.
Created on 10-05-2017 03:08 PM - edited 08-17-2019 09:32 PM
Hi @Sumit Sharma,
You can use the ReplaceText processor to extract and replace text as per your requirement.
Change the Search Value property to:-
(.+?)\s+:INFO.*Receiver Node\s+(\[.*\])\s+(?=,).*Sender Node\s+(\[.*\])\s+(?=,).*Message\s+(\[.*\])$
Change the Replacement Value property to:-
Date: $1 ,sender: $3,Receve: $2, Message: $4
ReplaceText processor Configs:-
Input :-
[16 Aug 2017 12:13:50,665] :INFO :UDPListener : UDP Listener ::: Receiver Node [ 0.0.0.0/3333 ] , Sender Node [ 20f:feb:1:0:0:0:0:10e ] , Message [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]
Output:-
Date: [16 Aug 2017 12:13:50,665] ,sender: [ 20f:feb:1:0:0:0:0:10e ],Receve: [ 0.0.0.0/3333 ], Message: [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]
So this processor works on the content of each flowfile (ff) as it arrives and replaces that content according to your specification.
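For anyone who wants to check the pattern outside NiFi first, here is a minimal Python sketch. It assumes the pattern behaves the same in Python's `re` module as in NiFi's Java regex engine (it only uses groups, lazy/greedy quantifiers and lookaheads, which both support); the variable names are just for illustration.

```python
import re

# Search Value from the answer above, split over two lines for readability.
SEARCH = (
    r"(.+?)\s+:INFO.*Receiver Node\s+(\[.*\])\s+(?=,)"
    r".*Sender Node\s+(\[.*\])\s+(?=,).*Message\s+(\[.*\])$"
)
# NiFi writes back-references as $1; Python's re.sub writes them as \1.
REPLACEMENT = r"Date: \1 ,sender: \3,Receve: \2, Message: \4"

line = (
    "[16 Aug 2017 12:13:50,665] :INFO :UDPListener : UDP Listener ::: "
    "Receiver Node [ 0.0.0.0/3333 ] , Sender Node [ 20f:feb:1:0:0:0:0:10e ] , "
    "Message [ <30>Aug 16 12:13:50 as-pp-aa[1761]: "
    "%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]"
)

result = re.sub(SEARCH, REPLACEMENT, line)
print(result)

# Same line with the trailing " ." that appears in the question's sample.
unchanged = re.sub(SEARCH, REPLACEMENT, line + " .")
print(unchanged == line + " .")
```

In this sketch the second call leaves the line untouched: the pattern ends in `\]$`, so a line that carries the extra ` .` after the closing bracket (as the sample in the question does) does not match. If your real lines end that way, the end of the pattern would need adjusting.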
Created 10-05-2017 02:27 PM
Hi @Sumit Sharma,
For the given data, the ReplaceText processor will do the job.
By tokenizing the data with the regular expression below, you can replace the text:
(?s)(^\[.*\]) :(.*?):(.*?):(.*?):(.*?):(.*?): Receiver Node(.*?), Sender Node(.*?), Message(.*?)$
The replacement text for it can be (group 7 is the Receiver Node and group 8 is the Sender Node):
Date : $1 , Sender: $8, Receve : $7, Message: $9
Hope this helps !!
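With nine capture groups it is easy to lose track of which number holds which field, so a quick check outside NiFi can help. The sketch below is Python rather than NiFi, and assumes the pattern behaves the same in both regex engines; the `fields` dictionary is only an illustration.

```python
import re

# Regular expression from the answer above ((?s) lets "." match newlines too).
PATTERN = re.compile(
    r"(?s)(^\[.*\]) :(.*?):(.*?):(.*?):(.*?):(.*?): Receiver Node(.*?),"
    r" Sender Node(.*?), Message(.*?)$"
)

line = (
    "[16 Aug 2017 12:13:50,665] :INFO :UDPListener : UDP Listener ::: "
    "Receiver Node [ 0.0.0.0/3333 ] , Sender Node [ 20f:feb:1:0:0:0:0:10e ] , "
    "Message [ <30>Aug 16 12:13:50 as-pp-aa[1761]: "
    "%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]"
)

match = PATTERN.match(line)

# Groups 2-6 are the colon-separated pieces between the date and
# "Receiver Node"; the interesting fields are groups 1, 7, 8 and 9.
# The lazy groups keep their surrounding spaces, hence the strip().
fields = {number: match.group(number).strip() for number in (1, 7, 8, 9)}
for number, value in fields.items():
    print(f"${number} -> {value}")
```

Because groups 7 to 9 capture the spaces around the brackets, the replaced text in NiFi will carry those spaces as well.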
Created 10-05-2017 05:55 PM
Thank you @Shu, it works, but only for the first line; the rest of the lines remain the same. I used the GetFile processor to read the text file located at /home/sumit/myfile/mylog.txt
This time I am looking for output like:
Date : Sender: Receiver Node Message:
[16Aug201712:13:50,665] [20f:feb:1:0:0:0:0:10e] [0.0.0.0/3333] [<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-]
[16Aug201712:13:50,665] [20f:feb:1:0:0:0:0:10e] [0.0.0.0/3333] [<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-]
[16Aug201712:13:50,665] [20f:feb:1:0:0:0:0:10e] [0.0.0.0/3333] [<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-]
[16Aug201712:13:50,665] [20f:feb:1:0:0:0:0:10e] [0.0.0.0/3333] [<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-]
It should read each line and produce this output. Thank you.
Created on 10-05-2017 07:41 PM - edited 08-17-2019 09:31 PM
I think right now your flow looks like:
GetFile-->SplitText(splits each line into a separate flowfile)-->ReplaceText(to prepare your content)
You need the processors below to get your desired result.
Final Flow:-
GetFile-->SplitText(splits each line into a separate flowfile)-->ReplaceText(to prepare your content)-->ExtractText(to get the contents as attributes)-->ReplaceText(to write the attribute values as the content of the flowfile)-->MergeContent(to merge the flowfiles into one with a header).
ExtractText processor:-
Judging by your output, you want each value in the content stored separately. For this we first need to extract the contents of the flowfile as attributes of the flowfile,
by adding new properties to the processor:
date as
Date:\s+(.*)\s+(?=,)
Message as
Message:\s+(.*?)$
Receve as
Receve:\s+(.*?)(,)
sender as
sender:\s+(.*?)(,)
Once the contents of the flowfile are extracted as attributes, we need to use the
ReplaceText processor:-
Change Replacement Value to (the attribute names must match the ExtractText property names above)
${date} ${Receve} ${Message} ${sender}
then change the Replacement Strategy property to
Always Replace
Input:-
Date: [16 Aug 2017 12:13:50,665] ,sender: [ 20f:feb:1:0:0:0:0:10e ],Receve: [ 0.0.0.0/3333 ], Message: [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]
output:-
[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]
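To see what these two steps do to a single record, here is a rough Python sketch. It is not NiFi code: it assumes each ExtractText property stores capture group 1 under the property name, and that Always Replace overwrites the whole content with the evaluated Replacement Value. The helper `extract_attributes` is hypothetical.

```python
import re

# Property name -> regular expression, as listed for ExtractText above.
PROPERTIES = {
    "date": r"Date:\s+(.*)\s+(?=,)",
    "Message": r"Message:\s+(.*?)$",
    "Receve": r"Receve:\s+(.*?)(,)",
    "sender": r"sender:\s+(.*?)(,)",
}


def extract_attributes(content):
    """Hypothetical helper: keep capture group 1 of every matching property."""
    attributes = {}
    for name, pattern in PROPERTIES.items():
        match = re.search(pattern, content)
        if match:
            attributes[name] = match.group(1)
    return attributes


content = (
    "Date: [16 Aug 2017 12:13:50,665] ,sender: [ 20f:feb:1:0:0:0:0:10e ],"
    "Receve: [ 0.0.0.0/3333 ], Message: [ <30>Aug 16 12:13:50 as-pp-aa[1761]: "
    "%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)]"
)

attributes = extract_attributes(content)

# "Always Replace": the old content is discarded and rebuilt from attributes,
# in the order date, receiver, message, sender.
new_content = " ".join(
    attributes[name] for name in ("date", "Receve", "Message", "sender")
)
print(new_content)
```

To get the fields in a different order (for example date, sender, receiver, message), only the order of the names in the last step has to change.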
Once the values are replaced, use the
MergeContent processor:-
to merge the flowfiles into one (depending on your requirement).
Change the properties below:
Delimiter Strategy to
Text
Header to (as per your requirements; press Shift+Enter to insert a new line)
Date : Sender: Receiver Node Message:
In my processor I set Minimum Group Size to 500 B, so the processor waits until the queue in front of MergeContent holds at least 500 B, then merges all the queued flowfiles into one and emits the merged flowfile.
Input:-
In my case every flowfile is 170 B, so the processor waits for 3 flowfiles, at which point the queue size is 520 B:
[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]
Output:-
your desired output 🙂
Date : Sender: Receiver Node Message:
[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]
[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]
[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]
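The effect of the header and the minimum group size can be pictured with a small Python model. This is a deliberately simplified sketch, not how MergeContent is implemented: the real processor has more settings (bins, maximum sizes, entry counts and so on), and the newline demarcator used here is an assumption that the post does not spell out. The function name `merge_with_header` is hypothetical.

```python
def merge_with_header(records, header, min_group_size, demarcator="\n"):
    """Group records until a group reaches min_group_size bytes, then emit
    one merged text per group, each starting with the header."""
    merged = []
    current, current_size = [], 0
    for record in records:
        current.append(record)
        current_size += len(record.encode("utf-8"))
        if current_size >= min_group_size:
            merged.append(header + demarcator + demarcator.join(current))
            current, current_size = [], 0
    # Records that have not reached the minimum yet stay queued.
    return merged, current


record = (
    "[16 Aug 2017 12:13:50,665] [ 0.0.0.0/3333 ] [ <30>Aug 16 12:13:50 "
    "as-pp-aa[1761]: %DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, "
    "ifAdminStatus up(1)] [ 20f:feb:1:0:0:0:0:10e ]"
)

merged, waiting = merge_with_header(
    [record] * 4,
    header="Date : Sender: Receiver Node Message:",
    min_group_size=500,
)
print(merged[0])
print(len(waiting), "record(s) still waiting")
```

With records of about 170 B each, two records stay below the 500 B minimum and the third one pushes the group over it, which is why three records end up under one header.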
Configs:-
You can refer to the links below to configure the MergeContent processor:
https://community.hortonworks.com/questions/64337/apache-nifi-merge-content.html
https://stackoverflow.com/questions/34958347/mergecontent-with-nifi-inconsistent-length
Created 10-05-2017 06:06 PM
And if I use the FetchFile processor, how do I configure it? I received the error "Upstream Connections is invalid because Processor requires an upstream connection but currently has none".
Created 10-05-2017 06:29 PM
The flow should be:-
ListFile(success)---> FetchFile--->SplitText--->ReplaceText
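The reason FetchFile complains is that it only fetches files it is told about by an incoming flowfile, whereas ListFile is the step that produces those. As a loose analogy (plain Python, not NiFi), the chain can be sketched like this; all function names are hypothetical, and `keep_timestamp` is just a stand-in for whatever ReplaceText is configured to do.

```python
import re
import tempfile
from pathlib import Path


def list_files(directory):
    """ListFile step: report which files exist (no content is read yet)."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file())


def fetch_file(path):
    """FetchFile step: read the content of one listed file."""
    return path.read_text(encoding="utf-8")


def split_text(content):
    """SplitText step: one record per non-empty line."""
    return [line for line in content.splitlines() if line.strip()]


def run_flow(directory, replace_text):
    """Chain the steps so every line of every file goes through replace_text."""
    results = []
    for path in list_files(directory):
        for line in split_text(fetch_file(path)):
            results.append(replace_text(line))
    return results


def keep_timestamp(line):
    """Hypothetical stand-in for the ReplaceText step: keep the timestamp."""
    return re.sub(r"^(\[.*?\]).*$", r"Date: \1", line)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "mylog.txt").write_text(
        "[16 Aug 2017 12:13:50,665] :INFO :UDPListener : first\n"
        "[16 Aug 2017 12:13:51,001] :INFO :UDPListener : second\n",
        encoding="utf-8",
    )
    results = run_flow(tmp, keep_timestamp)

print(results)
```

Because the replacement runs once per split line, every line of the file is rewritten, not only the first one.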
Created 10-05-2017 07:47 PM
Could you please send me a link on how to configure these processors? I am new to NiFi.
Created 10-05-2017 11:28 PM
I don't think there are any links to share, but I have attached my .xml file; you can download it, upload it, and change it to your requirements.
You can refer to the link below for how to import an XML file into your NiFi canvas.
Created 10-11-2017 07:07 PM
Thank you for your template, it helped me. But when I use the PutFile processor to save the records, it replaces the previous record every time. I don't want the text to be replaced;
when a line matches the regular expression, it should be appended.
Input is:
Output is:
<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)
<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)
<30>Aug1612:13:50as-pp-aa[1761]:%DAEMON-6-SNMP_TRAP_LINK_UP: ifIndex 669, ifAdminStatus up(1)