Support Questions
Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.

Extract attribute from a textfile and push it to PostgreSQL using NiFI

Explorer

It seems there are may NiFi expert in this community. I am new in NiFi, so it will be highly appreciated if anyone can help me in the following issue, please

I have a text file with millions of line without header in it . The file is look like

2017-04-27 09:07:20.517342 10.69.1.138 -> 10.69.1.100 <4>Apr 27 03:37:13 kernel: [ 78.294310] TS|4294745280|00:14:5a:03:29:33|RSSI_BCN|34
2017-04-27 09:07:20.521313 10.69.1.138 -> 10.69.1.100 <4>Apr 27 03:37:13 kernel: [ 78.296316] TS|4294745284|00:14:5a:03:26:bd|RSSI_BCN|24


So, I need to extract the attributes from these text file and want to store it in PostgreSQL table using NiFi.

the attributes from the text files I am interested in :

  • timestamp: 2017-04-27 09:07:20.517342
  • source_ip:10.69.1.138
  • destination_ip: 10.69.1.100
  • severty_label: 4
  • mac_addresse:00:14:5a:03:29:33
  • obm: RSSI_BCN
  • dBm:34

These are the same field I want in the PostgreSQL table as well. Or it could be ok if anybody just can help how to process the textfile (with my desired extracted attribute ) and save it as a , separated csv file into a folder. Then I can import it to postgreSQL manually. 

1 REPLY 1

@DarkStar   Looks like I already responded to other post with the Regex Mappings I created for you:

 

timestamp :  ([0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1]) (2[0-3]|[01][0-9]):([0-5][0-9]):([0-9]{2}).([0-9]{6})) .*

source_ip : .* (.*) -> .*

dest_ip : .* -> (.*) <.*

Mac : ^.*] TS\|[0-9]{10}\|([a-fA-F0-9:]{17}|[a-fA-F0-9]{12})\|.*$

dbm : ^.*\|(.*)$

 

Here is a template I use to extract text and putSql:

 

https://github.com/steven-dfheinz/NiFi-Templates/blob/master/CSV_to_SQL_Demo.xml

 

The demo is a simple 2 column csv but the concept is the same.  Send in the text, split the text, get the values from each line, create an insert query using the values, and execute it.