[NiFi] How do I split a comma-separated text file into several multi-line files by keyword, rather than line by line?

Contributor

Hello! Sorry for my English. I am completely new to NiFi and am learning the SplitText processor.

So here's the case. I have a comma-separated .txt file, something like this:

KeyWord, SomeInformation   <--- the 1st line is the schema
KeyWord1, "information"
KeyWord2, "information"
KeyWord1, "another information"
KeyWord2, "another information"

and so on.

So the question is: how can I split this file into several files based on KeyWord, so that every line with KeyWord1 goes to one file, every line with KeyWord2 goes to another file, and so on?

Thank you in advance!

1 ACCEPTED SOLUTION

Super Guru

If this is a CSV file where the first line is the header, you can easily split the source into two flowfiles, one containing all KeyWord1 rows and another containing all KeyWord2 rows, using the QueryRecord processor. After you set the record reader and record writer to CSV, create two dynamic properties, one per keyword. Each dynamic property becomes an outgoing relationship with the same name. Set the queries as follows:

Keyword1: SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord1'

Keyword2: SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord2'
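
If it helps to see what the two queries above do to the data, here is a minimal standalone Python sketch of the same split. It is not NiFi code and not how QueryRecord is implemented; the function name `split_by_keyword` and the `SAMPLE` text are just assumptions made for illustration, based on the sample in the question. Unlike the QueryRecord setup, which needs one dynamic property per keyword, this sketch discovers the keywords from the data.

```python
import csv
import io
from collections import defaultdict

# Sample input modelled on the question (hypothetical data).
SAMPLE = '''KeyWord, SomeInformation
KeyWord1, "information"
KeyWord2, "information"
KeyWord1, "another information"
KeyWord2, "another information"
'''


def split_by_keyword(text, key_column="KeyWord"):
    """Group CSV rows by the value in key_column.

    Returns a dict mapping each keyword to CSV text that contains
    the header line plus only the rows for that keyword.
    """
    # skipinitialspace=True drops the blank after each comma, so the
    # header becomes "SomeInformation" and the quoted values parse cleanly.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)

    groups = defaultdict(list)
    for row in reader:
        groups[row[key_column]].append(row)

    outputs = {}
    for keyword, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=reader.fieldnames, lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        outputs[keyword] = buffer.getvalue()
    return outputs


if __name__ == "__main__":
    for keyword, content in split_by_keyword(SAMPLE).items():
        print(f"--- {keyword} ---")
        print(content)
```

Each entry in the returned dict corresponds to what one QueryRecord relationship would receive: the header plus the matching rows.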

3 REPLIES

Contributor

Hello! Thank you for such a detailed answer. Will it work if I have a .txt file as the source? Or should I convert it to CSV somehow first?

Super Guru

It should not matter, as long as the text inside the file is formatted like CSV.