[NiFi] How do I split a comma-separated text file into several multi-line files instead of one file per line?
- Labels: Apache NiFi
Created on ‎05-05-2022 08:20 AM - edited ‎05-05-2022 09:09 AM
Hello! Sorry for my English. I am completely new to NiFi and I am learning the SplitText processor.
Here is the case. I have a comma-separated txt file, something like this:
KeyWord, SomeInformation <---1st line is schema.
KeyWord1, "information"
KeyWord2, "information"
KeyWord1, "another information"
KeyWord2, "another information"
and so on.
So the question is: how can I split this file into several files based on KeyWord, so that every line with KeyWord1 goes to one file, every line with KeyWord2 goes to another file, and so on?
Thank you in advance!
Created ‎05-05-2022 12:48 PM
If this is a CSV file where the first line is the header, you can easily split the source into two flowfiles, one containing all KeyWord1 rows and another containing all KeyWord2 rows, using the QueryRecord processor. After you set your record reader and record writer to CSV, you can create two dynamic properties, one for each keyword, and set the queries as follows:
Keyword1: SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord1'
Keyword2: SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord2'
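To check what each query selects before wiring it into a flow, here is a minimal Python sketch that runs the same two queries against the sample rows from the question. It is only an illustration under assumptions: it uses SQLite from the Python standard library rather than NiFi's own SQL engine, the table is named FLOWFILE just to mirror the queries, and the function name `split_by_queries` is made up for this example.

```python
import csv
import io
import sqlite3

# Sample rows copied from the question; the first line is the header.
SAMPLE = '''KeyWord, SomeInformation
KeyWord1, "information"
KeyWord2, "information"
KeyWord1, "another information"
KeyWord2, "another information"
'''

# Same queries as the two dynamic properties above.
QUERIES = {
    "Keyword1": "SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord1'",
    "Keyword2": "SELECT * FROM FLOWFILE WHERE KeyWord like 'KeyWord2'",
}


def split_by_queries(text, queries):
    """Hypothetical helper: return {query name: matching rows}."""
    # skipinitialspace drops the blank after each comma so quotes parse cleanly.
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    data = rows[1:]  # skip the header line

    conn = sqlite3.connect(":memory:")
    # Stand-in table named FLOWFILE (assumption, to match the queries).
    conn.execute("CREATE TABLE FLOWFILE (KeyWord TEXT, SomeInformation TEXT)")
    conn.executemany("INSERT INTO FLOWFILE VALUES (?, ?)", data)

    result = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
    conn.close()
    return result


outputs = split_by_queries(SAMPLE, QUERIES)
for name, matched in outputs.items():
    print(name, matched)
```

Each key in `outputs` plays the role of one outgoing flowfile: all KeyWord1 rows end up under `Keyword1` and all KeyWord2 rows under `Keyword2`.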
Created ‎05-05-2022 11:04 PM
Hello! Thank you for such a detailed answer. Will it work if I have a .txt file as the source? Or should I convert it to CSV somehow first?
Created ‎05-06-2022 06:11 AM
The file extension should not matter, as long as the text inside is formatted like CSV.
