Deleting specific lines of a CSV file using Apache NiFi
- Labels: Apache NiFi
Created on ‎11-14-2017 01:35 PM - edited ‎08-17-2019 11:23 PM
Hi,
I ingested CSV files using the ListFile and FetchFile processors.
The files contain rows dated 2016, all of which should be deleted before the data is saved to Cassandra.
The table looks like this:
I have a problem extracting/deleting these lines:
with the following properties in RouteText (I tested it with different options):
It still sends all lines to the next processor (or none).
Do you have an idea for a solution?
Thanks in advance!
Best regards
Created on ‎11-14-2017 01:58 PM - edited ‎08-17-2019 11:23 PM
If you want to filter out the 2016 and 2017 records, change the following properties in the RouteOnContent processor.
RouteOnContent with Contains as Matching Strategy:
- Set Matching Strategy to Contains.
- Set Route Strategy to Route to each matching Property Name.
- Add new properties:
1. 2016 with value 2016 // if the content contains 2016, it is routed to this relationship
2. 2017 with value 2017 // if the content contains 2017, it is routed to this relationship
RouteOnContent configs:
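The Contains strategy above amounts to a plain substring check per line. The following is a minimal Python sketch of that logic, not NiFi code; the function name `route_contains` and the semicolon-delimited sample lines are hypothetical, since the original table is not shown in the thread.

```python
def route_contains(line, needles=("2016", "2017")):
    """Return the names of all properties whose value occurs anywhere in the line."""
    return [needle for needle in needles if needle in line]


# Hypothetical sample lines: id;date;value
sample = [
    "1;2016-03-01;4.2",
    "2;2017-05-09;3.1",
    "3;2018-01-15;5.0",
    "2016;2018-01-15;5.0",  # the id happens to be 2016
]

for line in sample:
    print(line, "->", route_contains(line) or "unmatched")
```

Note that a substring check also fires when 2016 appears outside the date column, as in the last sample line. If that can happen in your data, the regular expression variant is the safer choice.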
(or)
RouteOnContent with Matches Regular Expression as Matching Strategy:
If you want to check the contents of the flow file with regular expressions instead:
- Set Matching Strategy to Matches Regular Expression.
- Set Route Strategy to Route to each matching Property Name.
- Add new properties:
1. 2016 with value ^.*;.*2016.*;.*$ // if the content contains 2016 between two semicolons, it is routed to this relationship
2. 2017 with value ^.*;.*2017.*;.*$ // if the content contains 2017 between two semicolons, it is routed to this relationship
RouteOnContent with regular expression config:
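To see what these two expressions accept, here is a small Python sketch that applies them to a few lines. NiFi evaluates Java regular expressions, but patterns this simple behave the same way in Python. The name `route_regex` and the sample lines are assumptions made for illustration only.

```python
import re

# The two patterns from the answer, keyed by property name.
RULES = {
    "2016": re.compile(r"^.*;.*2016.*;.*$"),
    "2017": re.compile(r"^.*;.*2017.*;.*$"),
}


def route_regex(line):
    """Return the property names whose pattern matches the whole line."""
    return [name for name, pattern in RULES.items() if pattern.match(line)]


# Hypothetical sample lines: id;date;value
for line in ["1;2016-03-01;4.2", "2016;2018-01-15;5.0", "3;2018-01-15;2016"]:
    print(line, "->", route_regex(line) or "unmatched")
```

The pattern requires at least one semicolon before and one after the year, so a 2016 in the first or last column does not match. It still matches a 2016 in any middle column, not only in the date column.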
Flow:
ListFile --> FetchFile --> SplitText (split into single lines) --> RouteOnContent (with either Contains or Matches Regular Expression as Matching Strategy) --> ... --> PutCassandraQL
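The split-then-route part of this flow can be sketched outside NiFi as follows. This is only an illustration of the logic under assumed data: `drop_years` is a hypothetical helper, the input is an invented semicolon-delimited sample, and the lines it keeps correspond to whatever is left unmatched and passed on toward Cassandra.

```python
import re


def drop_years(text, years=("2016", "2017")):
    """Split text into lines and keep only those that match none of the year patterns."""
    patterns = [re.compile(rf"^.*;.*{re.escape(year)}.*;.*$") for year in years]
    kept = []
    for line in text.splitlines():  # one line at a time, like a 1-line split
        if any(pattern.match(line) for pattern in patterns):
            continue  # matched a year rule, so it is filtered out
        kept.append(line)  # unmatched lines continue downstream
    return kept


# Hypothetical input file content
csv_text = (
    "id;date;value\n"
    "1;2016-03-01;4.2\n"
    "2;2017-05-09;3.1\n"
    "3;2018-01-15;5.0\n"
)
remaining = drop_years(csv_text)
print(remaining)
```

The header line contains neither year, so it stays with the unmatched lines and needs separate handling if it should not reach the database.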
Created ‎11-15-2017 03:22 PM
Thank you very much! It worked very well for me!
