Deleting specific lines of a CSV file using Apache NiFi

Contributor

Hi,

I ingested CSV files using the ListFile and FetchFile processors.

The files contain rows dated 2016, all of which should be deleted before the data is saved to Cassandra.

The table looks like this:

42590-bildschirmfoto-vom-2017-11-14-142103.png

42588-bildschirmfoto-vom-2017-11-14-141416.png

42584-bildschirmfoto-vom-2017-11-14-141416.png

I have a problem extracting/deleting these lines.

It looks like this:

42586-bildschirmfoto-vom-2017-11-14-141356.png

42589-bildschirmfoto-vom-2017-11-14-141356.png

with the following properties in RouteText (I tried several different options):

42587-bildschirmfoto-vom-2017-11-14-142103.png

It still sends either all lines to the next processor or none.

Do you have an idea for a solution?

Thanks in advance!

Best regards

1 ACCEPTED SOLUTION

Master Guru

@Salda Murrah

If you want to filter out the 2016 and 2017 records, change the following properties in the RouteOnContent processor.

RouteOnContent with Contains as Matching Strategy:-

Set Matching Strategy to Contains
Set Route Strategy to Route to each matching Property Name

Add new properties:

1. 2016 with value 2016 // if the content contains 2016, it is routed to this relationship
2. 2017 with value 2017 // if the content contains 2017, it is routed to this relationship

RouteOnContent configs:-

42591-routeoncontent.png
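To make the Contains behaviour concrete, here is a minimal Python sketch of what the settings above amount to. It is not NiFi code: `route_contains` is a hypothetical helper, the sample rows are made up (the real column layout is only visible in the screenshots), and the `unmatched` bucket is assumed to correspond to the processor's relationship for lines that match no property.

```python
from collections import defaultdict


def route_contains(lines, properties):
    """Route each line to every property whose value occurs in it as a substring.

    Lines that match no property are collected under "unmatched".
    """
    routes = defaultdict(list)
    for line in lines:
        matched = [name for name, needle in properties.items() if needle in line]
        for name in matched:
            routes[name].append(line)
        if not matched:
            routes["unmatched"].append(line)
    return dict(routes)


# Made-up, semicolon-separated sample rows (assumed layout: id;date;value).
lines = [
    "1;14.11.2016;3.5",
    "2;14.11.2017;4.1",
    "3;14.11.2018;2.0",
    "2016;01.01.2018;9.9",  # an id of 2016 also counts as "contains 2016"
]
routes = route_contains(lines, {"2016": "2016", "2017": "2017"})
```

Note the last sample row: a plain substring check cannot tell a 2016 in the date column from a 2016 anywhere else in the line, which is one reason to consider the regular expression variant.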

(or)

RouteOnContent with Matches Regular Expression as Matching Strategy:-

If you want to check the contents of the flow file with regular expressions instead:

Set Matching Strategy to Matches Regular Expression
Set Route Strategy to Route to each matching Property Name

Add new properties:

1. 2016 with value ^.*;.*2016.*;.*$ // if the content matches, i.e. 2016 appears between two semicolons, it is routed to this relationship
2. 2017 with value ^.*;.*2017.*;.*$ // if the content matches, i.e. 2017 appears between two semicolons, it is routed to this relationship

RouteOnContent with Matches Regular Expression config:-

42592-roc-regex.png
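The regular expression can be checked outside NiFi before it goes into the processor. The sketch below uses Python's `re.fullmatch` as a stand-in for a whole-line match; NiFi itself uses Java regular expressions, which behave the same for a simple pattern like this one. The helper name `matches_2016` and the sample rows are assumptions for illustration only.

```python
import re

# Same pattern as the 2016 property: some text, a semicolon, text containing
# 2016, another semicolon, then the rest of the line.
PATTERN_2016 = re.compile(r"^.*;.*2016.*;.*$")


def matches_2016(line):
    """Return True if the whole line matches the 2016 pattern."""
    return PATTERN_2016.fullmatch(line) is not None


# Made-up sample rows (assumed layout: id;date;value).
in_middle_field = matches_2016("1;14.11.2016;3.5")
in_first_field = matches_2016("2016;14.11.2018;3.5")
in_last_field = matches_2016("1;2.0;14.11.2016")
other_year = matches_2016("1;14.11.2018;3.5")
```

With this pattern a line only matches when 2016 has a semicolon somewhere before it and somewhere after it, so a 2016 in the first or last field is ignored. That is stricter than Contains, but it still does not pin the match to one specific column.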

Flow:-

ListFile --> FetchFile --> SplitText // split into single lines --> RouteOnContent // use either Contains or Matches Regular Expression as the Matching Strategy --> .... --> PutCassandraQL
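As a rough end-to-end picture of the flow, the following Python sketch splits a file into lines, drops every line that mentions one of the unwanted years, and keeps the rest. It only mirrors the logic; `drop_lines_for_years` and the sample file are hypothetical, and in NiFi the equivalent would presumably be to auto-terminate the 2016 and 2017 relationships and connect the relationship for non-matching lines to the next processor.

```python
def drop_lines_for_years(csv_text, years):
    """Split the text into lines and keep only lines that mention none of the years."""
    kept = [
        line
        for line in csv_text.splitlines()
        if not any(year in line for year in years)
    ]
    return "\n".join(kept)


# Made-up sample file (assumed layout: id;date;value with a header row).
sample = "id;date;value\n1;14.11.2016;3.5\n2;14.11.2018;4.1\n3;01.02.2017;7.0"
cleaned = drop_lines_for_years(sample, ["2016", "2017"])
```

Here `cleaned` holds only the header and the 2018 row, which is the data that would continue on towards PutCassandraQL.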

2 REPLIES

Contributor

Thank you very much! It worked very well for me!