Support Questions

saminathancs17 · 11-10-2016

Hi Team,

I have a requirement where I have to extract the first 5 records from a file (Sample.CSV; the file contains 100 rows with 5 columns per row).

If the 2nd column of each of those 5 records contains the value "Yes", I want to add an attribute "Is_valid=Y" to that file; otherwise "Is_valid=N".

Ex:

India,YES,Asia

USA,YES,USA

UK,YES,UK

India1,YES,Asia

USA1,YES,USA

I built the following flow, and it works at the record level.

GetFile -> Split Line -> Extract Text -> RouteOnAttribute -> UpdateAttribute

But I don't want to run this check on every record. I need to check only the first 5 records and assign the valid flag to the file as a whole.

Please help me on this.

bwilson · 11-11-2016

Hi @Saminathan A

One thing you can do is drop the SplitLine processor and go straight to the ExtractText processor, where a regex can pull out the first 5 lines. You can then work on the capture groups from that regex (one group per line, for the first 5 lines) in the UpdateAttribute processor. This regex should work for you: ^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*
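To see what the capture groups in that regex hold, here is a minimal sketch that runs the same pattern outside NiFi. It uses Python's re module as a stand-in for the regex engine behind ExtractText, which is an assumption; the sample content and the helper name first_five_lines are hypothetical and modeled on the rows in the question.

```python
import re

# Regex from the answer above: five captured lines, then the start of a sixth.
FIRST_FIVE = re.compile(r"^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*")

# Hypothetical sample content, modeled on the example rows in the question.
sample = (
    "India,YES,Asia\n"
    "USA,YES,USA\n"
    "UK,YES,UK\n"
    "India1,YES,Asia\n"
    "USA1,YES,USA\n"
    "France,NO,Europe\n"
)


def first_five_lines(text):
    """Return the first five lines as a list, or None if the pattern does not match."""
    match = FIRST_FIVE.match(text)
    return list(match.groups()) if match else None


lines = first_five_lines(sample)
for number, line in enumerate(lines, start=1):
    print(number, line)
```

Each group holds one whole line, so the second column still has to be picked out of each group afterwards. Content with fewer than five newline characters does not match at all, and the helper returns None in that case.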

saminathancs17 · 11-11-2016

Thanks Brandon Wilson

I tried your suggestion and it is working for me, with a small correction to the regex.

The one below works for me (please enable the multi-line option in the ExtractText configuration):

regex: (.*)\n(.*)\n(.*)\n(.*)\n(.*)
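For readers who want to check the file-level logic before wiring it into UpdateAttribute, the following is a rough sketch of the decision the flow has to make: take the five captured lines, look at the second column of each, and produce a single Y or N for the whole file. It is written in Python rather than NiFi configuration. The function name is_valid_flag, the case-insensitive comparison, and treating a file with fewer than five lines as N are all assumptions, not something stated in the thread.

```python
import re

# Corrected regex from the reply above (no leading ^, no trailing \n.*).
# re.MULTILINE mirrors the multi-line option; in Python it only changes how
# ^ and $ behave, so it does not alter what this particular pattern matches.
FIRST_FIVE = re.compile(r"(.*)\n(.*)\n(.*)\n(.*)\n(.*)", re.MULTILINE)


def is_valid_flag(text):
    """Return "Y" if column 2 of each of the first five lines is "yes", else "N"."""
    match = FIRST_FIVE.search(text)
    if match is None:
        return "N"  # assumption: fewer than five lines counts as not valid
    for line in match.groups():
        columns = line.split(",")
        # Assumption: the comparison ignores case and surrounding whitespace,
        # since the question says "Yes" while the sample rows say "YES".
        if len(columns) < 2 or columns[1].strip().lower() != "yes":
            return "N"
    return "Y"


all_yes = "India,YES,Asia\nUSA,YES,USA\nUK,YES,UK\nIndia1,YES,Asia\nUSA1,YES,USA\nFrance,NO,Europe\n"
one_no = "India,YES,Asia\nUSA,YES,USA\nUK,NO,UK\nIndia1,YES,Asia\nUSA1,YES,USA\n"

print(is_valid_flag(all_yes))
print(is_valid_flag(one_no))
```

Only the first five lines influence the result, so the sixth row in all_yes is never inspected.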

Cloudera Community

How to extract the first 5 records from a flow file using a NiFi processor?