How to extract the first 5 records from a flow file using a NiFi processor?

Hi Team,

I have a requirement where I have to extract the first 5 records from a file (Sample.CSV; the file contains 100 rows with 5 columns per row).

If the 2nd column of each of those 5 records contains the value "Yes", I want to add an attribute to that file, "Is_valid=Y"; otherwise "Is_valid=N".

Ex:

India,YES,Asia

USA,YES,USA

UK,YES,UK

India1,YES,Asia

USA1,YES,USA

I built the following flow, and it works at the record level:

GetFile -> Split Line -> Extract Text -> RouteOnAttribute -> UpdateAttribute

But I don't want to run this check on every record. I only need to check the first 5 records and assign the valid flag to the file.

Please help me with this.

Accepted Solutions

Re: How to extract the first 5 records from a flow file using a NiFi processor?

Hi @Saminathan A

One thing you can do is drop the SplitLine processor and go straight to the ExtractText processor, where a regex can pull out the first 5 lines. You can then work on the individual capture groups (i.e., the first 5 lines) in the UpdateAttribute processor. This regex should work for you: ^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*
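
To see what the capture groups in that regex hold, here is a minimal sketch in Python. It is not NiFi configuration: it uses Python's `re` module rather than the regex engine ExtractText runs on, and the helper name `first_five_lines` and the sixth sample row are made up for illustration.

```python
import re
from typing import List

# Pattern quoted in the reply: five captured lines, then the start of a sixth.
FIRST_FIVE = re.compile(r"^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*")


def first_five_lines(content: str) -> List[str]:
    """Return the five captured lines, or [] if the content has fewer than five newlines."""
    match = FIRST_FIVE.match(content)
    return list(match.groups()) if match else []


# The five rows come from the question; the sixth row is invented sample data.
sample = (
    "India,YES,Asia\n"
    "USA,YES,USA\n"
    "UK,YES,UK\n"
    "India1,YES,Asia\n"
    "USA1,YES,USA\n"
    "France,NO,Europe\n"
)

for number, line in enumerate(first_five_lines(sample), start=1):
    print(number, line)
```

Each group corresponds to one line, so group 1 is the first record, group 2 the second, and so on. Note that this pattern needs a newline after the fifth line, so a file with exactly five lines and no trailing newline would not match.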

Re: How to extract the first 5 records from a flow file using a NiFi processor?

Thanks, Brandon Wilson.

I tried your suggestion and it works for me, with a small correction to the regex.

The one below works for me (please enable the multi-line option in the ExtractText configuration):

regex: (.*)\n(.*)\n(.*)\n(.*)\n(.*)
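
For anyone who wants to check the flag logic outside NiFi, here is a rough Python sketch of the whole check using that regex. It only illustrates the logic and is not how the flow itself is configured. The function name `is_valid_flag`, the case-insensitive comparison, and treating a file with fewer than five records as "N" are all assumptions on my part.

```python
import re

# Pattern from this reply: five consecutive lines, no sixth line required.
FIVE_LINES = re.compile(r"(.*)\n(.*)\n(.*)\n(.*)\n(.*)")


def is_valid_flag(content: str) -> str:
    """Return "Y" if the 2nd column of each of the first five records is YES, else "N"."""
    match = FIVE_LINES.search(content)
    if match is None:
        # Assumption: a file with fewer than five records is treated as invalid.
        return "N"
    for record in match.groups():
        columns = record.split(",")
        # Assumption: compare case-insensitively, since the question
        # writes both "Yes" and "YES".
        if len(columns) < 2 or columns[1].strip().upper() != "YES":
            return "N"
    return "Y"


# Rows taken from the example in the question.
records = "India,YES,Asia\nUSA,YES,USA\nUK,YES,UK\nIndia1,YES,Asia\nUSA1,YES,USA\n"
print("Is_valid=" + is_valid_flag(records))
```

Only the first five records are inspected, so a "NO" further down the file does not change the result.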
