We are using Apache NiFi for basic work extracting data from, and dropping data into, different systems. We are now trying to use NiFi's data manipulation capabilities and have run into an issue. Our use case is described below.
I have a CSV dataset with 10 columns. I want to filter its rows by applying a condition to some of the columns, say 6 of them, while the other 4 columns should not take part in the condition.
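To make the goal concrete, here is a minimal Python sketch of the result we are after. It is not NiFi configuration, only an illustration of the intended behaviour. The column names `c1` to `c10` and the "must be non-empty" condition are placeholders assumed for the example, not our real schema or rule.

```python
import csv
import io

# Hypothetical names for the six columns that take part in the condition.
FILTER_COLUMNS = ["c1", "c2", "c3", "c4", "c5", "c6"]


def keep_row(row):
    # Placeholder condition (an assumption): all six filter columns are non-empty.
    # The other four columns are never inspected here.
    return all(row[col].strip() for col in FILTER_COLUMNS)


def filter_csv(text):
    # Read the CSV with its header so columns can be addressed by name.
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    # Write back every original column, not only the ones used for filtering.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if keep_row(row):
            writer.writerow(row)
    return out.getvalue()
```

The point is that the condition reads only the six named columns, while each surviving row is written out with all ten.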
Approach tried:
We are using ExtractText to apply the condition and filter the data.
For column selection we are using ReplaceText to keep only the 6 required columns, but this deletes the other 4 columns.
We can remove rows based on the given condition with ExtractText, but we also lose the data in the other 4 columns because of the ReplaceText processor.
We cannot afford to lose those 4 columns: we need all 10 columns in the final output, but right now only 6 columns are returned.
Can we pass the dataset's header values so that the ExtractText processor works only on the specified columns?
Is there any other processor that can be used here?
Please let me know if more information is required.
Please refer to the attached flow for details.
If you could provide the configurations of your SplitText, ReplaceText, and ExtractText processors, along with a before-and-after CSV example, it would be easier to determine what is going on and where changes could be suggested.