Created on 11-24-2016 03:37 PM - edited 09-16-2022 03:49 AM
The situation I have is the need to keep a NoSQL data store in sync with a legacy database with minimal intrusiveness on the operational legacy environment. Only selected entries in the legacy database need to be propagated to the NoSQL database with some data transformation/enrichment.
I'm considering using NiFi for this scenario. My thought is to use the TailFile processor to 'tail' the legacy database log file and have it act only on certain file entries. Conceptually, the filtering could be done via the Expression Language. Does the NiFi TailFile processor support Expression Language to filter the file entries of interest? If not, how else can this be done?
Thanks for your help in advance!
Created 11-26-2016 02:50 PM
Big picture
You can use regular expressions (regex) to do this, not the Expression Language. One of NiFi's core use cases is in fact to filter or route files on content, or to make decisions based on content. TailFile only generates content; the next downstream processor filters, routes, or decides on that content. Commonly used processors here are ExtractText, RouteText, and ReplaceText, all of which use regular expressions to match the contents of the file. Typically you want to work on a line-by-line basis, so you put a SplitText processor before them.
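To make the idea concrete, here is a minimal sketch in plain Python (not NiFi itself) of what the split-then-route step does conceptually: break the tailed text into lines, then send each line to a "matched" or "unmatched" bucket based on a regex. The pattern, the table names `customers` and `orders`, and the function name `route_lines` are all hypothetical and only for illustration; your legacy log format will need its own expression.

```python
import re

# Hypothetical pattern: keep only write operations on the tables we care about.
# Adjust this to whatever your legacy database log entries actually look like.
INTERESTING = re.compile(
    r"\b(INSERT|UPDATE|DELETE)\b.*\b(customers|orders)\b",
    re.IGNORECASE,
)


def route_lines(log_text):
    """Split text into lines and route each one by regex match.

    Mirrors the concept of splitting into lines and then routing on content;
    it is an illustration, not how NiFi is implemented.
    """
    routed = {"matched": [], "unmatched": []}
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key = "matched" if INTERESTING.search(line) else "unmatched"
        routed[key].append(line)
    return routed


if __name__ == "__main__":
    sample = (
        "INSERT INTO customers VALUES (1)\n"
        "SELECT * FROM orders\n"
        "update orders set x=1\n"
    )
    for relationship, lines in route_lines(sample).items():
        print(relationship, lines)
```

In a real flow, only the "matched" lines would continue on to the transformation/enrichment step and then to the NoSQL store; the rest would simply be dropped.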
Solution to meet your needs
This article shows how to route log data based on file entries (using regular expressions). It should be very close or identical to what you want to do: https://community.hortonworks.com/articles/65027/nifi-easy-custom-logging-of-diverse-sources-in-mer....
Regular Expressions vs Expression Language
Please note that regular expressions are not to be confused with the NiFi Expression Language, which works with FlowFile attributes rather than matching file content (it is very powerful in NiFi flows and worth learning).
-
If this is what you were looking for, let me know by accepting the answer; else, let me know of any gaps or remaining questions.
Created 11-27-2016 01:16 AM
Thank you @Greg Keys - that helps a lot!