I have a scenario where I am looking to ingest the same configuration file every time a flow runs.
Here is a simple example:
I have flow #1 watching for a data file to be dropped on S3. When a file is found, it moves the file around and ultimately drops a message on a Kafka topic specific to that data source. Flow #2 consumes from that Kafka topic and runs a series of Hive SQL commands to load the data into the necessary tables. I am looking to have the Hive SQL itself sourced from files on the filesystem/git so that the SQL is version-controlled in our configurations.

Now, I know I can store the SQL statement(s) in custom properties files that NiFi picks up via the nifi.variable.registry.properties property (I have that working), but I don't really want to store large SQL commands in Java properties files unless I have to. Also, those properties files appear to only be re-read on a NiFi service restart. The goal, therefore, is to ingest these SQL configuration files every time the flow runs. This is what I have looked into...
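For reference, here is a minimal sketch of the variable-registry setup I have working today (the file name, key, and SQL are just made-up examples):

```properties
# nifi.properties — point the variable registry at a custom properties file
nifi.variable.registry.properties=./conf/custom-sql.properties

# conf/custom-sql.properties — the whole statement crammed onto one line,
# which is exactly what I'd like to avoid for large SQL
load.orders.sql=LOAD DATA INPATH '/staging/orders' INTO TABLE orders
```

The flow can then reference `${load.orders.sql}` in a processor property, but only after a restart picks up changes to the file.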
GetFile - My first thought was to use GetFile right after the ConsumeKafka processor receives a new message, BUT GetFile appears to not allow incoming connections.
ListFile/FetchFile - My second thought was to use ListFile/FetchFile, but it appears that ListFile only picks up newly modified files. The flow could therefore pull in the configuration files on the first run, but all subsequent runs would fail because the file would not have been modified. Also, similarly to GetFile, it appears you cannot have an incoming connection to the ListFile processor.
ReplaceTextWithMapping - My last thought was to replace everything in the flowfile with the contents of the "mapping" file, but this processor appears to expect newline-delimited records with tab-separated columns. That doesn't sound like it will work nicely with multi-line SQL.
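To illustrate the constraint: as I understand it, the mapping file is one key/value pair per line, tab-separated, so each replacement value has to fit on a single line (keys and SQL below are hypothetical):

```
source_a	INSERT INTO orders SELECT * FROM staging_orders
source_b	INSERT INTO customers SELECT * FROM staging_customers
```

A realistic multi-line Hive script with formatting, comments, and multiple statements has no obvious way to fit into one of these single-line values.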
So....any thoughts on how I could accomplish this task?