As part of a current ingestion project, I'm looking for suggestions on tool selection for a specific case.
We have an enterprise system that performs analytical runs, triggered by users. For each run, the system can create between 100 and 3,000 individual tables in MS SQL Server. Each table is small (50-500 records), but the tables are generated fairly rapidly. They do not necessarily have a common structure, though they do share many fields.
Tables for a specific run are named with a common prefix.
A run status table lists these prefixes, with a run completion datetime.
We're looking for a solution that will poll the run status table for completed runs, then ingest from every table whose name matches the run prefix.
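To make the requirement concrete, here is a rough Python sketch of the selection step only, not a working ingestion pipeline. The names `CompletedRun`, `prefix`, `completed_at`, and `plan_ingestion` are placeholders I've made up for illustration, not our actual schema, and the sketch assumes the status rows and the list of table names have already been fetched from SQL Server (for example from `INFORMATION_SCHEMA.TABLES`).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class CompletedRun:
    # Both fields are hypothetical stand-ins for columns in the run status table.
    prefix: str             # common table-name prefix for the run
    completed_at: datetime  # run completion datetime


def new_runs(status_rows: Iterable[CompletedRun], watermark: datetime) -> list[CompletedRun]:
    """Return runs that completed after the last poll (the watermark), oldest first."""
    return sorted(
        (r for r in status_rows if r.completed_at > watermark),
        key=lambda r: r.completed_at,
    )


def tables_for_run(run: CompletedRun, all_table_names: Iterable[str]) -> list[str]:
    """Select the tables whose names start with the run's prefix."""
    return sorted(t for t in all_table_names if t.startswith(run.prefix))


def plan_ingestion(
    status_rows: Iterable[CompletedRun],
    all_table_names: Iterable[str],
    watermark: datetime,
) -> dict[str, list[str]]:
    """Map each newly completed run prefix to the list of tables to ingest."""
    names = list(all_table_names)  # materialise once; reused for every run
    return {r.prefix: tables_for_run(r, names) for r in new_runs(status_rows, watermark)}
```

Each polling cycle would call `plan_ingestion` with the time of the previous poll as the watermark, then hand the resulting table lists to whatever does the actual ingestion. Plain prefix matching like this only works if no run prefix is itself the start of another run's prefix, which is an assumption in this sketch.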
I'm thinking that this is likely to be a Kafka-based solution, but I haven't been able to find examples where data is ingested from a dynamically changing list of input tables.
Any suggestions on where to start investigating would be much appreciated!