I am trying to join multiple tables using NiFi. The data source may be MySQL or Redshift, and possibly something else in the future. Currently I am using the ExecuteSQL processor for this, but it writes the entire result set to a single flowfile, so it may not be suitable for terabytes of data. I have also tried GenerateTableFetch, but it does not have a join option.
Here are my questions:
- Is there any alternative to the ExecuteSQL processor?
- Is there a way to make the ExecuteSQL processor write its output to multiple flowfiles? Currently I can split the output of ExecuteSQL using the SplitAvro processor, but I want ExecuteSQL itself to split the output.
- GenerateTableFetch generates SQL queries based on an offset. Will this slow down the process as the dataset grows larger?
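
To make the last question concrete, here is a minimal Python sketch of the kind of offset-based paging I have in mind, applied to a join. This is not the SQL that GenerateTableFetch actually emits. The tables `orders` and `customers`, their columns, the page size, and the helper `paged_join_queries` are all hypothetical and only illustrate the pattern.

```python
def paged_join_queries(total_rows, page_size):
    """Build one LIMIT/OFFSET query per page over a hypothetical join."""
    # Hypothetical schema: orders(id, amount, customer_id), customers(id, name).
    base = (
        "SELECT o.id, o.amount, c.name "
        "FROM orders o JOIN customers c ON o.customer_id = c.id "
        "ORDER BY o.id"
    )
    # One query per page; the offset grows by page_size each time.
    return [
        f"{base} LIMIT {page_size} OFFSET {offset}"
        for offset in range(0, total_rows, page_size)
    ]


queries = paged_join_queries(total_rows=25, page_size=10)
for q in queries:
    print(q)
```

As far as I understand, with this pattern the database may still have to read and discard all the rows before each offset, which is why I am worried about later pages getting slower on large tables.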
Please share your thoughts. Thanks in advance.