My process flow is like this:
Process A: Read data from System A (ExecuteSQL) and load it into staging table A (PutSQL)
Process B: Read data from System B (ExecuteSQL) and load it into staging table B (PutSQL)
Process C: Join data between tables A and B (ExecuteSQL) and generate a file
I want Process C to run only after Process A and Process B are done.
How can I achieve this in NiFi? I looked into the Wait/Notify and MergeContent processors, but nothing seems to work.
If Process A and Process B each generate exactly one flow file, make sure that your MergeContent processor has "Minimum Number of Entries" set to 2, so that it releases a merged flow file to Process C only after both have arrived.
However, if each process produces multiple flow files for whatever reason, a different merge strategy will be needed. Please let me know whether this works for you, or whether your use case is different and we need another solution.
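To make the reasoning behind that setting concrete, here is a small toy model in Python. It is only a sketch of the idea and not NiFi's actual binning logic: `FlowFile` and `MergeBin` are hypothetical names invented for this illustration, and the bin simply releases whatever it holds once it reaches `min_entries`.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi flow file: only a source label."""
    source: str


@dataclass
class MergeBin:
    """Toy model of a bin that releases once it holds min_entries items."""
    min_entries: int = 2
    pending: list = field(default_factory=list)

    def offer(self, flow_file):
        """Queue a flow file; return the batch once the bin is full, else None."""
        self.pending.append(flow_file)
        if len(self.pending) >= self.min_entries:
            batch, self.pending = self.pending, []
            return batch
        return None


# Case 1: one flow file per process. Nothing is released until both arrive.
bin_ = MergeBin(min_entries=2)
first = bin_.offer(FlowFile("A"))
second = bin_.offer(FlowFile("B"))
print(first)
print([f.source for f in second])

# Case 2: one flow file per record. Two records from Process A alone fill
# the bin, so Process C would start before Process B has delivered anything.
bin_ = MergeBin(min_entries=2)
bin_.offer(FlowFile("A"))
early = bin_.offer(FlowFile("A"))
print([f.source for f in early])
```

The second case is why a plain entry count of 2 is only safe when each upstream process ends in exactly one flow file.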
Thanks for the prompt response. Process A and Process B each run like this:
ExecuteSQL -> SplitAvro -> ConvertRecord -> PutSQL. So I end up with one flow file per record.
Should I use PutDatabaseRecord instead of PutSQL here? Will PutDatabaseRecord give me one flow file as output upon successful completion of all INSERTs?