I have a table with 12M rows coming from Netezza that I need to push into S3. This is how I currently have the pipeline set up:
ExecuteSQL & ConvertRecord have Load Balancing turned on.
MergeContent apparently merges flowfiles only within each node. How do I combine the flowfiles from all nodes into one flowfile before pushing it into S3?
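You can make MergeContent run on only the Primary node, but that setting by itself only controls where the processor is scheduled; flowfiles already queued on the other nodes stay there. To bring everything together, set the Load Balance Strategy on the connection feeding MergeContent to "Single node". All flowfiles are then sent to one node and merged there. Hope it helps.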
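
To illustrate why the merge produces one file per node unless the flowfiles are first routed to a single node (for example with a connection's "Single node" load-balance strategy), here is a toy Python model. It is not NiFi code: the node names, `merge_per_node`, and `route_to_single_node` are hypothetical and only mimic the behavior described above.

```python
from collections import defaultdict

def merge_per_node(flowfiles):
    """Toy model of MergeContent: each node merges only the flowfiles in its own queue."""
    bins = defaultdict(list)
    for node, content in flowfiles:
        bins[node].append(content)
    # One merged flowfile per node that held any data
    return {node: "".join(parts) for node, parts in bins.items()}

def route_to_single_node(flowfiles, target="node-1"):
    """Toy model of routing every flowfile to one node before the merge."""
    return [(target, content) for _, content in flowfiles]

# Flowfiles spread across a hypothetical three-node cluster
flowfiles = [("node-1", "a"), ("node-2", "b"), ("node-3", "c"), ("node-1", "d")]

per_node = merge_per_node(flowfiles)                        # one merged file per node
single = merge_per_node(route_to_single_node(flowfiles))    # one merged file in total

print(len(per_node), "merged flowfiles without routing")
print(len(single), "merged flowfile after routing to one node")
```

In the real flow, MergeContent's own bin settings (minimum entries, max bin age, and so on) still decide when a bin is closed, so those would also need to be sized for the full 12M-row result if a single output file is required.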