The ConsumeKafka processor acts as a Kafka consumer group, so it makes sense to set 30 concurrent tasks (1 per partition), assuming this is a single instance of NiFi. If you had a 3-node NiFi cluster, you would set concurrent tasks to 10 (10 tasks x 3 nodes = 30 consumers). What version of the Kafka server are you consuming from?
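As a quick sanity check, the consumer math above can be sketched like this (the partition and node counts are taken from your description; the function name is just illustrative):

```python
# Rule of thumb: total consumers across the cluster (concurrent tasks x nodes)
# should not exceed the number of topic partitions, or the extra consumers sit idle.
def concurrent_tasks_per_node(partitions: int, nifi_nodes: int) -> int:
    # Every node in a NiFi cluster runs the same processor configuration,
    # so divide the partitions evenly across the nodes.
    return partitions // nifi_nodes

print(concurrent_tasks_per_node(30, 1))  # single NiFi instance -> 30 tasks
print(concurrent_tasks_per_node(30, 3))  # 3-node cluster       -> 10 tasks
```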
You mention MergeContent merging a minimum of 100,000 FlowFiles, but you did not share any component configurations here. Is MergeContent configured correctly so that each merged FlowFile it generates actually contains 100,000 FlowFiles?
Generally speaking, I would recommend using two MergeContent processors in series to reduce NiFi heap memory usage. The more FlowFiles allocated to MergeContent bins, the more NiFi heap is used. So a first MergeContent merging, say, a minimum of 20,000 FlowFiles, followed by a second MergeContent merging a minimum of 5, would achieve the same 100,000-FlowFile merge with lower heap usage.
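A minimal sketch of the two-stage setup (these are standard MergeContent properties; the exact entry counts are illustrative and should be tuned to your flow):

```
# First MergeContent
Merge Strategy            : Bin-Packing Algorithm
Minimum Number of Entries : 20000
Maximum Number of Entries : 20000

# Second MergeContent (merges the already-merged FlowFiles)
Merge Strategy            : Bin-Packing Algorithm
Minimum Number of Entries : 5
Maximum Number of Entries : 5

# Net result: 20000 x 5 = 100000 source FlowFiles per final merged file,
# with far fewer FlowFile references held in a bin (and in heap) at any one time.
```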
UpdateAttribute does not touch the content of a FlowFile. It simply updates the metadata/attributes of a FlowFile.
PutSFTP throughput is for the most part dependent on the target SFTP server and the network between NiFi and that SFTP server. Most SFTP servers only allow a maximum of 10 concurrent connections from the same client. Did you configure this processor with 10 concurrent tasks? Having a NiFi cluster would allow multiple NiFi nodes to send data concurrently to the SFTP server (10 concurrent tasks x 3 nodes).
Are you saying the FlowFiles start queuing up at the PutSFTP processor, eventually leading to backpressure being applied all the way back through your dataflow until it reaches the ConsumeKafka processor?
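For reference, backpressure is controlled per connection; when either threshold below is reached, the upstream processor stops being scheduled, and that cascades upstream connection by connection until ConsumeKafka stops consuming. These are the NiFi defaults (configurable in each connection's Settings tab):

```
Back Pressure Object Threshold    : 10000    # FlowFile count per connection
Back Pressure Data Size Threshold : 1 GB     # total queued content size
```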
Have you looked at CPU, disk I/O, network bandwidth and speed, NiFi heap usage?
If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.