Created 01-05-2021 12:35 AM
Hi,
I am using the ExecuteSQL processor in NiFi to write records to PostgreSQL.
The problem I have is when running on multi-thread (for example 10), the processor gets stuck and runs too slow. This doesn't happen in a single thread. However, single thread is not efficient so doesn't work for me.
After getting slow, I stop the processor, terminate it, and start it again. For almost 5-10 seconds everything is ok and then the same problem.
I also have a similar problem with the InvokeHTTP processor.
Thank you in advance.
Created 01-05-2021 11:19 AM
@garoosy
You should look in to using the "ExecuteSQLRecord" instead of "ExecuteSQL" for large volume data. To be efficient here you would have many records in a single FlowFile.
Right now you have a single record per each FlowFile which is not going to be very efficient. The only way for "ExecuteSQL" to handle multiple FlowFile executions in a single connection is if the SQL statement used in every FlowFile is identical. In order to do that the unique values would need to come from FlowFile attributes.
You may find these post helpful:
https://community.cloudera.com/t5/Support-Questions/Nifi-ExectueSQL-how-to-force-a-parameter-to-be-a...
If you have threads that never seem to complete (will see small number in upper right corner of processor (2)), it is best to get a series of thread dumps (4 - 6) to verify thread is not progressing. Then you have to determine if what the thread is waiting on. Did you try setting a "Max Wait Time" on the processor? It defaults to 0 which means it would wait forever.
Hope this helps,
Matt
Created 01-05-2021 11:19 AM
@garoosy
You should look in to using the "ExecuteSQLRecord" instead of "ExecuteSQL" for large volume data. To be efficient here you would have many records in a single FlowFile.
Right now you have a single record per each FlowFile which is not going to be very efficient. The only way for "ExecuteSQL" to handle multiple FlowFile executions in a single connection is if the SQL statement used in every FlowFile is identical. In order to do that the unique values would need to come from FlowFile attributes.
You may find these post helpful:
https://community.cloudera.com/t5/Support-Questions/Nifi-ExectueSQL-how-to-force-a-parameter-to-be-a...
If you have threads that never seem to complete (will see small number in upper right corner of processor (2)), it is best to get a series of thread dumps (4 - 6) to verify thread is not progressing. Then you have to determine if what the thread is waiting on. Did you try setting a "Max Wait Time" on the processor? It defaults to 0 which means it would wait forever.
Hope this helps,
Matt
Created 01-05-2021 12:52 PM
You generally don't need to run multiple threads for that.
Also make sure you size your nodes and cluster appropriately.
https://www.datainmotion.dev/2020/07/sizing-your-apache-nifi-cluster-for.html
Created 01-05-2021 09:43 PM
Thanks for your reply.
I will try it.
Created 01-05-2021 09:47 PM
Thank you Matt!
Altering the "Max Wait Time" value was a game-changing move. I still need to improve it. But the thread problem is fixed now.