Support Questions
Find answers, ask questions, and share your expertise

Serialize QueryDatabaseTable processors

New Contributor

HI, I'm using Nifi 1.7.1 and I have created eighteen QueryDatabaseTables to ingest data from and oracle 12C databases. If I run processors one by one all go well, however, If I start the process group, all the processors start and some work fine, but other generate the follow error:

2018-10-16 18:57:16,247 ERROR [Timer-Driven Process Thread-10] o.a.n.p.standard.QueryDatabaseTable QueryDatabaseTable[id=b55522c9-508c-3675-4476-d68bf07cb955] Unable to execute SQL select query SELECT * FROM MYTABLE due to java.sql.SQLRecoverableException: IO Error: Connection reset: java.sql.SQLRecoverableException: IO Error: Connection reset
java.sql.SQLRecoverableException: IO Error: Connection reset

I'm not be able to change the databases properties to permit more connections. So I would like to start the QueryDatabaseTables one behind other in sequence. is there any way to do this?

I tryed to use a GenerateTableFetch followed by a ExecuteSQL. nothertheless, ExecuteSQL is quite slow. I will apreciate any ideas to resolve this use case.
4 REPLIES 4

Re: Serialize QueryDatabaseTable processors

Super Guru

Can you share your GenerateTableFetch configuration, and your DBCPConnectionPool config (with any sensitive properties redacted of course)? If you are limited in your number of DB connections, and/or if you have a standalone NiFi instance, then the difference between QueryDatabaseTable and GenerateTableFetch -> ExecuteSQL should be negligible... unless they are very large tables and GTF's "count(*)" query is causing you problems.

Re: Serialize QueryDatabaseTable processors

New Contributor

Hi Matt,

Thanks for reply. I rec a video with a test to show you what I see you can see in this link: https://youtu.be/uS4Gay0U-e0

In addition next are the settings of the processors.

Thanks in advance for you help.

nifi-settings.pdf

Re: Serialize QueryDatabaseTable processors

Super Guru

In this case, QueryDatabaseTable is fetching all the rows (5000 at a time from the DB, but only one query), and sending them out one at a time. To avoid the Connection reset error, try setting a Validation Query on your DBCPConnectionPool, perhaps "SELECT 1 FROM DUAL".

The reason GenerateTableFetch is slower is because it will do a separate query (based on OFFSET/LIMIT) to grab 5000 rows each time. This is much slower as the DB will have to handle many more queries (using offsets & limits), instead of a single one. Also it appears you only have a single ExecuteSQL instance, you can make this more concurrent by setting Max Concurrent Tasks on the ExecuteSQL processor to something higher. Having said that, GenerateTableFetch was designed such that the queries can run in parallel, meaning you can distribute the SQL flow files among a NiFi cluster, and each node would have an ExecuteSQL that could perform the queries. On a single NiFi instance, GenerateTableFetch doesn't get you any improvements in performance; in those cases it's usually only necessary if you need to pass incoming flow files to GTF, since QueryDatabaseTable doesn't accept inputs.

You also don't need the ExtractText to get the SQL content into an attribute: if you leave the "SQL select query" property empty, it will assume the SQL is in the content of the flow file.

I guess the bottom line is that for your use case, I agree QueryDatabaseTable is the way to go, hopefully the Validation Query will get the issue cleared up so you can continue.

Re: Serialize QueryDatabaseTable processors

New Contributor

Hi Matt,

Thanks for reply and your advice. I updated the jdbc driver from ojdbc6 to ojdbc8 and now the Connection doesn't reset and work ok. I am going to continue checking the generateTableFetch maybe I can improve the query. thanks a lot for your help.


Regards,

Carlos