Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

when QueryDatabaseTable query the hug table , cause the nifi/processor break down easily

avatar
Rising Star

I ask a question in another page

https://community.hortonworks.com/questions/53941/nifi-querydatabasetables-properties-fetch-size-is....

But i am still puzzled.

In nifi-1.x, 'Max Rows Per Flow File' can control the max number in a flowfile.

But It seems that QueryDatabaseTable still query all the rows into the memory, the split the records into flowfiles. If the table is very hug. The nifi is easy to break down when start QueryDatabaseTable.(database is mysql)

There is no rows limit in select query sql, like select * from table limit 0,100; -- mysql or select * from table where rownum<100;--oracle

The Bulletin advice me to modify 'max_allowed_packet' in my.cnf. But it is useless.

some of the errors:

Unable to execute SQL select query SELECT * FROM employee due to java.sql.SQLException: Unknown character set index for field '25700' received from server.: java.sql.SQLException: Unknown character set index for field '25700' received from server.

The QueryDatabaseTable unable to use in database with hug table.

Thanks for you reply

David.

@Bryan Bende @Matt Burgess

1 ACCEPTED SOLUTION

avatar

Hi,

The recommended way is to use GenerateTableFetch as input processor of QueryDatabaseTable processor. It generates flow files with SQL query to execute. This way, it also allows you to balance the load if you are in a NiFi cluster. In this processor, you can set the partition size to limit the number of rows of each request.

The error you have suggests something different. Maybe the JDBC driver is not fully implemented and is not supporting the properties of this processor. But it does sound to me like a memory issue. Have a try with GenerateTableFetch.

View solution in original post

2 REPLIES 2

avatar

Hi,

The recommended way is to use GenerateTableFetch as input processor of QueryDatabaseTable processor. It generates flow files with SQL query to execute. This way, it also allows you to balance the load if you are in a NiFi cluster. In this processor, you can set the partition size to limit the number of rows of each request.

The error you have suggests something different. Maybe the JDBC driver is not fully implemented and is not supporting the properties of this processor. But it does sound to me like a memory issue. Have a try with GenerateTableFetch.

avatar
Rising Star

Thanks for your advice. I use GenerateTableFetch+ExecuteSQL to achieve my target.