Created on 01-11-2019 06:53 PM - edited 09-16-2022 07:03 AM
Hi,
I'm trying to connect to different databases that all have the same structure and execute the same query on them. However, i can't figure out how to give this to nifi execute SQL processor and i'm having to create a processor for every connection to the database. IS there a way for me to create a list of connection strings and then simply loop through them?
i.e. i need to dynamically assign the connection pooling service to the processor. How?
Created 01-12-2019 02:58 AM
Starting from NiFi-1.7 version we can dynamically assign connection pooling service by using
DBCPConnectionPoolLookup controller service.
From documentation:
1.Provides a DBCPService that can be used to dynamically select another DBCPService. This service requires an attribute named 'database.name' to be passed in when asking for a connection, and will throw an exception if the attribute is missing. 2.The value of 'database.name' will be used to select the DBCPService that has been registered with that name. 3.This will allow multiple DBCPServices to be defined and registered, and then selected dynamically at runtime by tagging flow files with the appropriate 'database.name' attribute
If you are using NiFi version <1.7.0 then we need to have to define each DBCPConnection controller service for each connection of ExecuteSql processor.
Created 01-12-2019 02:58 AM
Starting from NiFi-1.7 version we can dynamically assign connection pooling service by using
DBCPConnectionPoolLookup controller service.
From documentation:
1.Provides a DBCPService that can be used to dynamically select another DBCPService. This service requires an attribute named 'database.name' to be passed in when asking for a connection, and will throw an exception if the attribute is missing. 2.The value of 'database.name' will be used to select the DBCPService that has been registered with that name. 3.This will allow multiple DBCPServices to be defined and registered, and then selected dynamically at runtime by tagging flow files with the appropriate 'database.name' attribute
If you are using NiFi version <1.7.0 then we need to have to define each DBCPConnection controller service for each connection of ExecuteSql processor.
Created 01-12-2019 11:18 AM
@shu
Many thanks for responding--however, with this solution I will need multiple ExecuteSQL processors each specifying the database.name attribute value, correct? I'll end up with dozens of ExecuteSQL processors that have the same SQL but different Database.name.
Is my understanding correct or am i missing a trick?
I guess, the question is, how/where do i set the database.name attribute? I've configured the service, but how is the execute SQL processor configured?
Created 01-12-2019 06:53 PM
If you are using DBCPConnectionPoolLookup controller service for ExecuteSQL processor then each flowfile feeding to ExecuteSQL processor needs to have database.name attribute with the value of DBCPconnectionpool name that is given.
-
Now based on database.name attribute value ExecuteSQL processor is going to select connectionpool dynamically, so we are going to use one ExecuteSQL processor using multiple DBCPconnectionPool dynamically.
-
Use UpdateAttribute Processor to set database.name attribute to the flowfile and configure/enable DBCPConnectionPoolLookup service select this service in ExecuteSQL processor.
Created 01-13-2019 02:11 PM
This really helps--and its the solution that should work. However, every time i give the SQL SELECT statement in the flowfile content (using generateflowfile processor), the ExecuteSQL processor throws an error. The database.name attribute is set and the select statement is a simple
select a from table
the error is below. any idea what i'm missing?
[Timer-Driven Process Thread-5] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=15613b27-66e0-101a-ab78-86906b96182b] ExecuteSQL[id=15613b27-66e0-101a-ab78-86906b96182b] faile d to process session due to org.apache.nifi.processor.exception.FlowFileHandlingException: StandardFlowFileRecord[uuid=ed4af901-f621-4aab-b7db-aad8c72e41ca,claim=StandardContentClaim [resourceClaim=StandardResou rceClaim[id=1547378310570-136, container=default, section=136], offset=890898, length=10357],offset=9267,name=ed4328e2-1284-4ff3-b252-f7de7c92a94e,size=1090] is not known in this session (StandardProcessSession[ id=315712]); Processor Administratively Yielded for 1 sec: org.apache.nifi.processor.exception.FlowFileHandlingException: StandardFlowFileRecord[uuid=ed4af901-f621-4aab-b7db-aad8c72e41ca,claim=StandardContentCla im [resourceClaim=StandardResourceClaim[id=1547378310570-136, container=default, section=136], offset=890898, length=10357],offset=9267,name=ed4328e2-1284-4ff3-b252-f7de7c92a94e,size=1090] is not known in this session (StandardProcessSession[id=315712])
Created 01-21-2019 10:41 AM
The solution worked! I had some strange error my side that disappeared after a clean setup of the flow. Thanks @Shu!
Created on 04-19-2020 10:41 PM - edited 04-19-2020 10:42 PM
I have written a blog on this, Kindly refer to blog to setup dbcp connection pool Lookup controller service and execute same query in multiple databases.
Please follow this link, it is with an example with step by step instructions to setup the same:
https://bigdata-galaxy.blogspot.com/2020/04/nifi-querying-multiple-databases-using.html
Created 08-01-2019 10:33 PM
Can you please upload the template xml. I have similar requirement and it will help me start with.