Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Nifi: Multiple databases , same query, how do pass database name to the connection string?

avatar
Contributor

Hi,

I'm trying to connect to different databases that all have the same structure and execute the same query on them. However, i can't figure out how to give this to nifi execute SQL processor and i'm having to create a processor for every connection to the database. IS there a way for me to create a list of connection strings and then simply loop through them?

i.e. i need to dynamically assign the connection pooling service to the processor. How?

1 ACCEPTED SOLUTION

avatar
Master Guru

@Yasir Khokhar

Starting from NiFi-1.7 version we can dynamically assign connection pooling service by using

DBCPConnectionPoolLookup controller service.

From documentation:

1.Provides a DBCPService that can be used to dynamically select another DBCPService. This service requires an attribute named 'database.name' to be passed in when asking for a connection, and will throw an exception if the attribute is missing.
2.The value of 'database.name' will be used to select the DBCPService that has been registered with that name. 
3.This will allow multiple DBCPServices to be defined and registered, and then selected dynamically at runtime by tagging flow files with the appropriate 'database.name' attribute

If you are using NiFi version <1.7.0 then we need to have to define each DBCPConnection controller service for each connection of ExecuteSql processor.

View solution in original post

7 REPLIES 7

avatar
Master Guru

@Yasir Khokhar

Starting from NiFi-1.7 version we can dynamically assign connection pooling service by using

DBCPConnectionPoolLookup controller service.

From documentation:

1.Provides a DBCPService that can be used to dynamically select another DBCPService. This service requires an attribute named 'database.name' to be passed in when asking for a connection, and will throw an exception if the attribute is missing.
2.The value of 'database.name' will be used to select the DBCPService that has been registered with that name. 
3.This will allow multiple DBCPServices to be defined and registered, and then selected dynamically at runtime by tagging flow files with the appropriate 'database.name' attribute

If you are using NiFi version <1.7.0 then we need to have to define each DBCPConnection controller service for each connection of ExecuteSql processor.

avatar
Contributor

@shu

Many thanks for responding--however, with this solution I will need multiple ExecuteSQL processors each specifying the database.name attribute value, correct? I'll end up with dozens of ExecuteSQL processors that have the same SQL but different Database.name.

Is my understanding correct or am i missing a trick?

I guess, the question is, how/where do i set the database.name attribute? I've configured the service, but how is the execute SQL processor configured?

avatar
Master Guru

@Yasir Khokhar

If you are using DBCPConnectionPoolLookup controller service for ExecuteSQL processor then each flowfile feeding to ExecuteSQL processor needs to have database.name attribute with the value of DBCPconnectionpool name that is given.

-

Now based on database.name attribute value ExecuteSQL processor is going to select connectionpool dynamically, so we are going to use one ExecuteSQL processor using multiple DBCPconnectionPool dynamically.

-

Use UpdateAttribute Processor to set database.name attribute to the flowfile and configure/enable DBCPConnectionPoolLookup service select this service in ExecuteSQL processor.

avatar
Contributor

This really helps--and its the solution that should work. However, every time i give the SQL SELECT statement in the flowfile content (using generateflowfile processor), the ExecuteSQL processor throws an error. The database.name attribute is set and the select statement is a simple

select a from table

the error is below. any idea what i'm missing?

 [Timer-Driven Process Thread-5] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=15613b27-66e0-101a-ab78-86906b96182b] ExecuteSQL[id=15613b27-66e0-101a-ab78-86906b96182b] faile
d to process session due to org.apache.nifi.processor.exception.FlowFileHandlingException: StandardFlowFileRecord[uuid=ed4af901-f621-4aab-b7db-aad8c72e41ca,claim=StandardContentClaim [resourceClaim=StandardResou
rceClaim[id=1547378310570-136, container=default, section=136], offset=890898, length=10357],offset=9267,name=ed4328e2-1284-4ff3-b252-f7de7c92a94e,size=1090] is not known in this session (StandardProcessSession[
id=315712]); Processor Administratively Yielded for 1 sec: org.apache.nifi.processor.exception.FlowFileHandlingException: StandardFlowFileRecord[uuid=ed4af901-f621-4aab-b7db-aad8c72e41ca,claim=StandardContentCla
im [resourceClaim=StandardResourceClaim[id=1547378310570-136, container=default, section=136], offset=890898, length=10357],offset=9267,name=ed4328e2-1284-4ff3-b252-f7de7c92a94e,size=1090] is not known in this session (StandardProcessSession[id=315712])



avatar
Contributor

The solution worked! I had some strange error my side that disappeared after a clean setup of the flow. Thanks @Shu!

avatar
New Contributor

I have written a blog on this,  Kindly refer to blog to setup dbcp connection pool Lookup controller service and execute same query in multiple databases.

 

Please follow this link, it is with an example with step by step instructions to setup the same:

https://bigdata-galaxy.blogspot.com/2020/04/nifi-querying-multiple-databases-using.html 

 

 

avatar
New Contributor

@Yasir Khokhar

Can you please upload the template xml. I have similar requirement and it will help me start with.