
Best Practice for configuring registry flows

Expert Contributor

Hello team & community 🙂

I am looking into deploying many instances of a flow that is versioned in a git registry via the REST API. My current issue is that I would like to configure each instance of this flow dynamically; however, if I version the flow with a Parameter Context attached to it, every instance I deploy ends up attached to that single Parameter Context.

I understand this is the expected behaviour as described in the documentation. Therefore, I'd like to know if there is any recommended practice for situations such as mine.

The best solution I could think of was perhaps configuring the processors in the flow to use parameters but not actually attaching a PC to the process group. This way I could deploy flows in two steps: first importing them, and then creating a new parameter context and configuring/attaching it.
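Roughly, the second step would look something like this minimal sketch, assuming an unsecured NiFi at http://localhost:8080 and Python's requests library (the helper name and the example values are placeholders of mine, not anything official):

```python
import requests

NIFI = "http://localhost:8080/nifi-api"  # assumption: unsecured local instance

def create_parameter_context(name: str, params: dict) -> str:
    """Create a uniquely named Parameter Context and return its id."""
    body = {
        "revision": {"version": 0},
        "component": {
            "name": name,
            "parameters": [
                {"parameter": {"name": k, "value": v, "sensitive": False}}
                for k, v in params.items()
            ],
        },
    }
    resp = requests.post(f"{NIFI}/parameter-contexts", json=body)
    resp.raise_for_status()
    return resp.json()["id"]

# e.g. one context per deployed instance, named after the instance
ctx_id = create_parameter_context("my-flow-instance-0042", {"topic": "orders-42"})
```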


I would greatly appreciate any advice for this use case,

Thank you.


7 REPLIES

Master Mentor

@Green_ 

When you deploy a dataflow (that has a parameter context assigned to it) from NiFi A via NiFi-Registry to another NiFi B, the parameter context will be added to NiFi B if a parameter context with the exact same name does NOT already exist on NiFi B.  If a Parameter Context with the same name already exists, that local parameter context will be used.

Additionally, if the parameter context in the original flow from NiFi A has a parameter name not present in the same-named pre-existing parameter context on NiFi B, that additional name/value pair will be added to the existing parameter context on NiFi B.

So NiFi / NiFi-Registry was designed with the intent to handle different parameter values per NiFi deployment.

Now, the first time you deploy a flow from NiFi A to NiFi B, you end up with the parameter context from NiFi A being added to NiFi B.  You'll need to update values as needed on NiFi B before starting the dataflow(s) in that Process Group.  But new versions after that will not be an issue (unless additional new parameter name/value pairs are added; those would need to be updated, or you could add the new params manually on NiFi B before updating the version).

I think the above solution is better since you'll have all the parameter name/value pairs when you import the new dataflow from NiFi-Registry; you'll just need to update some values before starting the new dataflow.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

Expert Contributor

Hi @MattWho, thank you for the detailed reply 🙂

I might not have been very clear in my original question, so I'll rephrase with more context:

My use case is not deploying a single instance of a flow from NiFi A to NiFi B; rather, I've got just one NiFi instance and I'd like to deploy thousands of instances of the same versioned flow (and perhaps in the future I'll have more clusters in different regions and would like to deploy the flows there too).


In this context, it is problematic for me that Parameter Contexts get created/picked based on the name of the original PC at commit time, since I want each flow instance to have its own unique values for the parameters.

Therefore, my current plan is to commit the flows without a PC attached (though with parameters configured in the processors) and, at creation time (via my codebase), create a new PC (uniquely named per flow) with the specific values for that instance.

My question was whether there is a more preferable way to do this: creating many instances of a parameterized versioned flow in the same NiFi environment and ensuring each instance can get its own unique set of parameters.

Thanks for helping out,

Green

Master Mentor

@Green_ 

The parameter context assigned to a PG does not track as a version-control change, and neither does the Process Group's name.  This is by design, so that you can reuse the same version-controlled Process Group over and over and assign a unique parameter context and a unique name to each.

For example:

  • Create a new process group named "master" and add a new parameter context to it.
  • Build a simple dataflow and convert some properties to parameters.
  • Version control the Process Group.
  • Drag a new Process Group icon onto the canvas and select import from NiFi Registry.
  • Select the previously versioned Process Group.
  • Edit the Process Group name to "Clone-parameter-2" and change the parameter context assigned to it.  Hit Apply.

You will notice the newly imported and modified Process Group shows no local changes.

Now go back to "master" and add a new component inside that Process Group.  You will see this change reported as a local change.  Commit that PG as a new version of "master".  Soon afterwards you will see "Clone-parameter-2" report that a new version is available.  Change the version of "Clone-parameter-2" to the newer version.  You'll notice that the PG name and assigned Parameter Context do not change.

NOTE: If you make a change in any process group tied to this single version-controlled flow, it will report a local change that you can commit to NiFi-Registry, resulting in a new version being available to all the others (the name "master" does not imply any real hierarchy in this example).

NOTE 2: If a change in one process group includes a new parameter being added to that process group's assigned parameter context, then when the other process groups are updated to that version, that new parameter will be added to their parameter contexts automatically, with the value matching what was set in the committed version.  So processors will not become invalid, but they might have a value assigned that you want/need to change.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt




Expert Contributor

@MattWho 

So just to verify I understand everything:

1) I start with a dev process group somewhere in my canvas (let us say it is named my-flow), with a parameter context attached to it, say PC-DEV, and then I commit this PG to the registry so I will be able to deploy thousands of unique production copies of it.

2) When I deploy my-flow from the registry in the same NiFi environment, the newly created process group will automatically have PC-DEV attached to it (<-- this is what I'm worried about).

3) In order to uniquely set up this flow, I now need to create a new PC which mimics PC-DEV so that I can configure a unique set of parameters for this deployed instance.


Currently testing on NiFi 2.7.2, I actually see that there's a toggle when importing from the registry, "Keep Existing Parameter Contexts", which partially answers my concern. If it is toggled ON, the new deployment will automatically use an existing PC. If toggled OFF, it will instead automatically create a new copy of the commit-time PC (without sensitive values), named "<original pc name> (1)" with the number increasing per copy.

The OFF behaviour is more in line with what I wish to do in my environment, though the naming is still problematic, as I'd prefer the PC name to correlate more directly with the deployed instance (when I deploy instances via my code, I'll probably add the instance ID to the PC name).
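For anyone scripting this: the toggle seems to have a REST counterpart. Here is a sketch of the import call, assuming a recent NiFi where POST /process-groups/{id}/process-groups accepts a parameterContextHandlingStrategy query parameter (KEEP_EXISTING / REPLACE); verify against your version's REST API docs, and note the registry/bucket/flow IDs below are placeholders you'd look up via your registry-client endpoints:

```python
import requests

NIFI = "http://localhost:8080/nifi-api"  # assumption: unsecured local instance

# Placeholders; look these up via your registry-client endpoints.
REGISTRY_ID = "<registry-client-id>"
BUCKET_ID = "<bucket-id>"
FLOW_ID = "<flow-id>"

def import_versioned_flow(parent_pg_id: str, version, keep_existing: bool = False) -> str:
    """Import a versioned flow under parent_pg_id and return the new PG's id.
    REPLACE clones the commit-time Parameter Context as "<name> (1)" instead
    of reusing a same-named local one."""
    strategy = "KEEP_EXISTING" if keep_existing else "REPLACE"
    body = {
        "revision": {"version": 0},
        "component": {
            "position": {"x": 0.0, "y": 0.0},
            "versionControlInformation": {
                "registryId": REGISTRY_ID,
                "bucketId": BUCKET_ID,
                "flowId": FLOW_ID,
                "version": version,  # may be a commit id for git-backed registries
            },
        },
    }
    resp = requests.post(
        f"{NIFI}/process-groups/{parent_pg_id}/process-groups",
        params={"parameterContextHandlingStrategy": strategy},
        json=body,
    )
    resp.raise_for_status()
    return resp.json()["id"]
```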


Your 2nd note has raised some concerns for me. Did I understand correctly that if I commit a new version of a flow in which I added a new parameter somewhere in the processors, then when I update the other flows, that new parameter will automatically be added to all of their respective parameter contexts? (E.g. I update my dev flow with a new parameter; all production copies of the dev parameter context will now have a new parameter added, though with the value being the same as the dev value I committed.)


Frankly, I still feel as though my best course of action is to simply commit the flows without any PC attached but with parameter references (#{..}) pre-configured. This way, when I deploy a flow with my code via the REST API, I can more easily tailor a new parameter context to the new flow, and there is also no concern of accidentally attaching default/dev values to production flows. Though if I add new parameters, I will need to manually add them to all the existing PCs.


I'm not sure there's any one clear answer to my question so I'll just mark yours as the solution. Thanks Matt 🙂

Master Mentor

@Green_ 

Considering the number of deployments, it might make the most sense for you to do this using multiple REST API calls:

  1. Import your version-controlled flow (with no parameter context associated with that version-controlled flow).
  2. Create a new parameter context with the parameters required for that new flow.
  3. Update the imported Process Group with a new name and an association with the newly created parameter context.

What you have at that point is a new Process Group with a unique name and its own assigned parameter context, while in NiFi-Registry you still have the dev version-controlled PG with no associated parameter context.
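A rough sketch of those three calls, reusing the hypothetical create_parameter_context and import_versioned_flow helpers sketched earlier in this thread (again assuming an unsecured NiFi at http://localhost:8080; this is illustrative, not a hardened implementation):

```python
import requests

NIFI = "http://localhost:8080/nifi-api"  # assumption: unsecured local instance

def bind_context(pg_id: str, new_name: str, ctx_id: str) -> None:
    """Step 3: rename the imported PG and attach its own Parameter Context."""
    pg = requests.get(f"{NIFI}/process-groups/{pg_id}").json()
    body = {
        "revision": pg["revision"],  # must echo the PG's current revision
        "component": {
            "id": pg_id,
            "name": new_name,
            "parameterContext": {"id": ctx_id},
        },
    }
    requests.put(f"{NIFI}/process-groups/{pg_id}", json=body).raise_for_status()

# Steps 1-3 for one deployment ("root" is NiFi's alias for the root PG):
pg_id = import_versioned_flow("root", version=3)
ctx_id = create_parameter_context("my-flow-instance-0042", {"topic": "orders-42"})
bind_context(pg_id, "my-flow-instance-0042", ctx_id)
```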

This presents some new challenges...

  1. Back on your dev system, your source Process Group was version controlled with no parameter context. Since it is version controlled, if you make a change in DEV (adding new configuration that references a new parameter context key/value), all of your other Process Groups in prod that are version controlled against that same NiFi-Registry flow definition will show a new version as available. If you "Change version", the Process Group will get the change in the flow, but it will also revert to NO assigned parameter context. So you will need to re-assign the appropriate parameter context to that Process Group and update the parameter context with the newly referenced parameter.

  2. Also back on your dev system: even if you make a change that does not involve any newly introduced parameters, you will still have the issue of the parameter context becoming unassociated when you change version. So you will need to re-assign the appropriate parameter context to that Process Group upon any version change.

  3. On the prod system you have one-to-many process groups tied back to this single dev version-controlled flow. If you were to make a change there, it would show as a local change that needs to be committed to version control. Since the version-controlled flow has no parameter context assigned, committing that change from prod would update the version-controlled flow to reference the parameter context assigned in Prod. Back on the dev system a local change would then show, and changing version to that new version would pull in the prod parameter context. The only way to revert this is to change version on Dev back to an older version where no parameter context was associated with the dev process group, and then commit the needed change on DEV instead of Prod.
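If you script your version upgrades, the re-assignment from challenges 1 and 2 can be folded into the same pass. A rough sketch, assuming the asynchronous update-request flow of the versions endpoints and the hypothetical bind_context helper from the sketch above:

```python
import time
import requests

NIFI = "http://localhost:8080/nifi-api"  # assumption: unsecured local instance

def change_version_and_rebind(pg_id: str, new_vci: dict, name: str, ctx_id: str) -> None:
    """Upgrade a PG whose registry version carries no Parameter Context,
    then re-attach this instance's own context (the upgrade drops it).
    new_vci is the PG's versionControlInformation with the target version."""
    revision = requests.get(f"{NIFI}/process-groups/{pg_id}").json()["revision"]
    req = requests.post(
        f"{NIFI}/versions/update-requests/process-groups/{pg_id}",
        json={"processGroupRevision": revision, "versionControlInformation": new_vci},
    ).json()["request"]
    while not req["complete"]:  # the version change runs asynchronously
        time.sleep(0.5)
        req = requests.get(
            f"{NIFI}/versions/update-requests/{req['requestId']}"
        ).json()["request"]
    requests.delete(f"{NIFI}/versions/update-requests/{req['requestId']}")
    bind_context(pg_id, name, ctx_id)  # hypothetical helper from the sketch above
```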

This feels like an area for product improvement. I am thinking along the lines of a checkbox on "start version control" or "commit local changes" that asks whether the parameter context should be included in the change request. (Parameter context changes are already not sent if the version-controlled flow has a parameter context associated with it.) This would allow you to choose not to include a parameter context with a newly version-controlled dataflow (default checked), or not to include a new parameter context when committing local changes (default unchecked).

So you would need to be careful that only dataflow configuration changes are made on dev to this reusable version-controlled flow definition. If you need to make a deployment-specific change on Prod, you would need to stop version control first, make the change, and commit that as a new, unique version-controlled process group.


Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

Master Mentor (Accepted Solution)

@Green_ 

Thinking more about the challenges mentioned in my previous response:

You could avoid them by creating a parameter-context template on Dev.  This would be a parameter context with all the keys but no assigned values.

Then, when you import the flow to prod from dev, you can uncheck the box for "Keep Existing Parameter Contexts" so that a new, uniquely named parameter context is created each time you import the flow.  You can then update that newly generated parameter context with a flow-specific name and flow-specific values assigned to those parameters that currently have no values.

Back on dev, if you make a change involving a newly introduced parameter key, simply update the parameter-context template with the new key and no assigned value.
Now, when you change version in prod, you'll get the new key; you just need to assign a prod-specific flow value to it.
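Via the rest-api, renaming and filling in the cloned context could look roughly like this; note that parameter-context changes go through an asynchronous update request (a sketch under the same unsecured-NiFi assumption as the earlier ones, with placeholder names/values):

```python
import time
import requests

NIFI = "http://localhost:8080/nifi-api"  # assumption: unsecured local instance

def rename_and_fill_context(ctx_id: str, new_name: str, values: dict) -> None:
    """Give the auto-cloned template context a flow-specific name and values."""
    ctx = requests.get(f"{NIFI}/parameter-contexts/{ctx_id}").json()
    body = {
        "revision": ctx["revision"],
        "id": ctx_id,
        "component": {
            "id": ctx_id,
            "name": new_name,
            "parameters": [
                {"parameter": {"name": k, "value": v, "sensitive": False}}
                for k, v in values.items()
            ],
        },
    }
    req = requests.post(
        f"{NIFI}/parameter-contexts/{ctx_id}/update-requests", json=body
    ).json()["request"]
    while not req["complete"]:  # parameter updates run asynchronously
        time.sleep(0.5)
        req = requests.get(
            f"{NIFI}/parameter-contexts/{ctx_id}/update-requests/{req['requestId']}"
        ).json()["request"]
    requests.delete(
        f"{NIFI}/parameter-contexts/{ctx_id}/update-requests/{req['requestId']}"
    )
```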


Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt

Expert Contributor

@MattWho 
Wow! I think this pattern would work best for my use case. I hadn't even considered the first challenge you brought up, of production flows having their Parameter Context unassigned if I were to update their version. That would've been painful to find out after deploying many instances.


Back in NiFi 1, I used to handle situations such as this with variables, since they could be directly attached to Process Groups, and so I never had to worry about creating separate objects (parameters) and ensuring they get attached, or about every new instance of a versioned flow needing its own unique context created. It's been a couple of years, but I believe I even asked Pierre about this at one of his appearances at the Israeli NiFi meet-ups.


Regarding product work, I've run into this case of trying to use NiFi as the underlying tool for different SaaS platforms multiple times already. There could definitely be some QoL changes made to make such a use case easier to implement with NiFi's flow registry; I guess the responsibility lies with people like me opening issues to bring them up, though 🙂


Thank you very much for the suggestions Matt!

Green