
Best practices for Apache NiFi development when multiple teams work on a DataFlow


New Contributor

My team and I have the following questions about Apache NiFi.

What is the best practice for organizing team development of a NiFi workflow, including:

- Flow source control (merge, diff, …): should it be done per template or for the full flow?

- Sharing templates: through source control or through a shared NiFi instance?

- Controller Services are duplicated during template export. Is this a bug, or a feature we don't understand how to use?

How do we deploy a configured flow to a NiFi cluster: through template import, or by replacing the full flow and restarting NiFi?
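For reference, template export and import can be scripted against the NiFi REST API instead of done by hand in the UI, which helps with the source-control questions above. A minimal sketch (host, port, and IDs are placeholder assumptions; the endpoint paths are from the NiFi 1.x REST API):

```python
# Sketch of the NiFi 1.x REST endpoints involved in template-based
# source control. The host and the IDs below are placeholders.
BASE = "http://nifi-host:8080/nifi-api"

def create_template_url(pg_id: str) -> str:
    # POSTing a name/description here snapshots a process group as a template
    return f"{BASE}/process-groups/{pg_id}/templates"

def download_template_url(template_id: str) -> str:
    # GETting this returns the template XML, suitable for committing to git
    return f"{BASE}/templates/{template_id}/download"

print(create_template_url("pg-1-2"))
print(download_template_url("tmpl-42"))
```

A deployment script can then upload and instantiate the committed XML in the target environment, rather than restarting NiFi with a replaced flow.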

5 REPLIES

Re: Best practices for Apache NiFi development when multiple teams work on a DataFlow

Guru

@sachin tiwari The post below does not answer all of your questions, but it answers some and addresses the spirit of the perspective you are approaching this from (team development / SDLC):

https://community.hortonworks.com/articles/60868/enterprise-nifi-implementing-reusable-components-a....


Re: Best practices for Apache NiFi development when multiple teams work on a DataFlow

New Contributor

Based on the details in the link above, I was thinking of the following steps.

1. First create the structure of the NiFi workflow, i.e., organize the workflow based on the client requirements. For example, RootPG contains PG1 and PG2; PG1 contains PG1.1 and PG1.2; PG2 contains PG2.1.
2. Export each process group as a template and commit it to source control (Bitbucket or GitHub).
3. PG1.1, PG1.2, and PG2.1 each contain the actual flows, and we can have one developer working on each flow in its respective process group: PG1.1 -> Dev1, PG1.2 -> Dev2, PG2.1 -> Dev3.
4. Say Dev2 updates the flow under PG1.2. Dev2 should then export each affected process group (PG1.2, PG1, RootPG) as a template and commit it back to source control.
5. Say at the same time Dev1 updates the flow in PG1.1. Dev1 should then export PG1.1, PG1, and RootPG as templates, merge PG1 and RootPG because they contain Dev2's changes, and then commit.
6. Dev3 also changes the flow under PG2.1. Dev3 should then export PG2.1, PG2, and RootPG as templates, merge RootPG because it contains Dev1's and Dev2's changes, and then commit.

The main idea behind creating multiple templates is the ability to deploy a specific template (process group) to any environment, rather than the entire flow. Is this a good approach? I can see that the number of templates tends to grow, and merging will be difficult.
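One practical wrinkle with committing exported templates: the XML contains generated component IDs and layout coordinates, so textual diffs and merges get noisy. A hypothetical normalization pass before committing, here just stripping layout-only <position> elements with the standard library, can make the merges in steps 4–6 more tractable (real templates carry more generated fields you may also want to canonicalize):

```python
import xml.etree.ElementTree as ET

def normalize(template_xml: str) -> str:
    """Strip layout-only <position> elements so diffs show logic changes.

    Illustrative only: a real exported template has more generated
    fields (ids, timestamps) that also churn between exports.
    """
    root = ET.fromstring(template_xml)
    # Materialize the element list first so removals don't disturb iteration
    for parent in list(root.iter()):
        for pos in parent.findall("position"):
            parent.remove(pos)
    return ET.tostring(root, encoding="unicode")

# A toy template fragment (not a real NiFi export) to show the effect:
sample = ("<template><processors><processor><name>P1</name>"
          "<position><x>1.0</x><y>2.0</y></position>"
          "</processor></processors></template>")
print(normalize(sample))
```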


Re: Best practices for Apache NiFi development when multiple teams work on a DataFlow

Guru

@sachin tiwari

I think it all depends on what the unit of deployment is.

If:

  • the process groups at whatever level (PG1 & PG2, or PG1.1 & PG1.2 & PG2.1) are independent flows and can be deployed separately -- then they should be code-managed as separate process groups (no merging as a member of the parent). Process groups should be decoupled in this case.
  • the process groups must be deployed together -- then the parent process group should be change-managed with its children. This is your model.

It is much like a software code base: ask what needs to be deployed together. That unit of deployment is branched by separate teams, changed, and merged across all teams for deployment, using standard change-management models.


Re: Best practices for Apache NiFi development when multiple teams work on a DataFlow

New Contributor

Thanks Greg!!

I am still not clear on how I can get multiple developers working on the same NiFi flow. Should I run a single centrally hosted NiFi instance, have my team work on it so that my flow.xml always has the latest changes, and commit that flow.xml daily? But then I lose the flexibility of deploying individual components. Any thoughts on this?
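If you do go the route of committing flow.xml.gz from a shared instance on a schedule, note that gzip output embeds a modification timestamp, so the archive bytes can differ even when the flow has not changed. A small sketch (file handling simulated in memory; the function name is my own) that fingerprints the decompressed XML, so a daily commit job only commits real changes:

```python
import gzip, hashlib, io

def flow_fingerprint(flow_xml_gz_bytes: bytes) -> str:
    """Hash the decompressed flow.xml so gzip header timestamps
    don't produce spurious 'changes' in a scheduled commit job."""
    xml = gzip.decompress(flow_xml_gz_bytes)
    return hashlib.sha256(xml).hexdigest()

# Two gzip archives of the same flow differ byte-wise (mtime header)
# but fingerprint identically:
flow = b"<flowController><rootGroup/></flowController>"
buf1, buf2 = io.BytesIO(), io.BytesIO()
with gzip.GzipFile(fileobj=buf1, mode="wb", mtime=1) as f:
    f.write(flow)
with gzip.GzipFile(fileobj=buf2, mode="wb", mtime=2) as f:
    f.write(flow)
print(flow_fingerprint(buf1.getvalue()) == flow_fingerprint(buf2.getvalue()))
```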


Re: Best practices for Apache NiFi development when multiple teams work on a DataFlow

Super Guru