
How can we update a NiFi workflow in production?

Expert Contributor

Hi guys,

We are preparing to run NiFi in production.

One question we have is how to make workflow component changes in production, e.g. add one more sink or source, replace one component in the workflow, etc.

After reading some articles, below is what I have in mind:

1. Develop and test new templates

2. Develop a new workflow by combining several templates and test in QA

3. Undeploy the old workflow from production

4. Deploy the new workflow in production.

Is this the best practice?

We are also unclear on what should be done by developers vs. NiFi operators.

For a small change, e.g. changing a config property, should it be done by a developer or a NiFi operator?

Another question is:

Between "Undeploy the old one" and "Deploy the new workflow", there is a small time window.

For real-time data streaming, how do we avoid data loss in this small time window?

Any suggestions and comments are appreciated.

Thanks.

1 ACCEPTED SOLUTION

Super Collaborator

It looks like you have the following three questions:

1. How do I make a change to Flow Management in production?

The steps you've outlined for this question look to be following best practices. Please see the summary of documented best practice workflow below, and reference this Knowledge Base article for full details: https://community.hortonworks.com/articles/60868/enterprise-nifi-implementing-reusable-components-a....

  1. Reusable components are added as templates to a central repository. This should be governed. Reusable components are probably best represented as process groups. This makes building new flows simpler by separating the reusable components (and encapsulation of details) from the new flow components.
  2. Development groups pull reusable components and upload to their NiFi environment to build new flows.

    In flow configurations, sensitive values should be configured as Expression Language references to OS environment variables that you set in each environment, e.g. ${MYSQL_PSSWRD}.

    Other environment-specific config values should similarly use Expression Language references. If these are not sensitive, they should go in a custom properties file.

  3. Developers finalize their flows and submit the template of the flow to version control, e.g. Git (and also submit custom property files).
  4. Template and custom property files are promoted to each environment just as source code typically is.
  5. Automation: deploying templates to environments can be done via the NiFi REST API integrated with other automation tools (see the sketch after this list).
  6. Governance bodies decide which configurations can be changed in real time (e.g. ControlRate properties). These changes do not need to go through version control and can be made by authorized admins on the fly. For authorization policies, see: https://community.hortonworks.com/articles/60842/hdf-20-defining-nifi-policies-in-ranger.html
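To make step 5 concrete, here is a minimal sketch of deploying a template through the REST API with Python and the `requests` library. The host and file name are placeholders, and the use of the `root` process group alias is an assumption for illustration; the endpoint paths follow the NiFi 1.x REST API, so verify them against your version:

```python
import xml.etree.ElementTree as ET

import requests

NIFI = "https://nifi-prod.example.com:8443/nifi-api"  # hypothetical host

# Upload the template XML that was promoted from version control.
with open("my_flow_template.xml", "rb") as f:
    resp = requests.post(
        f"{NIFI}/process-groups/root/templates/upload",  # "root" = top-level group
        files={"template": f},
    )
resp.raise_for_status()

# The upload endpoint answers with XML; pull out the new template's id.
template_id = ET.fromstring(resp.text).find(".//id").text

# Instantiate the template onto the canvas of the root process group.
resp = requests.post(
    f"{NIFI}/process-groups/root/template-instance",
    json={"templateId": template_id, "originX": 0.0, "originY": 0.0},
)
resp.raise_for_status()
```

In a secured cluster you would also pass authentication (e.g. a token or client certificate) with each request; environment-specific values then resolve at runtime through the Expression Language references from step 2, such as ${MYSQL_PSSWRD}.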

There is additionally a NiFi REST API you can use to perform a smaller set of operations, such as starting/stopping processors: https://nifi.apache.org/docs/nifi-docs/rest-api/
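For example, a sketch of stopping (and later restarting) a single processor via that API, again using Python's `requests`; the host and processor id are placeholders, and the `run-status` endpoint is per the NiFi 1.x REST API, so check your version:

```python
import requests

NIFI = "https://nifi-prod.example.com:8443/nifi-api"  # hypothetical host
PROC_ID = "your-processor-uuid"                      # placeholder processor id

# Fetch the processor entity to get its current revision (required for updates).
proc = requests.get(f"{NIFI}/processors/{PROC_ID}").json()

# Stop the processor; upstream FlowFiles accumulate in its input queue.
requests.put(
    f"{NIFI}/processors/{PROC_ID}/run-status",
    json={"revision": proc["revision"], "state": "STOPPED"},
).raise_for_status()

# After making your change, re-fetch the revision and PUT "state": "RUNNING"
# the same way to resume processing.
```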

2. Who makes a config change in production?

This is ultimately up to you and your team to decide, but through Apache Ranger you can get processor-level granularity in the permissions you give to your users. For example, you can assign read+write permissions to a set of users for the most commonly modified processors but give read-only access for those processors you do not expect to change. Please see this NiFi + Ranger security article for more information: https://community.hortonworks.com/articles/60842/hdf-20-defining-nifi-policies-in-ranger.html

3. How do I ensure data is not lost when changing workflows in production?

This is automatically handled by Apache NiFi. There is a queue created before each processor, so every processor's input is buffered. This means the following:

1. When the processor is up and running, the data will pass into the queue. If the processor is finished processing the previous input data, it will take the next piece of data off of the queue and process that. If your data is coming in at a faster rate than your processor is able to process, it will be temporarily stored in the queue.

2. When you press the 'Stop' (square) button while having a processor selected, you will stop the processing of data through that processor and all data that is intended to go to that processor will be queued up in the queue immediately preceding the processor. You can then re-assign the queue's output arrow to point to a different processor in the case of a production workflow change.

3. When a processor is turned back on using the 'Play' (triangle) button, it will process everything that was stored in the queue that precedes the processor. If there were 20,000 objects stored in the queue while you made a change to the processor, it will process all of those objects in turn and send them downstream.
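If you drive such changes through automation rather than the UI, you can confirm how much data is waiting in a given queue by polling the connection's status over the REST API. A minimal sketch, assuming the NiFi 1.x `GET /flow/connections/{id}/status` endpoint and the `requests` library; the host and connection id are placeholders:

```python
import requests

NIFI = "https://nifi-prod.example.com:8443/nifi-api"  # hypothetical host
CONN_ID = "your-connection-uuid"                     # placeholder connection id

# The aggregate snapshot reports queued FlowFile count and size across the cluster.
status = requests.get(f"{NIFI}/flow/connections/{CONN_ID}/status").json()
snapshot = status["connectionStatus"]["aggregateSnapshot"]
print("queued flowfiles:", snapshot["queuedCount"])
print("queued data:     ", snapshot["queuedSize"])
```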

For a complete guide, please see the Apache NiFi User Guide.

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#command-and-control-of-the-dataflow


4 REPLIES


Expert Contributor

Hi @anarasimham

Thank you very much for your answer. It answers most of my questions!

A couple of questions are still not clear to me.

1. Data loss during flow redeployment.

You mentioned how data is buffered in the queue between two processors while a processor is being updated or stopped.

How does redeploying the workflow as a whole (stopping and updating it) affect data loss?

For example, I am running a flow f1 in NiFi.

I am going to deploy f2 to replace f1 for the same input source and business logic. f2 has a few more processors than f1. So I undeploy (stop and delete) f1, then deploy f2.

There is a time window between when f1 is stopped and when f2 is started.

Since the data is streaming in and there is no queue as a buffer (the entire flow is stopped), how do I avoid data loss?

Or maybe this is not the right way to update a workflow?

BTW, we are not supposed to click buttons on the Web UI in production. I understand it is much easier to update flows via the Web UI, but we can't directly change flows in production. We have to make flow changes in a dev environment and test them, then deploy the new flow via CI/CD in an automated way using the NiFi REST API or tools.

2. Authorization and Access Policy

You mentioned Apache Ranger to work with NiFi on authorization.

I noticed there are some conf files: authorizers.xml, authorizations.xml, users.xml for authorization setup.

Are these enough for production?

Do we have to use Apache Ranger, since this introduces one more service in our platform?

Thanks.

Super Collaborator

1. Two points to make regarding this question:

- If the machines that stream data to your cluster's NiFi instance are using NiFi/MiNiFi, this will be no problem, as those agent services have built-in buffers as well to hold data while you upgrade and restart your cluster's NiFi dataflow.

- If those machines are using a non-buffered flow management service, you will need to modify your data architecture to include a buffer before the data reaches NiFi. One way to do this is to insert a Kafka messaging queue before your cluster's NiFi consumes the data. That way, even if NiFi goes down, you have a buffer to hold the data. You may explore alternative architectures for this as well.
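As one illustration of that pattern, the upstream machines could publish into a Kafka topic that NiFi's ConsumeKafka processor reads from; Kafka retains the messages, so events published while the flow is being redeployed are consumed once the new flow starts. A minimal producer sketch, assuming the `kafka-python` client (broker address and topic name are placeholders):

```python
# pip install kafka-python
from kafka import KafkaProducer

# Placeholder broker; in production you would list your actual Kafka brokers.
producer = KafkaProducer(bootstrap_servers="kafka-broker:9092")

# Events land in the topic and are retained even if NiFi is down; the
# redeployed flow's ConsumeKafka processor resumes from the committed offset.
producer.send("nifi-ingest", value=b'{"sensor": "s1", "reading": 3.14}')
producer.flush()
```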

2. Ranger, Knox, Atlas, and others are part of our security suite and are fairly lightweight relative to our streaming and flow management services. So yes, while you do have to run Ranger for your security needs, it should not get in your way in terms of a significant performance impact. They are excellent solutions for providing security, data lineage/provenance, user authentication/authorization, and perimeter security, so they are worth their footprint.

There are several different configuration options you'll need to look into to configure your system for production, and yes, those configuration files are a part of it.

Here is some documentation on how you can use those files and what else you'll need to consider for your security setup:

File details: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.0/bk_administration/content/multi-tenant-auth...

NiFi Administration overview: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.0/bk_administration/content/authorizers-setup...

Expert Contributor

Awesome! Thanks.