Created 07-04-2016 08:06 PM
What are the best practices for promoting HDF code (xmls) from Development to Production?
Created 07-04-2016 09:15 PM
There are generally two things to consider. First, the configuration of a given dataflow. Second, the code required for a given dataflow.
If your deployed artifacts in dev and prod are aligned, then your focus is on configuration. For this, NiFi supports templates, which you can generate in dev and import into production. Templates are a great start, but they currently have two downsides: they won't copy sensitive properties, and they can be too tightly coupled to environment-specific items like database URLs or web URLs. There are efforts underway in the Apache NiFi community to make them more portable via environment variable mappings: the mappings tie to a given environment, and the templates then refer to the mappings.
In the case where you also need to get new deployment artifacts into production, we benefit from Apache NiFi's support for easily deployed NiFi Archives (NARs), which neatly bundle the code and its dependencies, so it is generally as easy as moving a new NAR bundle into the lib directory. Typically people will have an 'extensions' folder and place their items in there. On restart, NiFi will read its configured classpath location(s) and make that new code available. In a cluster, people typically add these items to all nodes and then do a rolling restart of the nodes to avoid any downtime for the flow.
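As a rough illustration of the "add the NARs to each node" step, here is a minimal Python sketch that copies built NAR files into an extensions folder. The function name `stage_nars` and both directory paths are hypothetical; it assumes the target folder is one NiFi has been configured to scan (for example through a `nifi.nar.library.directory.<name>` entry in nifi.properties), and it does not restart NiFi for you.

```python
import shutil
from pathlib import Path


def stage_nars(build_dir, extensions_dir):
    """Copy every .nar file from build_dir into extensions_dir.

    Returns the sorted list of file names that were copied.
    Both paths are placeholders for your own layout.
    """
    src = Path(build_dir)
    dest = Path(extensions_dir)
    # Create the extensions folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for nar in sorted(src.glob("*.nar")):
        # copy2 preserves timestamps, which helps when comparing nodes later.
        shutil.copy2(nar, dest / nar.name)
        copied.append(nar.name)
    return copied
```

You would run something like this once per node (or against a mounted or synced directory) and then restart the nodes one at a time.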
Created 07-05-2016 03:20 PM
Thank you for answering.
I was thinking of taking NiFi templates from dev and passing them through a perl/shell/awk script that changes the properties before promoting them to production. But you are saying that a template does not copy some of the sensitive properties, so it seems this approach would be unnecessarily complex. Is there any way to guarantee that the template will copy all the properties (even blank values will do)? What are your thoughts? Also, is there a list of the properties it omits?
Created 07-05-2016 03:28 PM
One way of doing this is to push templates created in a dev instance onto a production instance of NiFi. This would usually be done through scripted API calls. NiFi deliberately avoids including sensitive properties like passwords and connection strings in the template files. However, given that these are likely to change in a production environment anyway, this is more a benefit than a drawback. A good way to handle this is to use the API again to populate the production properties once the template has been deployed.
A good starting point for this would be to take a look at https://github.com/aperepel/nifi-api-deploy, which provides a script, configured with a YAML file, that deploys templates and then updates properties in a production instance. This will obviously be a lot cleaner once the community has completed the variable registry effort, but it will give you a good solution for now.
As Joe points out, it is also important to ensure you copy up any custom processors you have in NAR bundles as well, but that's just a file copy and restart (and they should be kept in a custom folder, as Joe suggests, to make upgrades easier).
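For the script-based idea raised earlier in the thread, the following is a minimal Python sketch of rewriting environment-specific property values in an exported template before importing it into production. It assumes the template XML stores each property as an `<entry>` element with `<key>` and `<value>` children; check that against a template exported from your own NiFi version. The function name `apply_overrides` and the override keys are hypothetical. Note that it matches on property name only, so every processor in the template with that property name is changed, and that sensitive values are still better set through the API after import, as described above.

```python
import xml.etree.ElementTree as ET


def apply_overrides(template_xml, overrides):
    """Replace property values in a template XML string.

    overrides maps a property name to its production value.
    Returns (new_xml, number_of_entries_changed).
    Assumes properties are serialized as <entry><key/><value/></entry>.
    """
    root = ET.fromstring(template_xml)
    changed = 0
    for entry in root.iter("entry"):
        key = entry.find("key")
        if key is None or key.text not in overrides:
            continue
        value = entry.find("value")
        if value is None:
            # The export left this property without a value; add one.
            value = ET.SubElement(entry, "value")
        value.text = overrides[key.text]
        changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

A call might look like `new_xml, count = apply_overrides(dev_xml, {"Database Connection URL": "jdbc:postgresql://prod-db/flow"})`, where the URL is a placeholder. Checking that `count` matches the number of properties you expected to change is a cheap guard against a renamed property silently slipping through.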