Objective

With the release of NiFi Registry 0.2.0, flow contents can now be stored under a Git directory using the new GitFlowPersistenceProvider.

This tutorial walks you through how to configure this provider in NiFi Registry so that versioned flows in NiFi are automatically saved to a Git repository.

A video version of this tutorial can be seen here: https://youtu.be/kK7eVppg9Aw

Environment

This tutorial was tested using the following environment and components:

  • Mac OS X 10.11.6
  • Apache NiFi Registry 0.2.0
  • Apache NiFi 1.7.1

GitFlowPersistenceProvider

Git Configuration

First, create a new GitHub repo:

85586-1-creategitrepo.png

Then clone it locally using the git clone command (e.g. git clone https://github.com/andrewmlim/versioned_flows.git):

85587-2-gitclonerepo.png
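A quick sanity check after cloning can be sketched as follows. The function name check_clone is hypothetical, and the directory name versioned_flows is assumed from the example clone URL above; only standard git subcommands are used.

```shell
#!/bin/sh
# check_clone DIR (hypothetical helper): succeed only if DIR is a Git
# working tree that has a remote named "origin".
check_clone() {
    dir="$1"
    # rev-parse fails if DIR is missing or is not inside a Git working tree
    git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 1
    # get-url fails if no remote called "origin" is configured
    git -C "$dir" remote get-url origin >/dev/null 2>&1 || return 1
}

# Example usage (directory name assumed from the clone command above):
#   check_clone ./versioned_flows && echo "clone looks usable"
```

If the check fails, the directory is either not a clone or its remote has a different name; git remote -v run inside the directory lists the remotes that are configured.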

Next, go to GitHub’s “Developer settings” and create a new “Personal access token”:

85588-3-createpersonalaccesstoken.png

85589-4-remoteaccesspassword.png

NiFi Registry Configuration

In the ./conf/providers.xml file (the path is relative to the NiFi Registry installation directory), configure the following properties:

  • Set org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider as the qualified class name
  • Set “Flow Storage Directory” to the directory where the repo was cloned
  • Set “Remote To Push” to origin
  • Set “Remote Access User” to your GitHub username
  • Set “Remote Access Password” to the personal access token

Here is an example of these changes in providers.xml:

<flowPersistenceProvider>
 <class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
 <property name="Flow Storage Directory">./versioned_flows</property>
 <property name="Remote To Push">origin</property>
 <property name="Remote Access User">andrewmlim</property>
 <property name="Remote Access Password">f1295e16f933d4468d948ea276372da8b0585bda</property>
</flowPersistenceProvider>

Note: The “Remote To Push” property specifies the name of the remote to push to automatically. This property is optional; if it is not specified, commits remain in the local repository until a push is performed manually.
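When “Remote To Push” is left unset, a manual push could look roughly like the sketch below. The helper names unpushed_commits and push_current_branch are hypothetical, and the repository path and remote name are assumed from the example configuration; over HTTPS, git will ask for the username and personal access token unless a credential helper is configured.

```shell
#!/bin/sh
# unpushed_commits REPO (hypothetical helper): list commits that are on a
# local branch but not yet on any remote-tracking branch.
unpushed_commits() {
    git -C "$1" log --oneline --branches --not --remotes
}

# push_current_branch REPO REMOTE (hypothetical helper): push whichever
# branch is currently checked out in REPO to REMOTE.
push_current_branch() {
    repo="$1"
    remote="$2"
    # symbolic-ref prints the checked-out branch name, e.g. "master"
    branch=$(git -C "$repo" symbolic-ref --short HEAD) || return 1
    git -C "$repo" push "$remote" "$branch"
}

# Example usage (path and remote assumed from the providers.xml example):
#   unpushed_commits ./versioned_flows
#   push_current_branch ./versioned_flows origin
```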

Saving a Versioned Flow to the Git Repo

Start up NiFi Registry and create a bucket:

85590-5-registrybucket1.png

Start up a NiFi instance and connect to the Registry:

85591-6-nifiregistryclient.png

Create a process group. Start version control:

85592-7-startversioncontrol.png

Save the flow:

85593-8-saveflow.png

In GitHub, you will see that the Bucket and Flow have been saved in your repo:

85594-9-gitrepocontents.png

As shown, Buckets are represented as directories, and Flow contents are stored as files in the directory of the Bucket they belong to. Flow snapshot histories are managed as Git commits, meaning only the latest version of each Bucket and Flow exists in the Git directory; earlier versions remain available in the Git commit history.

85595-10-gitcommit.png

Note: The commit message states "By NiFi Registry user: anonymous" because the environment was unsecured and no user was logged in to NiFi. In a secured environment, the commit message would contain the user's identity.
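Because each saved version is a Git commit, earlier versions of a flow can be inspected with ordinary git commands. The following is a minimal sketch; the helper names flow_history and flow_version_at are hypothetical, and the file path in the usage example is a placeholder rather than the exact name the provider writes.

```shell
#!/bin/sh
# flow_history REPO FILE (hypothetical helper): list the commits that
# touched one flow file, newest first, as "<short hash> <subject>".
flow_history() {
    git -C "$1" log --format='%h %s' -- "$2"
}

# flow_version_at REPO COMMIT FILE (hypothetical helper): print the flow
# file exactly as it was stored at COMMIT. FILE is relative to the repo root.
flow_version_at() {
    git -C "$1" show "$2:$3"
}

# Example usage (bucket and file names are placeholders):
#   flow_history ./versioned_flows "My_Bucket/My_Flow.snapshot"
#   flow_version_at ./versioned_flows <commit> "My_Bucket/My_Flow.snapshot"
```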

Helpful Links

Here are some helpful links that were used as references for this article:

Comments
Rising Star

I cannot do something similar with the Docker image apache/nifi-registry:latest, because:

  1. I cannot mount a folder, or
  2. I cannot install git inside the container.

Do you have any suggestions?

thuylevn_1-1619796601293.png

 

 

thuylevn_0-1619796566096.png

 

Explorer

I have the same issue as @andP with the GitHub token. Any help, please?

@alim 

Explorer

@thuylevn I customized the base image by adding the updated providers.xml file to it.

This is my Dockerfile content:

FROM apache/nifi-registry:0.8.0
COPY registry/providers.xml /opt/nifi-registry/nifi-registry-current/conf
EXPOSE 18080

The image is then built with:

>docker build -t myimage/nifi-registry:0.8.0.1 .

Explorer

@alim Thanks for the details on the Git integration; it's working fine. However, it always pushes changes to the default branch (master) configured on Git. Is there a way to specify the branch to clone and push to, irrespective of the default branch? We run the Registry as a k8s container, so it gets restarted sometimes, and it always pulls the default branch. I tried setting the property below to the branch name "origin/develop", but it complains that only "origin" was found in the remotes. Please share your thoughts; I'd appreciate anyone's help on this.

 

<flowPersistenceProvider>
<class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
<property name="Flow Storage Directory">./versioned_flows</property>
<property name="Remote To Push">origin/develop</property>
<property name="Remote Access User">my-gituserid</property>
<property name="Remote Access Password">my-token</property>
<property name="Remote Clone Repository">https://github.com/myrepo-name.git</property>
</flowPersistenceProvider>

 

Master Collaborator

@Ven5 

- Log in to Git in the browser and create a branch called 'develop'.

- Log in to the server (node) where the Registry is running and go to the '/versioned_flows' directory.

- Clone the Git repo with the newly created branch:
   >git clone -b develop https://github.com/myrepo-name.git

That's it. You should see your templates being committed to the 'develop' branch instead of the default master branch.
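The steps above can be sketched as a small script. It assumes, as this reply suggests, that the provider commits to whichever branch is checked out in the storage directory; the helper names clone_branch and current_branch are hypothetical. Passing the target directory explicitly makes git place the clone in that directory instead of in a subdirectory named after the repo.

```shell
#!/bin/sh
# clone_branch URL BRANCH DIR (hypothetical helper): clone URL into DIR
# with BRANCH checked out instead of the remote's default branch.
clone_branch() {
    url="$1"
    branch="$2"
    dir="$3"
    git clone -b "$branch" "$url" "$dir"
}

# current_branch DIR (hypothetical helper): print the branch checked out in DIR.
current_branch() {
    git -C "$1" symbolic-ref --short HEAD
}

# Example usage (URL taken from the command above, target directory assumed):
#   clone_branch https://github.com/myrepo-name.git develop ./versioned_flows
#   current_branch ./versioned_flows
```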

Master Collaborator

Hello @alim ,

Thanks for the detailed explanation.

Do you know of any solution to this problem?
https://community.cloudera.com/t5/Support-Questions/NiFi-registry-commits-templates-to-git-with-defa...

 

Thanks

Mahendra

New Contributor

Hello @alim,

 

Can you please indicate where I can find the ./conf/providers.xml file?

Also, regarding this: "Flow snapshot histories are managed as Git commits, meaning only the latest version of Buckets and Flows exist in the Git directory." Is there any way to save all versions of a specific flow in Git?

 

Thank you!