
NiFi Registry Flow Persistence Provider switch from File System to GIT - Unable to carry existing buckets to GIT ?



Hello experts,

 

I have a NiFi Registry that uses a MySQL database as the metadata store and the FileSystem flow persistence provider.

This registry holds 10+ flows, each with multiple versions.

 

I am now trying to integrate the registry with Git, that is, to change the flow persistence provider from FileSystem to Git.

 

I followed some docs (URLs below) on switching the persistence provider. They all cover a fresh Git integration setup; they won't carry over (migrate) the buckets, flows, and versions already created under FileSystem to Git.

 

https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#switching-from-other-...

 

https://community.cloudera.com/t5/Community-Articles/Storing-Apache-NiFi-Versioned-Flows-in-a-Git-Re...

 

Any help on this would be much appreciated.

 

Thanks

Mahendra

1 ACCEPTED SOLUTION


Re: NiFi Registry Flow Persistence Provider switch from File System to GIT - Unable to carry existing buckets to GIT ?


@hegdemahendra 

The NiFi CLI toolkit [1] can help here to an extent.

This toolkit provides the following NiFi-Registry capabilities:

registry current-user
registry list-buckets
registry create-bucket
registry delete-bucket
registry list-flows
registry create-flow
registry delete-flow
registry list-flow-versions
registry export-flow-version
registry import-flow-version
registry sync-flow-versions
registry transfer-flow-version
registry diff-flow-versions
registry upload-bundle
registry upload-bundles
registry list-bundle-groups
registry list-bundle-artifacts
registry list-bundle-versions
registry download-bundle
registry get-bundle-checksum
registry list-extension-tags
registry list-extensions
registry list-users
registry create-user
registry update-user
registry list-user-groups
registry create-user-group
registry update-user-group
registry get-policy
registry update-policy
registry update-bucket-policy

You can get a description of each command by appending -h to it, for example:

 

<path to>/cli.sh registry sync-flow-versions -h

 

Since you are changing flow persistence providers rather than syncing flows to a new NiFi-Registry, you can't really use the "sync-flow-versions" function above. Even in that scenario I don't see it accomplishing your goal, because you would end up with new flow ids.

When you create a bucket in NiFi-Registry, it is assigned a bucket id (a random UUID).
When you version control a Process Group (PG) in NiFi, you choose an existing bucket, and the registry first creates a new flow id (a random UUID assigned to the flow). Then the initial version 1 of that PG flow is created and assigned to that flow id in the NiFi-Registry. Since you cannot force the UUID that gets assigned as the flow id, flows synced from registry 1 to registry 2 would not track to your version-controlled PGs in NiFi, because the flow id changes.

In your scenario, you would need to export all your flows version by version, and it is important that you keep track of the version of each flow you extract. For a flow with ID XYZ you may have 6 versions, which means running the following once per version:

registry export-flow-version

I'd suggest naming each produced JSON file after the source flow id and the flow version, like XYZ_v1.json, XYZ_v2.json, etc.
Example:

 

./cli.sh registry export-flow-version -ot json -u http://<nifi-registry hostname>:<port>/ -f c97fd570-e2ef-4001-98c9-8810244b6015 -fv 1 -o /tmp/c97fd570-e2ef-4001-98c9-8810244b6015_ver1.json
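Repeating that export for every version by hand is error-prone, so here is a minimal Python sketch that only builds the command lines, one per version, using the same flags as the example above. The function name `export_commands` and its parameters are made up for illustration, and the number of versions is something you would read off `registry list-flow-versions` yourself.

```python
def export_commands(cli, registry_url, flow_id, version_count, out_dir):
    """Build one export-flow-version command (as an argv list) per version.

    Versions run from 1 to version_count. Each output file is named
    <flow id>_ver<version>.json so the source flow id and version stay
    recoverable from the file name alone.
    """
    commands = []
    for version in range(1, version_count + 1):
        out_file = f"{out_dir}/{flow_id}_ver{version}.json"
        commands.append([
            cli, "registry", "export-flow-version",
            "-ot", "json",
            "-u", registry_url,
            "-f", flow_id,
            "-fv", str(version),
            "-o", out_file,
        ])
    return commands
```

Each returned list can be printed for review or handed to `subprocess.run(argv, check=True)`. Building the commands separately from running them makes it easy to check the full list before touching the registry.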

 


You should then save off (back up) your original DB.
Delete all existing flows so that all you have left are your original buckets.
Then, after switching to your new persistence provider, import all the exported flows back into the registry. Keep in mind that before importing any flow version you must first create a new flow within the correct, still-existing bucket. Keep track of these newly assigned flow ids and of which original flow id you are importing into each (very important).
You MUST import the versions of each flow in exact order, from version 1 to version x. If you import version 5 of flow XYZ first, it will become version 1 within that new flow id. The version stored in the exported JSON is not used on import; each import is assigned the next incremental version in the new flow id.
Once you are done, you have a set of new flow ids with all your versions imported.
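Because the import order decides the version numbers, it helps to sort the exported files numerically (a plain alphabetical sort puts ver10 before ver2) and to refuse to continue when a version is missing. The sketch below assumes files named `<flow id>_ver<N>.json` as in the export example. The helper names are hypothetical, and the `-f` / `-i` flags used for `import-flow-version` are my recollection of the CLI, so confirm them with `registry import-flow-version -h` before running anything.

```python
import re

# Matches names like c97fd570-..._ver12.json (base names only, no directories).
_EXPORT_NAME = re.compile(r"^(?P<flow_id>.+)_ver(?P<version>\d+)\.json$")


def ordered_exports(file_names, old_flow_id):
    """Return the export files of one original flow, sorted by numeric version."""
    found = []
    for name in file_names:
        match = _EXPORT_NAME.match(name)
        if match and match.group("flow_id") == old_flow_id:
            found.append((int(match.group("version")), name))
    found.sort()
    versions = [version for version, _ in found]
    # Importing with a gap or duplicate would silently renumber later versions.
    if versions != list(range(1, len(versions) + 1)):
        raise ValueError(f"missing or duplicate versions for {old_flow_id}: {versions}")
    return [name for _, name in found]


def import_commands(cli, registry_url, new_flow_id, in_dir, ordered_files):
    """Build import-flow-version commands in the given order.

    Assumption: -f is the target flow id and -i the input file.
    """
    return [
        [cli, "registry", "import-flow-version",
         "-u", registry_url, "-f", new_flow_id, "-i", f"{in_dir}/{name}"]
        for name in ordered_files
    ]
```

The new flow id passed to `import_commands` is the one the registry assigned when you created the replacement flow. Recording each original id against its new id in a dictionary, for example `{"<original flow id>": "<new flow id>"}`, gives you the mapping needed later for flow.xml.gz.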

Now you need to go edit your flow.xml.gz in NiFi.
For every version controlled PG in that flow.xml.gz you will find a section that looks like this:

 

        <versionControlInformation>
          <registryId>912e8161-0176-1000-ffff-ffff98135aca</registryId>
          <bucketId>0cab84ff-399b-4113-9767-687e8e33e48a</bucketId>
          <bucketName>bucket-name</bucketName>
          <flowId>136b3ba8-bc6f-46dd-afe5-235a80ef8cfe</flowId>
          <flowName>flow-name</flowName>
          <flowDescription/>
          <version>5</version>
        </versionControlInformation>

 

Everything here should remain the same except for the "flowId".
This lets you do a global search and replace of "<flowId>original id</flowId>" with "<flowId>new id</flowId>".
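As an illustration of that search and replace, here is a small Python sketch that rewrites the `<flowId>` elements inside a gzipped flow file, given a mapping of original flow id to new flow id. The function name and the `.bak` backup suffix are my own choices, not anything NiFi prescribes, and it should only be run against a copy while NiFi is stopped.

```python
import gzip
import shutil


def remap_flow_ids(flow_xml_gz, id_map, backup_suffix=".bak"):
    """Replace <flowId>old</flowId> with <flowId>new</flowId> in a gzipped XML file.

    id_map maps original flow id -> new flow id. A backup copy is written
    first. Returns how many occurrences were replaced for each original id.
    """
    path = str(flow_xml_gz)
    shutil.copy2(path, path + backup_suffix)

    # Work on bytes so nothing but the matched tags is altered.
    with gzip.open(path, "rb") as handle:
        data = handle.read()

    counts = {}
    for old_id, new_id in id_map.items():
        old_tag = f"<flowId>{old_id}</flowId>".encode("utf-8")
        new_tag = f"<flowId>{new_id}</flowId>".encode("utf-8")
        counts[old_id] = data.count(old_tag)
        data = data.replace(old_tag, new_tag)

    with gzip.open(path, "wb") as handle:
        handle.write(data)
    return counts
```

A count of 0 for an original id means no PG in that file was tracking that flow, which is worth checking against your mapping before restarting NiFi.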

Make sure you stop all NiFi nodes, put the same modified flow.xml.gz on all nodes (back up the original first), and start the NiFi nodes again. Your PGs should now be tracking to the new flows imported into your registry, which is now backed by the GitFlowPersistenceProvider.
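One way to confirm that every node really received the same modified file is to compare checksums. This is a minimal sketch that assumes the per-node copies are reachable as local paths (for example over a mount, or after copying them back); the helper names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(reference, copies):
    """Return the paths in copies whose content differs from the reference file."""
    expected = sha256_of(reference)
    return [path for path in copies if sha256_of(path) != expected]
```

An empty list from `mismatched_copies` means all copies are byte-for-byte identical to the reference flow.xml.gz.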


[1] https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#nifi_CLI

 

Sorry, there is no automated path for this.
If you found this addressed your query, please take a moment to log in and click "Accept" on those solutions which assisted you.
Thanks,

Matt



Re: NiFi Registry Flow Persistence Provider switch from File System to GIT - Unable to carry existing buckets to GIT ?


It's only for new saves.

You probably need to export those flows with the NiFi CLI and re-import them after adding Git and restarting.
