Member since 04-05-2016
139 Posts
144 Kudos Received
16 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 50666 | 02-14-2019 02:53 PM |
02-20-2019
04:13 PM
Hi @Nitin Damle It would be helpful if you selected "Reply" on each of my comments instead of adding new Answers, which makes this question harder to read. Understandably, the "Reply" link is very small and hard to see. Are you modifying the template before uploading it into NiFi? That was the issue in the other HCC article you referenced (https://community.hortonworks.com/questions/226390/nifi-18-error-updateattribute-is-not-known-to-this.html). Because the template was malformed, it gave that error.
02-15-2019
03:15 PM
1 Kudo
@Nitin Damle convert-cvs-to-xml.xml If you are still having problems uploading the template, can you confirm what version of NiFi you are on? I just did a test in NiFi 1.8.0 and the upload was successful.
02-14-2019
02:53 PM
1 Kudo
Hi @Nitin Damle You can use the ConvertRecord processor (with XMLReader/CSVRecordSetWriter controller services). I have an article here that shows CSV to XML (with CSVReader/XMLRecordSetWriter controller services), so you just need to do the inverse of that record transformation: https://community.hortonworks.com/content/kbentry/199310/xml-record-writer-in-apache-nifi-170.html -Andrew
10-24-2018
03:15 PM
2 Kudos
Objective
In a separate HCC article, I detailed how to offload nodes using the NiFi UI. This article covers how to perform the same operation using the NiFi Toolkit CLI.
Note: This tutorial assumes you have already set up a running NiFi cluster. The examples included are for a 2 node NiFi cluster.
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi 1.8.0
Apache NiFi Toolkit 1.8.0
Offload Nodes via CLI Tutorial
Queue FlowFiles
First, generate some queued flowfiles:
The queued flowfiles are distributed between both nodes in the cluster:
Disconnect Node
From a terminal window, navigate to the directory where the NiFi Toolkit was installed. From the /nifi-toolkit-1.8.0/bin directory, run:
./cli.sh
Enter the command nifi get-nodes to retrieve the node IDs:
#> nifi get-nodes
# Node ID Node Address API Port Node Status
- ------------------------------------ ------------ -------- -----------
0 dfa62636-479b-4456-aa16-3bfc19f35cb5 localhost 9443 CONNECTED
1 20a68479-e8dc-46c9-8612-872200e2fdab localhost 9444 CONNECTED
Note: If you see the error "ERROR: Error executing command 'get-nodes' : Error retrieving node status: Unable to view the controller. Contact the system administrator.", you need to add "CN=localhost, OU=NIFI" to the "access the controller" view policy.
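The node ID needed by the later commands can also be pulled out of the get-nodes listing programmatically. Here is a minimal Python sketch that assumes the output keeps the column layout shown above (index, node ID, address, API port, status). The helper names parse_nodes and node_id_for_port are made up for this example and are not part of the NiFi Toolkit.

```python
import re

# Sample text copied from the get-nodes output shown above.
SAMPLE = """\
#   Node ID                                Node Address   API Port   Node Status
-   ------------------------------------   ------------   --------   -----------
0   dfa62636-479b-4456-aa16-3bfc19f35cb5   localhost      9443       CONNECTED
1   20a68479-e8dc-46c9-8612-872200e2fdab   localhost      9444       CONNECTED
"""

# A data row is: index, UUID, address, port, status, separated by whitespace.
ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"\s+(\S+)\s+(\d+)\s+(\S+)\s*$"
)


def parse_nodes(text):
    """Return one dict per node row; header and separator lines are skipped."""
    nodes = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            _, node_id, address, port, status = match.groups()
            nodes.append(
                {"id": node_id, "address": address, "port": int(port), "status": status}
            )
    return nodes


def node_id_for_port(text, port):
    """Return the ID of the node using the given API port, or None if absent."""
    for node in parse_nodes(text):
        if node["port"] == port:
            return node["id"]
    return None
```

Calling node_id_for_port(SAMPLE, 9444) gives the ID of the node that the rest of this tutorial disconnects and offloads.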
Use the command nifi disconnect-node to disconnect the node on port 9444:
#> nifi disconnect-node -nnid 20a68479-e8dc-46c9-8612-872200e2fdab
Node ID: 20a68479-e8dc-46c9-8612-872200e2fdab
Node Address: localhost
API Port: 9444
Node Status:DISCONNECTED~
The node is disconnected:
Note: If you see the error "ERROR: Error executing command 'disconnect-node' : Error disconnecting node: Unable to modify the controller. Contact the system administrator.", you need to add "CN=localhost, OU=NIFI" to the "access the controller" modify policy.
Offload Node
Use the command nifi offload-node to offload the flowfiles from the node on port 9444:
#> nifi offload-node -nnid 20a68479-e8dc-46c9-8612-872200e2fdab
Node ID: 20a68479-e8dc-46c9-8612-872200e2fdab
Node Address: localhost
API Port: 9444
Node Status:OFFLOADING~
Repeating the command returns the current status of the node:
#> nifi offload-node -nnid 20a68479-e8dc-46c9-8612-872200e2fdab
Node ID: 20a68479-e8dc-46c9-8612-872200e2fdab
Node Address: localhost
API Port: 9444
Node Status:OFFLOADED~
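Because the status moves from OFFLOADING to OFFLOADED on its own, a script can keep checking until it changes. The sketch below is one way to do that in Python. It only parses the "Node Status:" line as printed above. The fetch_output argument is a hypothetical callable that you would supply yourself (for example, one that runs the CLI and returns its text); it is not a NiFi Toolkit function.

```python
import re
import time

# Matches the status word in a line such as "Node Status:OFFLOADING~".
STATUS = re.compile(r"Node Status:\s*([A-Z_]+)")


def parse_status(output):
    """Extract the node status from CLI output, or None if no status line exists."""
    match = STATUS.search(output)
    return match.group(1) if match else None


def wait_for_status(fetch_output, target="OFFLOADED", attempts=30,
                    delay_seconds=2.0, sleep=time.sleep):
    """Call fetch_output() until the parsed status equals target.

    fetch_output is a caller-supplied (hypothetical) function returning the
    CLI text for one status check. Returns True if the target status was
    seen within the allowed attempts, otherwise False.
    """
    for attempt in range(attempts):
        if parse_status(fetch_output()) == target:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The attempt count and delay are arbitrary defaults; a node with large queues may need a longer wait.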
The UI confirms that all the flowfiles have been offloaded to the active node:
Delete & Decommission Node
An offloaded node can be either connected back to the cluster (nifi connect-node) or deleted. If the goal is to decommission the node, use the nifi delete-node command:
#> nifi delete-node -nnid 20a68479-e8dc-46c9-8612-872200e2fdab
OK
Then after the node is deleted, stop/remove the NiFi service on the host.
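The three commands used in this tutorial always run in the same order, so they can be generated from a node ID. The Python sketch below only builds the argument lists and does not execute anything. It assumes that cli.sh accepts a command as arguments instead of through the interactive prompt used above, and it leaves out any connection options your environment may need. The function name decommission_commands is made up for this example.

```python
def decommission_commands(node_id, cli="./cli.sh"):
    """Return argument lists for disconnect, offload, and delete, in that order.

    Assumption: cli.sh can be invoked non-interactively with the command as
    arguments. Connection and security options are intentionally omitted.
    """
    steps = ["disconnect-node", "offload-node", "delete-node"]
    return [[cli, "nifi", step, "-nnid", node_id] for step in steps]
```

Following the order of the tutorial, the delete step should only be run once the node reports OFFLOADED.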
10-24-2018
02:56 PM
5 Kudos
Objective
With the release of NiFi 1.8.0, flowfiles that remain on a disconnected node can be rebalanced to other active nodes in the cluster via offloading.
Note: This tutorial assumes you have already set up a running NiFi cluster. The examples included are for a 2 node NiFi cluster.
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi 1.8.0
Offload Nodes Tutorial
Queue FlowFiles
First, generate some queued flowfiles:
In the Global Menu at the top left, select "Cluster" to see the Cluster Management dialog:
The queued flowfiles are distributed between both nodes in the cluster:
Disconnect Node
Select Disconnect for one of the nodes:
The node is disconnected:
Offload Node
Select Offload on the disconnected node. This will stop and terminate all processors and rebalance flowfiles to the other connected nodes in the cluster:
Note: Offload would also stop transmitting on all remote process groups if they were in the flow.
When offloading is finished, all of the queued flowfiles are now on the active node:
Delete & Decommission Node
An offloaded node can be either connected back to the cluster or deleted. Select Delete:
Once deleted, the node cannot be rejoined to the cluster until it has been restarted. To decommission the node, stop/remove the NiFi service on the host.
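The steps in this tutorial can be read as a small state machine. The Python sketch below encodes only the transitions performed above. NiFi itself may allow other transitions, and the "finish" action and "DELETED" state are labels invented for this illustration, not NiFi terms.

```python
# Transitions walked through in this tutorial only; NiFi may permit others.
TRANSITIONS = {
    ("CONNECTED", "disconnect"): "DISCONNECTED",
    ("DISCONNECTED", "offload"): "OFFLOADING",
    # "finish" is a made-up label for offloading completing on its own.
    ("OFFLOADING", "finish"): "OFFLOADED",
    ("OFFLOADED", "connect"): "CONNECTED",
    # "DELETED" is a made-up label for a node removed from the cluster.
    ("OFFLOADED", "delete"): "DELETED",
}


def apply(state, action):
    """Return the next state, or raise ValueError for a step not shown here."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"{action!r} is not a step this tutorial performs from {state!r}"
        ) from None


def run(actions, state="CONNECTED"):
    """Apply a sequence of actions starting from the given state."""
    for action in actions:
        state = apply(state, action)
    return state
```

For example, run(["disconnect", "offload", "finish", "delete"]) follows the decommission path, while ending with "connect" instead returns the node to the cluster.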
09-24-2018
07:31 PM
If the Localhost user is not configured properly in the Registry, here is an error you will see in NiFi when trying to start version control on a process group:
08-09-2018
04:10 PM
10 Kudos
Objective
With the release of NiFi Registry 0.2.0, flow contents can now be stored under a Git directory using the new GitFlowPersistenceProvider.
This tutorial walks you through how to configure this provider in NiFi Registry so that versioned flows in NiFi are automatically saved to a Git repository.
A video version of this tutorial can be seen here: https://youtu.be/kK7eVppg9Aw
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi Registry 0.2.0
Apache NiFi 1.7.1
GitFlowPersistenceProvider
Git Configuration
First, create a new GitHub repo:
Then clone it locally using the git clone command (e.g. git clone https://github.com/andrewmlim/versioned_flows.git):
Next, go to GitHub’s “Developer settings” and create a new “Personal access token”:
NiFi Registry Configuration
In the ./conf/providers.xml file, configure the following properties:
Set org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider as the qualified class name
Set “Flow Storage Directory” to the directory where the repo was cloned
Set “Remote To Push” to origin
Set “Remote Access User” to your GitHub username
Set “Remote Access Password” to the personal access token
Here is an example of these changes in providers.xml:
<flowPersistenceProvider>
<class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
<property name="Flow Storage Directory">./versioned_flows</property>
<property name="Remote To Push">origin</property>
<property name="Remote Access User">andrewmlim</property>
<property name="Remote Access Password">f1295e16f933d4468d948ea276372da8b0585bda</property>
</flowPersistenceProvider>
Note: The “Remote To Push” property specifies the name of the remote to automatically push to. This property is optional and if not specified, commits will remain in the local repository unless a push is performed manually.
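Before restarting NiFi Registry, it can help to check that the provider element says what you expect. The Python sketch below reads a flowPersistenceProvider element and reports whether commits will be pushed, based on the "Remote To Push" behavior described in the note above. The function names, the sample values, and the wording of the returned messages are assumptions made for this example. The check is illustrative and is not how NiFi Registry validates its configuration.

```python
import xml.etree.ElementTree as ET

GIT_PROVIDER = "org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider"

# Sample fragment with placeholder credentials (not real values).
SAMPLE = """\
<flowPersistenceProvider>
    <class>org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider</class>
    <property name="Flow Storage Directory">./versioned_flows</property>
    <property name="Remote To Push">origin</property>
    <property name="Remote Access User">YOUR_USERNAME</property>
    <property name="Remote Access Password">YOUR_TOKEN</property>
</flowPersistenceProvider>
"""


def read_flow_provider(xml_text):
    """Return (class name, {property name: value}) for the provider element."""
    root = ET.fromstring(xml_text)
    # Accept either the bare fragment or a document that wraps it.
    if root.tag == "flowPersistenceProvider":
        provider = root
    else:
        provider = root.find(".//flowPersistenceProvider")
    if provider is None:
        raise ValueError("no flowPersistenceProvider element found")
    class_name = (provider.findtext("class") or "").strip()
    props = {p.get("name"): (p.text or "").strip() for p in provider.findall("property")}
    return class_name, props


def describe(xml_text):
    """Summarize where commits end up, following the note on 'Remote To Push'."""
    class_name, props = read_flow_provider(xml_text)
    if class_name != GIT_PROVIDER:
        return "not using the Git provider"
    if not props.get("Remote To Push"):
        return "commits stay in the local repository"
    return f"commits are pushed to remote {props['Remote To Push']!r}"
```

Property names are matched exactly as written, so a name such as "Remote to Push" with a lowercase "to" would be reported as missing by this sketch.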
Saving a Versioned Flow to the Git Repo
Start up NiFi Registry and create a bucket:
Start up a NiFi instance and connect to the Registry:
Create a process group. Start version control:
Save the flow:
In GitHub, you will see that the Bucket and Flow have been saved in your repo:
As shown, Buckets are represented as directories and Flow contents are stored as files in the Bucket directory they belong to. Flow snapshot histories are managed as Git commits, meaning only the latest version of each Bucket and Flow exists in the Git directory.
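Since Buckets are directories and Flows are files inside them, the local clone can be inspected with ordinary file system calls. The Python sketch below lists each top-level directory and the files it contains, skipping hidden directories such as .git. The bucket and file names in demo() are invented for the example; the real file names are whatever NiFi Registry writes to your repository.

```python
import tempfile
from pathlib import Path


def list_buckets(repo_dir):
    """Map each bucket directory name to the sorted file names inside it."""
    buckets = {}
    for entry in sorted(Path(repo_dir).iterdir()):
        # Hidden directories such as .git are not buckets.
        if entry.is_dir() and not entry.name.startswith("."):
            buckets[entry.name] = sorted(
                f.name for f in entry.iterdir() if f.is_file()
            )
    return buckets


def demo():
    """Build a throwaway directory tree with made-up names and list it."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / ".git").mkdir()
        (repo / "Example_Bucket").mkdir()
        (repo / "Example_Bucket" / "Example_Flow.txt").write_text("{}")
        return list_buckets(repo)
```

Against a real clone, you would call list_buckets with the directory configured as "Flow Storage Directory".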
Note: The commit message states "By NiFi Registry user: anonymous" since the environment was unsecured and there was no user logged into NiFi. In a secured environment, the commit message would include the user's identity.
Helpful Links
Here are some helpful links that were used as references for this article:
Apache NiFi Registry Administration Guide - Git FlowPersistenceProvider
bryanbende.com - Apache NiFi Registry 0.2.0
06-26-2018
02:18 PM
NiFi 1.7.0 introduced XML Record Reader/Writers. So CSV to XML conversion using a ConvertRecord processor can now be done simply with CSVReader and XMLRecordSetWriter controller services as shown here: https://community.hortonworks.com/content/kbentry/199310/xml-record-writer-in-apache-nifi-170.html
06-25-2018
06:31 PM
3 Kudos
Objective
About a year ago, I wrote an article that detailed how to use the ConvertRecord processor and Record Reader/Writer controller services to easily convert a CSV file into various formats (JSON, Avro, XML): https://community.hortonworks.com/content/kbentry/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html.
At the time, the CSV to XML conversion was done using a ScriptedRecordSetWriter. With the release of NiFi 1.7.0, the CSV to XML conversion can be done much more simply with the new XMLRecordSetWriter.
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi 1.7.0
Convert CSV to XML
Support Files
Here is a template of the flow discussed in this tutorial: convert-cvs-to-xml.xml
Here is the CSV file used in the flow: users.txt
Note: Change the extension from .txt to .csv after downloading.
Demo Configuration
Import Template
Start NiFi. Import the provided template and add it to the canvas.
You should see the following flow:
Note: After importing the template, make sure the directory paths for the GetFile and PutFile processors exist (create them if needed), confirm users.csv is in the input directory, and enable all Controller Services before running the flow:
Flow Highlights
Details of the original flow are covered in my previous HCC article, but here are the key changes made:
ConvertRecord - CSVtoXML (ConvertRecord Processor)
Record Reader is still set to "CSVReader" but Record Writer is now set to the new "XMLRecordSetWriter":
XMLRecordSetWriter Controller Service
Here are the properties for this controller service:
Besides the default values, the Schema Access Strategy property is set to "Use 'Schema Name' Property", Schema Registry is set to AvroSchemaRegistry, and Name of Root Tag is set to "record".
Flow Results
Running this updated flow now produces a flowfile with XML contents:
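For readers who want to see the record-by-record idea outside NiFi, here is a rough Python sketch that reads CSV rows and writes each one as an XML element. It is not what ConvertRecord or XMLRecordSetWriter does internally and it is not the output of the flow above. The function name, tag names, and sample columns are made up for the illustration, and CSV header names must be valid XML element names for it to work.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag, record_tag):
    """Convert CSV text with a header row into a single XML string.

    Each CSV row becomes one <record_tag> element under <root_tag>, with one
    child element per column. Header names are used as element names as-is.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for row in reader:
        record = ET.SubElement(root, record_tag)
        for field, value in row.items():
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data; the real users.csv columns may differ.
EXAMPLE_CSV = "id,name\n1,alice\n2,bob\n"
EXAMPLE_XML = csv_to_xml(EXAMPLE_CSV, root_tag="users", record_tag="user")
```

Unlike the NiFi flow, this sketch applies no schema, so every value is written as plain text.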
04-09-2018
05:56 PM
3 Kudos
Objective
To import a versioned flow or revert local changes in a versioned flow, a user must have access to all the components in the versioned flow. As such, it is recommended that restricted components are created at the root process group level if they are to be utilized in versioned flows. This tutorial illustrates the benefits of this configuration and demonstrates a new feature introduced in Apache NiFi 1.6.0: granular restricted component categories (NIFI-4885). Users can be given access to all restricted components or to specific categories of restricted components.
Note: This tutorial assumes you are familiar with setting up a secure Apache NiFi instance and integrating it with a secure Apache NiFi Registry.
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi 1.6.0
Apache NiFi Registry 0.1.0
User Setup
Assume the following:
There are two users, "sys_admin" and "test_user", who have access to both view and modify the root process group.
"sys_admin" has access to all restricted components.
"test_user" has access to restricted components requiring 'read filesystem' and 'write filesystem'.
Restricted Controller Service Created in Root Process Group
In this first example, sys_admin creates a KeytabCredentialsService controller service (NIFI-4917) at the root process group level:
The KeytabCredentialsService controller service is a restricted component that requires 'access keytab' permissions:
Sys_admin creates a process group ABC containing a flow with GetFile and PutHDFS processors:
GetFile processor is a restricted component that requires 'write filesystem' and 'read filesystem' permissions:
PutHDFS is a restricted component that requires 'write filesystem' permissions:
The PutHDFS processor is configured to use the root process group level KeytabCredentialsService controller service:
Sys_admin saves the process group as a versioned flow:
Test_user changes the flow by removing the KeytabCredentialsService controller service:
If test_user chooses to revert this change:
the revert is successful:
Additionally, if test_user chooses to import the ABC versioned flow:
The import is successful:
Restricted Controller Service Created in Process Group
Now, consider a second scenario where the controller service is created at the process group level.
Sys_admin creates a process group XYZ:
Sys_admin creates a KeytabCredentialsService controller service at the process group level:
The same GetFile and PutHDFS flow is created in the process group:
However, PutHDFS now references the process group level controller service:
Sys_admin saves the process group as a versioned flow.
Test_user changes the flow by removing the KeytabCredentialsService controller service. However, with this configuration, if test_user attempts to revert this change:
the revert is unsuccessful because test_user does not have the 'access keytab' permissions required by the KeytabCredentialsService controller service:
Similarly, if test_user tries to import the XYZ versioned flow:
The import fails:
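The difference between the two scenarios can be summarized as a set comparison: a revert or import succeeds only if the user holds every restricted permission required by the components inside the versioned flow. The Python sketch below models that reasoning with the users and components from this tutorial. It is a simplified illustration, not NiFi's authorization code, and it rests on the assumption that a controller service created at the root level is not one of the components stored in the versioned process group.

```python
def missing_permissions(user_permissions, versioned_components):
    """Return the restricted-component permissions the user lacks for this flow.

    versioned_components maps a component name to the set of permissions it
    requires. An empty result means the revert or import would be allowed
    under this simplified model.
    """
    required = set()
    for permissions in versioned_components.values():
        required |= set(permissions)
    return required - set(user_permissions)


TEST_USER = {"read filesystem", "write filesystem"}

# Process group ABC: the controller service lives in the root process group,
# so (by the assumption above) it is not part of the versioned flow.
ABC = {
    "GetFile": {"read filesystem", "write filesystem"},
    "PutHDFS": {"write filesystem"},
}

# Process group XYZ: the controller service is inside the group and is
# versioned together with the processors.
XYZ = dict(ABC, KeytabCredentialsService={"access keytab"})
```

With these definitions, test_user is missing nothing for ABC but is missing 'access keytab' for XYZ, which matches the successful and failed operations shown above.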