Title | Views | Posted |
2480 | 01-04-2017 02:35 PM | |
1507 | 12-16-2016 06:39 PM |
10:53 AM
@Roger Young -- You are correct: once the template is downloaded, you should be able to delete it and the related part of the flow. I am not sure offhand why you are not seeing data flow from MiNiFi. I am actually using this exercise in a class I am teaching today, and I will be sure to test this and see what happens. What versions of MiNiFi and NiFi are you using?
06:19 PM
@Raphaël MARY Hmm, interesting. It appears that you can reach the host just fine if you are getting back a 401. Have you tried the approach @Matt Clarke mentioned?
06:02 PM
@Raphaël MARY I use the Twitter processor pretty often (though I am not using SSL), and one thing I have often run into that produces the same error is a copy/paste issue with one of the keys or secrets. Can you try copying your keys and secrets to a text editor and then copy/paste them from there into NiFi? Thanks, Andrew
02:35 PM
2 Kudos
@regie canada -- I updated the processor back on December 7th to use NiFi 1.1.0. Which version of NiFi are you using? Is the SOAP endpoint you are using a public one?
09:30 PM
6 Kudos
Extracting NiFi Provenance Data using SiteToSiteProvenanceReportingTask Part 1
In this tutorial, we will learn how to configure NiFi to send provenance data to a second NiFi instance:
Downloading NiFi
Configure Site-To-Site
Setting up the first Flow
Setting up the Provenance Reporting Instance Flow
Adding Site To Site Provenance Reporting
Starting the flow
Inspecting the Provenance data
Next Steps
NOTE: In this tutorial we are going to take the following shortcuts in the spirit of understanding the concepts. Specifically, we are going to:
Run two NiFi instances on the same host.
Not configure security in either NiFi instance or use a secure transport between NiFi hosts.
These are not the best practices that would be recommended in a production environment.
References
For a primer on NiFi, please refer to the NiFi Getting Started Guide.
Downloading and Configuring NiFi
Download the latest version of NiFi from here
Extract the contents of the download; we will refer to this instance as the "first NiFi instance"
Copy the extracted contents to a new directory; we will refer to this as the "Provenance Reporting Instance"
Configuring the Provenance Reporting Instance NiFi
Before starting up this NiFi instance, we need to enable Site-to-Site communication so that it can receive the provenance data, and also change the listening port for NiFi so it does not conflict with the first instance. To do that do the following:
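The individual steps are not reproduced here, so the following is only a rough sketch of the kind of change involved, not the exact settings used in this tutorial. It assumes the relevant keys in the NiFi properties file are nifi.remote.input.socket.port, nifi.remote.input.secure and nifi.web.http.port, and the port values are hypothetical; pick any free ports on your host. The helper set_properties is introduced purely for illustration.

```python
def set_properties(text, updates):
    """Return properties-file text with the given keys set.

    Existing key=value lines are rewritten in place; keys that are not
    present are appended at the end. Comments and blank lines are kept.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                out.append(f"{key}={remaining.pop(key)}")
                continue
        out.append(line)
    # Any key that was not found in the file is appended.
    for key, value in remaining.items():
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Assumed values for the second instance (hypothetical, demo only).
PROVENANCE_INSTANCE_SETTINGS = {
    "nifi.remote.input.socket.port": "10001",  # hypothetical Site-to-Site port
    "nifi.remote.input.secure": "false",       # unsecured transport, demo only
    "nifi.web.http.port": "8088",              # hypothetical UI port, avoids a clash
}

# A small stand-in for the real properties file.
sample = (
    "# web properties\n"
    "nifi.web.http.port=8080\n"
    "nifi.remote.input.socket.port=\n"
    "nifi.remote.input.secure=true\n"
)
updated = set_properties(sample, PROVENANCE_INSTANCE_SETTINGS)
print(updated)
```

To apply this to a real install, you would read the properties file from the Provenance Reporting Instance's conf directory, pass its text through set_properties, and write the result back before starting that instance.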
Starting up both instances of NiFi
We now have the two NiFi instances ready. To start them, do the following:
Navigate to the directory for the first NiFi instance and start it according to your operating system
Navigate to the directory for the Provenance Reporting Instance and start it according to your operating system
Setting up the first Flow for NiFi
Now that we have two NiFi instances up and running, the next thing to do is to create our data flow. To keep things simple, we are going to use one of the sample NiFi Dataflow Templates. In particular, we are going to use the DateConversion flow, which can be downloaded from here
After downloading this template, import it and then create an instance of it. For instructions on how to import a template, please see the Template section of the NiFi user guide. After creating an instance of the template, your NiFi canvas should look similar to this:
Figure 1. DateConversion Flow
I have modified the layout on my canvas so it easily fits on the screen.
Setting up the Provenance Reporting Instance Flow
Open a browser and go to the "Provenance Reporting Instance"
Create an input port called "Prov Data"
Create a LogAttribute Processor
Connect the input port to the LogAttribute Processor
Start the "Prov Data" input port
Your flow should look similar to the following:
Figure 2. Provenance Reporting Instance Flow
Adding Site To Site Provenance Reporting
We are now ready to add the provenance reporting task to the NiFi flow. To do this, do the following:
Go to the "hamburger menu" in the top right of the UI and choose "Controller Settings"
Go to the "Reporting Tasks" tab and click the + (add) icon
Choose the SiteToSiteProvenanceReportingTask
Click on the pencil icon and edit the SiteToSiteProvenanceReportingTask properties so it looks like this:
NOTE: I set the batch size to 1; this is for demo purposes only. In a production environment you would want to adjust this or leave it as the default 1000.
Adjust the settings for the SiteToSiteProvenanceReportingTask so that the run schedule is 5 seconds and not the default 5 minutes.
NOTE: Again this is for demo purposes only. In a production environment you may want to leave this as the default or adjust it accordingly.
Starting the flow
We are now ready to start the DateConversion flow we created earlier. Go ahead and click the start button on the Operate palette.
Inspecting the Provenance data
To inspect the provenance data, go to the Provenance Reporting Instance. With the LogAttribute processor stopped, you should see the flow files build up in the queue between the input port and the LogAttribute processor.
To view the provenance data, do the following:
Right-click on the queue and choose "List queue"
Pick one of the flow files in the queue
Choose "View" to see the content; an example of a formatted provenance event looks like this:
{
  "eventId": "07b4693a-20b1-4a4d-9dc3-37d4c8f93e59",
  "eventOrdinal": 0,
  "eventType": "CREATE",
  "timestampMillis": 1482171900667,
  "timestamp": "2016-12-19T18:25:00.667Z",
  "durationMillis": -1,
  "lineageStart": 1482171900657,
  "componentId": "3fde726d-5cc1-4bb6-9e06-35218a9c58a8",
  "componentType": "GenerateFlowFile",
  "componentName": "GenerateFlowFile",
  "entityId": "47160cde-d484-4292-be3d-476cd4fff1cb",
  "entityType": "org.apache.nifi.flowfile.FlowFile",
  "entitySize": 1024,
  "updatedAttributes": {
    "path": "./",
    "uuid": "47160cde-d484-4292-be3d-476cd4fff1cb",
    "filename": "19180888360764"
  },
  "previousAttributes": {},
  "actorHostname": "",
  "contentURI": "",
  "previousContentURI": "",
  "parentIds": [],
  "childIds": [],
  "platform": "nifi",
  "application": "NiFi Flow"
}
Next Steps
Now that you have data flowing to your Provenance Reporting NiFi instance, you can take that JSON data and send it to any number of destinations for further analysis.
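As one illustration of what that further analysis could look like, here is a minimal Python sketch that parses a provenance event and summarizes it. The field names come from the sample event above; the helper names event_time_utc and count_by_type are made up for this example, and the trimmed event is only a stand-in for whatever your downstream system actually receives.

```python
import json
from collections import Counter
from datetime import datetime, timedelta, timezone


def event_time_utc(event):
    """Convert an event's timestampMillis to an ISO-8601 UTC string."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    moment = epoch + timedelta(milliseconds=event["timestampMillis"])
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def count_by_type(events):
    """Count provenance events per (componentType, eventType) pair."""
    return Counter((e["componentType"], e["eventType"]) for e in events)


# A trimmed copy of the sample event; real events carry more fields.
raw = """
{
  "eventId": "07b4693a-20b1-4a4d-9dc3-37d4c8f93e59",
  "eventType": "CREATE",
  "timestampMillis": 1482171900667,
  "componentType": "GenerateFlowFile",
  "entitySize": 1024
}
"""
event = json.loads(raw)
print(event_time_utc(event))
print(count_by_type([event]))
```

Computing the time from timestampMillis with integer milliseconds avoids floating-point rounding, and the result can be compared against the event's own timestamp field as a sanity check.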
06:39 PM
@Funamizu Koshi Drill does support NameNode HA. You will need to copy hdfs-site.xml to <drill-home-dir>/conf on each node you are running it on, and then restart the drillbit service on each of those nodes.
12:18 AM
1 Kudo
Thanks for the feedback, @mclark. All of your suggestions should now be reflected in the content. Thanks again for the input.
09:45 PM
14 Kudos
Getting started with MiNiFi
In this tutorial, we will learn how to configure MiNiFi to send data to NiFi:
Installing HDF
Installing MiNiFi
Setting up the Flow for NiFi
Setting up the Flow for MiNiFi
Preparing the flow for MiNiFi
Configuring and starting MiNiFi
Enjoying the data flow!
References
For a primer on HDF, you can refer to the tutorials here: Tutorials and User Guides
Installing HDF
If you do not have NiFi installed, please follow the instructions found here
NOTE: The above installation guide is for HDF, this is the version that matches Apache MiNiFi 0.0.1. Although HDF 2.0 may work, it is not recommended for this exercise at this time.
Installing MiNiFi
Now that you have NiFi up and running, it is time to download and install MiNiFi.
Open a browser.
Download the MiNiFi binaries from the Apache MiNiFi Downloads Page. There are two options: a tar.gz, a format tailored to Linux, and a zip file, which is more compatible with Windows. If you are using a Mac, either option is just fine.
Figure 1. MiNiFi download page
For this tutorial I have downloaded the tar.gz on a Mac, as shown above in Figure 1.
To install MiNiFi, extract the files from the compressed file to a location in which you want to run the application. I have chosen to install it to /Users/apsaltis/minifi-to-nifi
The image below shows MiNiFi downloaded and installed in this directory:
Setting up the Flow for NiFi
NOTE: Before starting NiFi we need to enable Site-to-Site communication. To do that, do the following:
Open <$NIFI_INSTALL_DIR>/conf/ in your favorite editor
Change:
nifi.remote.input.socket.port=
To:
<-- This is only being done for this exercise as MiNiFi and NiFi are running on the same host. This is not a recommended way of deploying the two products.
nifi.remote.input.socket.port=10000
<-- This implies we are only using HTTP and are not securing the communication between MiNiFi and NiFi. For this exercise that is OK; however, it is important to consider your security needs when deploying these technologies.
Restart NiFi if it was running
Now that we have NiFi up and running and MiNiFi installed and ready to go, the next thing to do is to create our data flow. To do that, we are going to start by creating the flow in NiFi. Remember, if you do not have NiFi running, execute the following command:
<$NIFI_INSTALL_DIR>/bin/ start
Now we should be ready to create our flow. To do this, do the following:
Open a browser and go to: http://<host>:<port>/nifi. On my machine, going to that URL in the browser looks like this:
Figure 2. Empty NiFi Canvas
The first thing we are going to do is set up an Input Port. This is the port that MiNiFi will be sending data to. To do this, drag the Input Port icon to the canvas and call it "From MiNiFi", as shown below in figure 3.
Figure 3. Adding the Input Port
Now that the Input Port is configured, we need to have somewhere for the data to go once we receive it. In this case we will keep it very simple and just log the attributes. To do this, drag the Processor icon to the canvas and choose the LogAttribute processor, as shown below in figure 4.
Figure 4. Adding the LogAttribute processor
Now that we have the input port and the processor to handle our data, we need to connect them. After creating the connection, your data flow should look like figure 5 below.
Figure 5. NiFi Flow
We are now ready to build the MiNiFi side of the flow. To do this, do the following:
Add a GenerateFlowFile processor to the canvas (don't forget to configure the properties on it)
Add a Remote Process Group to the canvas as shown below in Figure 6
Figure 6. Adding the Remote Process Group
For the URL, copy and paste the URL for the NiFi UI from your browser
Connect the GenerateFlowFile to the Remote Process Group as shown below in figure 7. (You may have to refresh the Remote Process Group before the input port will be available.)
Figure 7. Adding GenerateFlowFile Connection to Remote Process Group
Your canvas should now look similar to what is shown below in figure 8.
Figure 8. Adding GenerateFlowFile Connection to Remote Process Group
There is one last step we need to take before we can export the template. We need to make sure that we set the back pressure between the GenerateFlowFile processor and the Remote Process Group (RPG). That way, if you stop NiFi and not MiNiFi, you will not fill up the hard drive where MiNiFi is running. To set the back pressure, do the following:
Right-click on the "From MiNiFi" connection and choose "Configure"
Choose the "Settings" tab
Set the "Back pressure object threshold" and "Back pressure data size threshold" to 10000 and 1 GB respectively.
The next step is to generate the flow we need for MiNiFi. To do this, do the following steps:
Create a template for MiNiFi, as illustrated below in figure 9.
Figure 9. Creating a template
Select the GenerateFlowFile and the NiFi Flow Remote Process Group (these are the only things needed for MiNiFi)
Select the "Create Template" button from the toolbar
Choose a name for your template
We now need to save our template, as illustrated below in figure 10.
Figure 10. Template button
Now we need to download the template, as shown below in figure 11.
Figure 11. Saving a template
We are now ready to set up MiNiFi. However, before doing that we need to convert the template to the YAML format that MiNiFi uses. To do this, we need to do the following:
Navigate to the minifi-toolkit directory (minifi-toolkit-0.0.1)
Transform the template that we downloaded using the following command:
bin/ transform <INPUT_TEMPLATE> <OUTPUT_FILE>
For example:
bin/ transform MiNiFi_Flow.xml config.yml
Next, copy the config.yml to the minifi-0.0.1/conf directory. That is the file that MiNiFi uses to generate the file and the flow.xml.gz for MiNiFi.
That is it; we are now ready to start MiNiFi. To start MiNiFi from a command prompt, execute the following:
cd <MINIFI_INSTALL_DIR>
bin/ start
You should now be able to go to your NiFi flow and see data coming in from MiNiFi.
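If no data shows up, one quick thing to rule out is whether the Site-to-Site port is reachable from the machine running MiNiFi. The sketch below is a plain TCP connectivity check and nothing NiFi-specific; it assumes the port 10000 configured earlier and that both products run on the same host, as they do in this exercise. The function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # 10000 is the nifi.remote.input.socket.port value set earlier; "localhost"
    # assumes MiNiFi and NiFi share a host, as in this exercise.
    print(port_is_open("localhost", 10000))
```

A result of False only tells you that nothing accepted the connection; it does not say why. Typical things to look at next would be whether NiFi was restarted after the properties change and whether the input port on the canvas is started.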
08:03 PM
That certainly works. It has the same effect as adding an available port to Storm so that the new topology can be run. Not sure which is cleaner -- have a user kill another topology before deploying the squid one, update the canned storm config to have a port available, or have a user update the config and restart Storm and related services.
11:05 PM
Install the head plugin
usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head/1.x
I think this should be:
1. sudo /usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head/1.x