Created on 09-13-2016 09:45 PM
In this tutorial, we will learn how to configure MiNiFi to send data to NiFi.
For a primer on HDF, refer to the Tutorials and User Guides.
NOTE: The above installation guide is for HDF 1.2.0.1, which is the version that matches Apache MiNiFi 0.0.1. Although HDF 2.0 may work, it is not recommended for this exercise at this time.
Now that you have NiFi up and running, it is time to download and install MiNiFi.
Figure 1. MiNiFi download page
For this tutorial I have downloaded the tar.gz on a Mac as shown above in
/Users/apsaltis/minifi-to-nifi
The image below shows MiNiFi downloaded and installed in this directory:
NOTE: Before starting NiFi we need to enable Site-to-Site communication. To do that do the following:
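One way to apply the Site-to-Site settings from the command line is sketched below. It assumes the three property names discussed in the comments on this article (nifi.remote.input.socket.host, nifi.remote.input.socket.port, nifi.remote.input.secure), port 10000 as used in the example, and that conf/nifi.properties is relative to the NiFi install directory. The set_prop helper and the NIFI_PROPS variable are hypothetical names introduced for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a Java properties file.
# It only rewrites a line that already starts with "KEY="; it does not append missing keys.
set_prop() {
  file=$1
  key=$2
  value=$3
  tmp="$file.tmp"
  sed "s|^$key=.*|$key=$value|" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Assumption: run from the NiFi install directory, or point NIFI_PROPS elsewhere.
NIFI_PROPS="${NIFI_PROPS:-conf/nifi.properties}"

if [ -f "$NIFI_PROPS" ]; then
  # Keep a backup before editing.
  cp "$NIFI_PROPS" "$NIFI_PROPS.bak"
  # Use a hostname MiNiFi can resolve; "localhost" only works when both run on one machine.
  set_prop "$NIFI_PROPS" nifi.remote.input.socket.host "$(hostname)"
  set_prop "$NIFI_PROPS" nifi.remote.input.socket.port 10000
  # false = plain HTTP; true requires both NiFi and MiNiFi to be set up for HTTPS.
  set_prop "$NIFI_PROPS" nifi.remote.input.secure false
  # Show the resulting Site-to-Site settings.
  grep '^nifi.remote.input' "$NIFI_PROPS"
else
  echo "No properties file at $NIFI_PROPS; nothing changed." >&2
fi
```

NiFi reads nifi.properties at startup, so these changes take effect the next time NiFi is started.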
To
Now that we have NiFi up and running and MiNiFi installed and ready to go, the next thing to do is to create our data flow. We will start by creating the flow in NiFi. Remember, if you do not have NiFi running, execute the following command:
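The command itself is not shown in this copy of the article. In a standard Apache NiFi install the start script is bin/nifi.sh, so a minimal sketch could look like the following. The start_nifi helper and the NIFI_HOME variable are hypothetical names, and the default of the current directory is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: start NiFi from the install directory given as $1.
start_nifi() {
  nifi_home=$1
  if [ ! -x "$nifi_home/bin/nifi.sh" ]; then
    echo "bin/nifi.sh not found under $nifi_home" >&2
    return 1
  fi
  "$nifi_home/bin/nifi.sh" start
}

# Assumption: NIFI_HOME points at the NiFi install directory (defaults to the current one).
start_nifi "${NIFI_HOME:-.}" || echo "NiFi was not started." >&2
```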
Now we should be ready to create our flow. To do this do the following:
Figure 2. Empty NiFi Canvas
Figure 3. Adding the Input Port
Figure 4. Adding the LogAttribute processor
Figure 5. NiFi Flow
Figure 6. Adding the Remote Processor Group
Figure 7. Adding GenerateFlowFile Connection to Remote Processor Group
Figure 8. Adding GenerateFlowFile Connection to Remote Processor Group
Figure 10. Template button
Figure 11. Saving a template
bin/config.sh transform <INPUT_TEMPLATE> <OUTPUT_FILE>
For example:
bin/config.sh transform MiNiFi_Flow.xml config.yml
Copy config.yml to the minifi-0.0.1/conf directory. That is the file that MiNiFi uses to generate the nifi.properties file and the flow.xml.gz for MiNiFi. You should now be able to go to your NiFi flow and see data coming in from MiNiFi.
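A quick sanity check before starting MiNiFi can catch a failed transform. In the comments on this article, a config.yml containing "Processors: []" turned out to mean the transform had not actually run (config.sh was used on Windows, where config.bat was needed). The sketch below assumes the file has a top-level "Processors" entry as in that example; config_has_processors and MINIFI_HOME are hypothetical names, and bin/minifi.sh is the start script shipped with MiNiFi.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the config file exists and its
# top-level "Processors" entry is present and not the empty list.
config_has_processors() {
  cfg=$1
  [ -f "$cfg" ] || return 1
  grep -q '^Processors:' "$cfg" || return 1
  if grep -q '^Processors: *\[\] *$' "$cfg"; then
    return 1
  fi
  return 0
}

# Assumption: MiNiFi was unpacked into minifi-0.0.1 under the current directory.
MINIFI_HOME="${MINIFI_HOME:-minifi-0.0.1}"

if config_has_processors "$MINIFI_HOME/conf/config.yml"; then
  "$MINIFI_HOME/bin/minifi.sh" start
else
  echo "conf/config.yml is missing or has no processors; re-run the transform." >&2
fi
```

The check is deliberately narrow: it only looks for an empty processor list, not for a valid flow.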
Created on 11-02-2016 06:26 PM
I might suggest we make a few changes to this article:
1. The link you have for installing HDF talks about installing HDF 2.0. HDF 2.0 is based off Apache NiFi 1.0. Since MiNiFi is built from Apache NiFi 0.6.1, the dataflows built and templated for conversion into MiNiFi YAML files must also be built using an Apache 0.6 based NiFi install. (I see in your example above you did just that but this needs to be made clear)
2. I would never recommend setting nifi.remote.input.socket.host= to "localhost". When a NiFi or MiNiFi connects to another NiFi via S2S, the destination NiFi will return the value set for this property along with the value set for nifi.remote.input.socket.port=. In your example that means the source MiNiFi would then try to send FlowFiles to localhost:10000. This is ONLY going to work if the destination NiFi is located on the same server as MiNiFi.
3. You should also explain why you are changing nifi.remote.input.secure= from true to false. Changing this is not a requirement of MiNiFi; it is simply a matter of preference. (If set to true, both MiNiFi (source) and NiFi (destination) must be set up to run securely over https.) In your example you are working with http only.
4. While doable, one should never route the "success" relationship from any processor back on to itself. If you have reached the end of your dataflow, you should auto-terminate the "success" relationship.
5. I am not clear what you are telling me to do based on this line under step 5:
6. When using the GenerateFlowFile processor in an example flow, it is important to recommend that users set a run schedule other than "0 sec". Since MiNiFi is Apache NiFi 0.6.1 based, there is no default back pressure on connections, and with a run schedule of "0 sec" it is very likely this processor will produce FlowFiles much faster than they can be sent across S2S. This will eventually fill the hard drive of the system running MiNiFi. An even better recommendation would be to make sure they set back pressure between the GenerateFlowFile processor and the Remote Process Group (RPG). That way, even if someone stops NiFi and not MiNiFi, they don't fill their MiNiFi hard drive.
Thanks,
Matt
Created on 11-03-2016 12:18 AM
Thanks for the feedback, @mclark. All of your suggestions should now be reflected in the content. Thanks again for the input.
Created on 05-25-2017 09:32 AM
Created on 05-25-2017 10:53 AM
@Roger Young: You are correct. Once the template is downloaded, you should be able to delete it and the related part of the flow. I am not sure offhand why you are not seeing data flow from MiNiFi. I am actually using this exercise in a class I am teaching today and will be sure to test this and see what happens.
What version of MiNiFi and NiFi are you using?
Created on 05-25-2017 11:13 AM
hi, i am using nifi-1.2.0 and minifi-0.2.0.
Created on 05-25-2017 11:16 AM
Below is the config.yml file in MiNiFi. There is no mention of processors there, so maybe I've done something wrong.
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
MiNiFi Config Version: 3
Flow Controller:
  name: MiNiFi Flow
  comment: ''
Core Properties:
  flow controller graceful shutdown period: 10 sec
  flow service write delay interval: 500 ms
  administrative yield duration: 30 sec
  bored yield duration: 10 millis
  max concurrent threads: 1
FlowFile Repository:
  partitions: 256
  checkpoint interval: 2 mins
  always sync: false
  Swap:
    threshold: 20000
    in period: 5 sec
    in threads: 1
    out period: 5 sec
    out threads: 4
Content Repository:
  content claim max appendable size: 10 MB
  content claim max flow files: 100
  always sync: false
Provenance Repository:
  provenance rollover time: 1 min
Component Status Repository:
  buffer size: 1440
  snapshot frequency: 1 min
Security Properties:
  keystore: ''
  keystore type: ''
  keystore password: ''
  key password: ''
  truststore: ''
  truststore type: ''
  truststore password: ''
  ssl protocol: ''
  Sensitive Props:
    key: ''
    algorithm: PBEWITHMD5AND256BITAES-CBC-OPENSSL
    provider: BC
Processors: []
Process Groups: []
Funnels: []
Connections: []
Remote Process Groups: []
NiFi Properties Overrides: {}
Created on 05-25-2017 01:09 PM
Hi @apsaltis
I figured out where I went wrong. The config.sh transform command wasn't working because I was on a Windows machine. I used config.bat and it's working fine now.
Created on 06-06-2017 06:48 PM
Thanks for the useful tutorial Marc!
Created on 12-11-2017 08:06 PM
I am not able to move beyond step 6. After creating the remote process group, I get the error: 'http://127.0.0.1:8080/nifi' does not have any input ports.