About MattWho

MattWho · ‎08-31-2016

@Sami Ahmad For starters, I have to agree that the "generate_logs.py" script is not being used anywhere in that NiFi template. The NiFi flow itself has been built to generate some fake log data.

Invalid components: Components like NiFi processors, input ports, output ports, controller services, and reporting tasks all have minimum defined requirements that must be met before they are in a "valid" state. Only components that are in a valid state can be started. Floating your cursor over the invalid icon on a component will show why it is not valid. The Data Enrichment process group in this template has two output ports that have no defined connections, making them invalid. Despite the warning you were presented with, all valid components should have been started.

You can fix the issue by creating the two missing output connections. I added a new processor (UpdateAttribute with success checked for auto-terminate) and dragged a connection from the Data Enrichment process group to it twice, once for each invalid output port it contained (Warn logs and Info logs). Now the process group no longer reflects any invalid components in it.

I am unable to see the screenshot you attached. I did start the "Log Generate" process group without making any changes to it and do see data being produced. I see data being queued in several places in the dataflow. If you are not seeing any data queued, check your NiFi's nifi-app.log for errors. Also check the various GenerateFlowFile processors to see if they are producing any bulletins (a bulletin icon is displayed on the processor when it is). Floating over the bulletin will display a log message that may indicate the issue.

Thanks, Matt

MattWho · ‎08-31-2016

@Sami Ahmad Instead of left clicking on the link in step three, right click and select the "Save Link As..." option to save the xml template so it can be imported into your NiFi. The dataflow template will show you all the components needed for this workflow. I believe the intent of this tutorial was not to teach users how to use the NiFi UI, but rather how to use a combination of specific NiFi components to accomplish a particular workflow. Using the NiFi UI dataflow tools, you can recreate the workflow as a UI dataflow building exercise. Thanks, Matt

MattWho · ‎08-31-2016

@INDRANIL ROY That is the exact approach I suggested in response to the above thread we had going. Each Node will only work on the FlowFiles it has in its possession. By splitting this large TB file into many smaller files, you can distribute the processing load across your downstream cluster. The distribution of FlowFiles via the RPG works as follows. The RPG communicates with the NCM of your NiFi cluster. The NCM returns to the source RPG a list of the available Nodes in its cluster and their S2S ports, along with the current load on each. It is then the responsibility of the RPG to do smart load-balancing of the data in its incoming queue to these Nodes. Nodes with higher load will get fewer FlowFiles. The load balancing is done in batches for efficiency, so under light load you may not see an exactly balanced delivery, but under higher FlowFile volumes you will see a balanced delivery over the 5-minute delivery statistics. Thanks, Matt
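To make the "nodes with higher load get fewer FlowFiles" idea concrete, here is a minimal Python sketch. It is not NiFi's actual algorithm: the inverse-load weighting, the function name `distribute`, and the node names are all assumptions made for illustration only.

```python
def distribute(batch_count, node_loads):
    """Split batch_count batches across nodes so that busier nodes get fewer.

    node_loads maps a node name to a non-negative load number (for example,
    queued FlowFiles). This weighting scheme is an illustrative assumption,
    not the rule NiFi itself applies.
    """
    # Weight each node inversely to its load; +1 avoids division by zero.
    weights = {node: 1.0 / (load + 1) for node, load in node_loads.items()}
    total = sum(weights.values())

    # Give each node the whole-number part of its proportional share.
    shares = {node: int(batch_count * w / total) for node, w in weights.items()}

    # Hand any batches lost to rounding to the least-loaded nodes first.
    leftover = batch_count - sum(shares.values())
    for node in sorted(node_loads, key=node_loads.get)[:leftover]:
        shares[node] += 1
    return shares
```

With only a handful of batches the split is coarse, which matches the point above that delivery may not look balanced under light load.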

MattWho · ‎08-31-2016

NiFi 1.x was just officially released yesterday. HDF 2.x has not been released yet (look for it soon). @Jobin George's article is still valid for the NiFi 0.x (HDF 1.x) versions. A new article should be written for the new versions.

MattWho · ‎08-31-2016

@David DN NiFi 1.x (HDF 2.x) versions have gone through a major framework upgrade/change. A multi-tenancy approach has been added that allows access to be controlled down to the component level. As part of this change, the way the initial admin user is added has changed. In previous NiFi 0.x (HDF 1.x) versions, this was simply done by adding the DN of your first admin user to the authorized-users.xml file. In NiFi 1.x (HDF 2.x) versions you need to set that user DN in the following property in the authorizers.xml file:

<property name="Initial Admin Identity"></property>

For those who previously worked with NiFi 0.x (HDF 1.x) versions, you can use an old authorized-users.xml file to seed the new NiFi version's user authorization by setting this property in the same file:

<property name="Legacy Authorized Users File"></property>

NiFi 1.x (HDF 2.x) versions no longer provide new users the ability to "request access". An admin will need to manually add each user and assign them component level access through the UI. Adding new users is done through the Users UI found in the hamburger menu in the upper right corner of the UI. (Remember this can only be done once the initial admin has been given access as described above.) From the Users UI, select the add user icon in the upper right corner. A dialog will appear for adding your new users. Supply your Kerberos, LDAP, or certificate DN and click "OK".

Now that you have added a user, you need to grant them component level access back on the main NiFi UI. Select the component you wish to control access to. In this example we will select the root canvas. A new "Access Policies" UI will appear where you need to select the access policy you want to add the user to from the pull down menu. Once you select the policy, click on the add user icon in the upper right to grant access to one of the users added earlier.

Thanks, Matt
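If you prefer to script the authorizers.xml change rather than edit it by hand, a small Python sketch along these lines might help. The `SAMPLE` document is a stripped-down assumption of what the file looks like (a real authorizers.xml contains more elements), the DN is a made-up example, and `set_property` is a hypothetical helper, not part of NiFi.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for authorizers.xml; the real file has more content.
SAMPLE = """<authorizers>
    <authorizer>
        <identifier>file-provider</identifier>
        <property name="Initial Admin Identity"></property>
        <property name="Legacy Authorized Users File"></property>
    </authorizer>
</authorizers>"""


def set_property(xml_text, name, value):
    """Return xml_text with every <property name="..."> matching name set to value."""
    root = ET.fromstring(xml_text)
    matches = [p for p in root.iter("property") if p.get("name") == name]
    if not matches:
        raise KeyError(f"no property named {name!r}")
    for prop in matches:
        prop.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical DN, used for illustration only.
updated = set_property(SAMPLE, "Initial Admin Identity", "CN=admin, OU=NIFI")
```

Note that ElementTree's default parser drops XML comments, so on a real file you may prefer to make the edit by hand or keep a backup first.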

MattWho · ‎08-31-2016

@boyer NiFi 0.x versions use a whole dataflow revision number when applying changes anywhere on the canvas. In order to invoke a change anywhere on the canvas (it does not matter if you are working on different components or within different process groups), the user making the change will need the latest revision number. A user may open a component for editing, at which time the current revision number is grabbed. At the same time another user in another browser may do the same. Whichever user makes their change and hits apply first will trigger the revision number to increment. When the second user tries to hit apply, they get the error you described because their change request does not have the current revision. But there is good news: how this works has changed in NiFi 1.x (HDF 2.x) versions. Revisions are no longer tied to the entire dataflow. While two users will still be unable to make changes to the exact same component at the same time, they will be able to edit different components at the same time without running into the above issue. Thanks, Matt
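The difference between the two revision schemes can be sketched in a few lines of Python. This is only a toy model of the behavior described above, not NiFi's implementation; the class and method names are invented for the example.

```python
class StaleRevisionError(Exception):
    """Raised when a change is submitted with an out-of-date revision."""


class WholeFlowRevision:
    """Toy model of the 0.x behavior: one revision guards the whole canvas."""

    def __init__(self):
        self.revision = 0
        self.components = {}

    def apply(self, client_revision, component_id, config):
        # Any earlier change, to any component, makes this revision stale.
        if client_revision != self.revision:
            raise StaleRevisionError(
                f"client has {client_revision}, flow is at {self.revision}")
        self.components[component_id] = config
        self.revision += 1
        return self.revision


class PerComponentRevision:
    """Toy model of the 1.x behavior: each component has its own revision."""

    def __init__(self):
        self.revisions = {}
        self.components = {}

    def revision_of(self, component_id):
        return self.revisions.get(component_id, 0)

    def apply(self, client_revision, component_id, config):
        # Only a competing change to the same component causes a conflict.
        if client_revision != self.revision_of(component_id):
            raise StaleRevisionError(f"stale revision for {component_id}")
        self.components[component_id] = config
        self.revisions[component_id] = client_revision + 1
        return self.revisions[component_id]
```

In the first model, two users who both read revision 0 cannot both apply, even when they edit different processors. In the second, they conflict only if they edit the same component.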

MattWho · ‎08-30-2016

@Saikrishna Tarapareddy Just want to make sure I understand completely. You can establish a connection from your local machine out to your remote NiFi; however, you cannot have your remote NiFi connect to your local machine. Correct?

In this case you would install a NiFi instance on your local machine, and the Remote Process Group (RPG) would be added to the canvas on that local NiFi instance. The NiFi instance running the RPG is acting as the client in the connection between NiFi instances. On your remote NiFi instance, your dataflow that is fetching files from your HDFS would need to route those files to an output port located at the root canvas level. (Output and input ports allow FlowFiles to move between a process group and the level above it in the dataflow. At the root level they allow you to interface with another NiFi.)

For this transfer to work, your local instance of NiFi will need to be able to communicate with the http(s) port of your remote NiFi instance (the NCM http(s) port if the remote is a NiFi cluster). Your local instance will also need to be able to communicate with the configured Site-To-Site (S2S) port on your remote instance (with the S2S port on every Node if the remote is a NiFi cluster).

nifi.properties file:

# Site to Site properties
nifi.remote.input.socket.host=<remote instance FQDN>
nifi.remote.input.socket.port=<S2S port number>

The dataflow on your remote NiFi would look something like this: The dataflow on your local NiFi would look something like this: As you can see in this setup the local NiFi is establishing the connection to the remote NiFi and pulling the data from the output port "outLocal".

Thanks, Matt
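Before building the flow, it can be worth confirming from the local machine that the remote S2S port is actually reachable. The Python sketch below reads the two Site-to-Site properties shown above from the text of a nifi.properties file and offers a plain TCP connect check. The helper names are hypothetical, the host and port in any call are yours to supply, and a successful TCP connect only shows the port is open, not that Site-to-Site is configured correctly.

```python
import socket


def read_s2s_settings(properties_text):
    """Extract the Site-to-Site host and port from nifi.properties text."""
    props = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    host = props.get("nifi.remote.input.socket.host") or None
    port_text = props.get("nifi.remote.input.socket.port", "")
    port = int(port_text) if port_text else None
    return host, port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If the remote is a cluster, you would run `can_connect` against the S2S port of every Node, since the local RPG needs to reach each of them.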

MattWho · ‎08-29-2016

@Saikrishna Tarapareddy Your regex above says the CSV file content must start with Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood. So it should not route to "Header" unless the CSV starts with that. What is found later in the CSV file should not matter. I tried this and it seems to work as expected. If I removed the '^', then all files matched. Your processor is also loading 1 MB worth of the CSV content for evaluation; however, the string you are searching for is far fewer bytes. If you only want to match against the first line, reduce the size of the buffer from '1 MB' to maybe '60 b'. If I changed the buffer to '60 b' and removed the '^' from the regex above, only the files with the matching header were routed to "header". Thanks, Matt
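The interaction between the '^' anchor and the buffer size can be reproduced outside NiFi. The Python sketch below only mimics the idea: NiFi uses Java regular expressions, and the `route` function here is a hypothetical stand-in for the processor, not its implementation.

```python
import re

HEADER = "Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood"


def route(content, pattern, buffer_size):
    """Search only the first buffer_size bytes of content for pattern."""
    head = content[:buffer_size].decode("utf-8", errors="replace")
    return "header" if re.search(pattern, head) else "unmatched"


# One file starts with the header; the other has it further down.
starts_with_header = (HEADER + "\nT1,2016-08-29,1.0,Good,,100\n").encode()
header_later = (b"a,b,c\n" * 20) + HEADER.encode() + b"\n"

ONE_MB = 1024 * 1024
anchored = "^" + HEADER
```

The header is 57 bytes long, so a 60 byte buffer is enough to hold it while excluding almost everything after the first line. With the anchor, `route(header_later, anchored, ONE_MB)` is unmatched; without the anchor and with the 1 MB buffer, both files match; without the anchor but with a 60 byte buffer, only the file that starts with the header matches.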

MattWho · ‎08-29-2016

@Saikrishna Tarapareddy The MergeContent processor is not designed to look at the content of the NiFi FlowFiles it is merging. What you will want to do first is use a RouteOnContent processor to route only those FlowFiles whose content contains the headers you want to merge. The 'unmatched' FlowFiles could then be routed elsewhere or auto-terminated. Thanks, Matt

MattWho · ‎08-26-2016

@kishore sanchina NiFi only supports user controlled access when it is configured to run securely over HTTPS. The HTTPS configuration of NiFi requires that a keystore and truststore be created/provided. If you don't have a corporately provided PKI infrastructure that can provide you with TLS certificates for this purpose, you can create your own. The following HCC article will walk you through manually creating your own: https://community.hortonworks.com/articles/17293/how-to-create-user-generated-keys-for-securing-nif.html

Once your NiFi is set up securely, you will need to enable user access to the UI. There are two parts to successful access:

1. User authentication <-- This can be accomplished via TLS certificates, LDAP, or Kerberos. Setting up NiFi to use one of these login identity providers is covered here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user-authentication
2. User authorization <-- This is accomplished through NiFi via the authorized-users.xml file. This process is documented here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#controlling-levels-of-access

You will need to manually populate the authorized-users.xml file with your first "Admin" role user. That Admin user will be able to approve access for other users who have passed the authentication phase and submitted a UI based authorization request.

Thanks, Matt
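As a rough illustration of what "manually populate" means, the Python sketch below builds a minimal authorized-users.xml body containing a single admin user. The element layout (`users`, `user` with a `dn` attribute, `role` with `name="ROLE_ADMIN"`) reflects my understanding of the NiFi 0.x file and should be checked against the administration guide linked above; the DN is a made-up example and `build_authorized_users` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_authorized_users(admin_dn):
    """Return XML text for an authorized-users file with one admin user.

    The structure is an assumption based on the NiFi 0.x format; verify it
    against the documentation for your version before using it.
    """
    users = ET.Element("users")
    user = ET.SubElement(users, "user", {"dn": admin_dn})
    ET.SubElement(user, "role", {"name": "ROLE_ADMIN"})
    return ET.tostring(users, encoding="unicode")


# Hypothetical DN, used for illustration only.
xml_text = build_authorized_users("CN=admin, OU=NIFI")
```

The DN must match exactly what the authentication step presents for that user, including spacing and the order of its parts.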

Last Visited	‎02-03-2026 05:07 PM

Member Since	‎07-30-2019 10:41 AM
Posts	3,434
Kudos received	1628
