08-31-2016
12:26 PM
NiFi 1.x was just officially released yesterday. HDF 2.x has not been released yet (look for it soon). The article by @Jobin George is still valid for the NiFi 0.x (HDF 1.x) versions. A new article should be written for the new versions.
08-31-2016
12:23 PM
3 Kudos
@David DN NiFi 1.x (HDF 2.x) versions have gone through a major framework upgrade/change. A multi-tenancy approach has been added that allows access to be controlled down to the component level. As part of this change, the way the initial admin user is added has changed. In previous NiFi 0.x (HDF 1.x) versions, this was done simply by adding the DN of your first admin user to the authorized-users.xml file. In NiFi 1.x (HDF 2.x) versions you need to set that user DN in the following property in the authorizers.xml file:

<property name="Initial Admin Identity"></property>

For those who previously worked with NiFi 0.x (HDF 1.x) versions, you can use an old authorized-users.xml file to seed the new NiFi version's user authorization by setting this property in the same file:

<property name="Legacy Authorized Users File"></property>

NiFi 1.x (HDF 2.x) versions no longer provide new users the ability to "request access". An Admin will need to manually add each user and assign them component-level access through the UI.
Adding new users is done through the Users UI, found in the hamburger menu in the upper right corner of the UI. (Remember, this can only be done once the initial admin has been given access as described above.) From the Users UI, select the add user icon in the upper right corner; a dialog will appear where you can add your new user. Supply your Kerberos, LDAP, or certificate DN and click "OK". Now that you have added a user, you need to grant them component-level access back on the main NiFi UI. Select the component you wish to control access to (for example, the root canvas). A new "Access Policies" UI will appear where you need to select the access policy you want to add the user to from the pull-down menu. Once you select a policy, click on the add user icon in the upper right to grant access to one of the users added earlier. Thanks, Matt
08-31-2016
11:42 AM
1 Kudo
@boyer NiFi 0.x versions use a whole-dataflow revision number when applying changes anywhere on the canvas. In order to invoke a change anywhere on the canvas (it does not matter if you are working on different components or within different process groups), the user making the change will need the latest revision number. A user may open a component for editing, at which time the current revision number is grabbed. At the same time another user in another browser may do the same. Whichever user makes their change and hits apply first will trigger the revision number to increment. When the second user hits apply, they get the error you described because their change request does not have the current revision. But there is good news: how this works has changed in NiFi 1.x (HDF 2.x) versions. Revisions are no longer tied to the entire dataflow. While two users will still be unable to make changes to the exact same component at the same time, they will be able to edit different components at the same time without running into the above issue (see the sketch below). Thanks, Matt
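To illustrate the revision mechanism, here is a rough sketch of what a component update looks like against the NiFi 1.x REST API (the processor ID, client ID, and version number are made-up values for illustration):

PUT /nifi-api/processors/<processor-uuid>
{
  "revision": { "clientId": "editor-1", "version": 7 },
  "component": { "id": "<processor-uuid>", "name": "NewName" }
}

If another client has already committed version 7 of that component, the server rejects the request as stale and the client must re-fetch the component to pick up the current revision before retrying.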
08-30-2016
08:54 PM
@Saikrishna Tarapareddy
Just want to make sure I understand completely.
You can establish a connection from your local machine out to your remote NiFi; however, you cannot have your remote NiFi connect to your local machine. Correct?
In this case you would install a NiFi instance on your local machine, and the Remote Process Group (RPG) would be added to the canvas on that local NiFi instance. The NiFi instance running the RPG acts as the client in the connection between NiFi instances. On your remote NiFi instance, the dataflow that is fetching files from your HDFS would need to route those files to an output port located at the root canvas level. (Output and input ports allow FlowFiles to transfer between levels in the dataflow, so at the root level they allow you to interface with another NiFi.)
For this transfer to work, your local instance of NiFi will need to be able to communicate with the http(s) port of your remote NiFi instance (the NCM http(s) port if the remote is a NiFi cluster). Your local instance will also need to be able to communicate with the configured Site-To-Site (S2S) port on your remote instance (with the S2S port on every node if the remote is a NiFi cluster). The S2S port is configured in the nifi.properties file:

# Site to Site properties
nifi.remote.input.socket.host=<remote instance FQDN>
nifi.remote.input.socket.port=<S2S port number>

As you can see, in this setup the local NiFi establishes the connection to the remote NiFi and pulls the data from the remote output port "outLocal" (the two dataflows are sketched below). Thanks, Matt
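A rough text sketch of the two dataflows described above (the HDFS-side processors are assumptions based on the description, not taken from the original screenshots):

Remote NiFi (root canvas):
  ListHDFS -> FetchHDFS -> Output Port "outLocal"

Local NiFi (root canvas):
  Remote Process Group (pointed at the remote NiFi URL) -> downstream processing

The RPG on the local instance pulls FlowFiles from the remote instance's "outLocal" output port.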
08-29-2016
09:01 PM
@Saikrishna Tarapareddy Your regex above says the CSV file content must start with Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood
So it should not route to "Header" unless the CSV starts with that; what is found later in the CSV file should not matter. I tried this and it seems to work as expected. If I removed the '^', then all files matched. Your processor is also loading 1 MB worth of the CSV content for evaluation; however, the string you are searching for is far fewer bytes. If you only want to match against the first line, reduce the size of the buffer from '1 MB' to maybe '60 B'. When I changed the buffer to '60 B' and removed the '^' from the regex above, only the files with the matching header were routed to "header".
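For reference, a sketch of the RouteOnContent settings described above (the dynamic route property "header" and the regex come from your flow; the Match Requirement value is an assumption based on the matching behavior you described):

RouteOnContent
  Match Requirement: content must contain match
  Content Buffer Size: 60 B
  header = ^Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood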
Thanks, Matt
08-29-2016
06:47 PM
2 Kudos
@Saikrishna Tarapareddy The MergeContent processor is not designed to look at the content of the NiFi FlowFiles it is merging. What you will want to do first is use a RouteOnContent processor to route only those FlowFiles whose content contains the headers you want to merge. The 'unmatched' FlowFiles could then be routed elsewhere or auto-terminated.
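A rough sketch of that flow (the dynamic "header" route name is an assumption carried over from your earlier posts):

... -> RouteOnContent --header-----> MergeContent -> ...
                      --unmatched--> (route elsewhere or auto-terminate)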
Thanks, Matt
08-26-2016
12:00 PM
3 Kudos
@kishore sanchina NiFi only supports user-controlled access when it is configured to run securely over HTTPS. The HTTPS configuration of NiFi requires that a keystore and truststore be created/provided. If you don't have a corporately provided PKI infrastructure that can provide you with TLS certificates for this purpose, you can create your own. The following HCC article will walk you through manually creating your own: https://community.hortonworks.com/articles/17293/how-to-create-user-generated-keys-for-securing-nif.html Once your NiFi is set up securely, you will need to enable user access to the UI. There are two parts to successful access:
1. User authentication <-- This can be accomplished via TLS certificates, LDAP, or Kerberos. Setting up NiFi to use one of these login identity providers is covered here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user-authentication
2. User authorization <-- This is accomplished through NiFi via the authorized-users.xml file. This process is documented here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#controlling-levels-of-access
You will need to manually populate the authorized-users.xml file with your first "Admin" role user. That Admin user will be able to approve access for other users who have passed the authentication phase and submitted a UI-based authorization request (example HTTPS properties below). Thanks, Matt
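For reference, the HTTPS side of nifi.properties looks roughly like this once you have your keystore and truststore (the host, port, paths, and passwords here are placeholders, not required values):

nifi.web.https.host=<nifi host FQDN>
nifi.web.https.port=<https port>
nifi.security.keystore=./conf/keystore.jks
nifi.security.keystoreType=JKS
nifi.security.keystorePasswd=<keystore password>
nifi.security.keyPasswd=<key password>
nifi.security.truststore=./conf/truststore.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=<truststore password>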
08-25-2016
08:41 PM
1 Kudo
@INDRANIL ROY
NiFi does not distribute processing of a single file across multiple nodes in a NiFi cluster. Each node works on its own set of files. The nodes themselves are not even aware other nodes exist; they work on the files they have and report their health and status back to the NiFi Cluster Manager (NCM).
1. What format is this file in?
2. What kind of processing are you trying to do against this file's content?
3. Can the file be split into numerous smaller files (depending on the file content, NiFi may be able to do the splitting)?
As an example: a common dataflow involves processing very large log files. The large log file is processed by the SplitText processor to produce many smaller files. These smaller files are then distributed across a cluster of NiFi nodes where the remainder of the processing is performed. There are a variety of pre-existing "split" type processors (an example SplitText configuration follows below). Thanks, Matt
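For reference, a sketch of SplitText settings for the log-file example (the line count is an illustrative value, not a recommendation):

SplitText
  Line Split Count: 10000       <-- each output FlowFile contains up to 10,000 lines
  Header Line Count: 0
  Remove Trailing Newlines: true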
08-25-2016
02:55 PM
4 Kudos
@kishore sanchina The simplest answer to your question is to use the ListFile processor to produce a list of the files from your local filesystem, feed that to a FetchFile processor that will pick up the content, and then pass them to a PutHDFS processor to send them to your HDFS (see the sketch below). The ListFile processor will maintain state based on the last modified time of the files to ensure the files are not listed more than once. If you right-click on any of these NiFi processors, you can select "usage" from the displayed context menu to get more details on the configuration of each. Thanks, Matt
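A rough sketch of that flow with the key properties (the directory paths are placeholders; the 'File to Fetch' value shown is the processor's default):

ListFile
  Input Directory: /path/to/local/dir
   |
FetchFile
  File to Fetch: ${absolute.path}/${filename}
   |
PutHDFS
  Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
  Directory: /path/in/hdfs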
08-25-2016
02:00 PM
@INDRANIL ROY Given the massive size of your file, ListSFTP/FetchSFTP may not be the best approach. Let me ask a few questions:
1. Are you picking up numerous files of this multi-TB size, or are we talking about a single file?
2. Are you trying to send the same TB file to every node in your cluster, or is each node going to receive a completely different file?
3. Is the directory where these files are originally consumed from a local disk or a network-mounted disk?