Member since 07-30-2019
3379 Posts
1616 Kudos Received
998 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 80 | 10-20-2025 06:29 AM |
| | 220 | 10-10-2025 08:03 AM |
| | 184 | 10-08-2025 10:52 AM |
| | 170 | 10-08-2025 10:36 AM |
| | 241 | 10-03-2025 06:04 AM |

08-31-2016 11:42 AM
1 Kudo

@boyer NiFi 0.x versions use a single revision number for the whole dataflow when applying changes anywhere on the canvas. To make a change anywhere on the canvas (it does not matter whether you are working on different components or in different process groups), the user making the change needs the latest revision number. When a user opens a component for editing, the current revision number is grabbed. Another user in another browser may do the same at the same time. Whichever user makes their change and hits Apply first triggers the revision number to increment. When the second user then hits Apply, they get the error you described because their change request does not carry the current revision.

But there is good news: this has changed in NiFi 1.x (HDF 2.x) versions. Revisions are no longer tied to the entire dataflow. While two users still cannot change the exact same component at the same time, they can edit different components at the same time without running into the above issue.

Thanks,
Matt
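
To make the revision behavior above concrete, here is a small toy model in Python. It is only a sketch of flow-wide optimistic locking as described in this post; the `Canvas` class and its methods are made up for illustration and are not NiFi code.

```python
class StaleRevisionError(Exception):
    """Raised when a change is submitted with an out-of-date revision."""


class Canvas:
    """Toy model: one revision counter for the whole flow (NiFi 0.x style)."""

    def __init__(self):
        self.revision = 0
        self.components = {}

    def open_for_edit(self):
        # A client grabs the current revision when it opens a component.
        return self.revision

    def apply(self, client_revision, component, value):
        # Reject the change unless the client holds the latest revision.
        if client_revision != self.revision:
            raise StaleRevisionError(
                f"client has {client_revision}, flow is at {self.revision}")
        self.components[component] = value
        self.revision += 1


canvas = Canvas()
rev_a = canvas.open_for_edit()           # user A opens component "A"
rev_b = canvas.open_for_edit()           # user B opens component "B"
canvas.apply(rev_a, "A", "new config")   # A applies first; revision increments
try:
    canvas.apply(rev_b, "B", "other config")
    conflict = False
except StaleRevisionError:
    conflict = True  # B is rejected even though it edited a different component
```

Keeping one counter per component instead of one per canvas would let both changes succeed, which is roughly the difference between the 0.x and 1.x behavior described above.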
						
					
08-30-2016 08:54 PM

@Saikrishna Tarapareddy

Just want to make sure I understand completely.

You can establish a connection from your local machine out to your remote NiFi; however, you cannot have your remote NiFi connect to your local machine. Correct?

In this case you would install a NiFi instance on your local machine, and the Remote Process Group (RPG) would be added to the canvas on that local NiFi instance. The NiFi instance running the RPG acts as the client in the connection between NiFi instances. On your remote NiFi instance, the dataflow that fetches files from your HDFS would need to route those files to an output port located at the root canvas level. (Output and input ports move FlowFiles between a process group and the level above it, so at the root level they allow you to interface with another NiFi.)

For this transfer to work, your local instance of NiFi must be able to communicate with the http(s) port of your remote NiFi instance (the NCM http(s) port if the remote is a NiFi cluster). Your local instance must also be able to communicate with the configured Site-To-Site (S2S) port on your remote instance (the S2S port on every Node if the remote is a NiFi cluster). The S2S settings are in the nifi.properties file:

# Site to Site properties
nifi.remote.input.socket.host=<remote instance FQDN>
nifi.remote.input.socket.port=<S2S port number>

The dataflow on your remote NiFi would look something like this: [screenshot not preserved]

The dataflow on your local NiFi would look something like this: [screenshot not preserved]

In this setup the local NiFi establishes the connection to the remote NiFi and pulls the data from the output port "outLocal".

Thanks,
Matt
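
Before building the flow, it can help to confirm that the local machine can actually reach both ports mentioned above. The following is a minimal sketch using Python's standard `socket` module; the host name and port numbers in the comments are placeholders, not values from this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Hypothetical values; substitute your own remote host and ports:
# can_connect("remote-nifi.example.com", 8080)    # http(s) port
# can_connect("remote-nifi.example.com", 10000)   # S2S port
```

A successful TCP connect only shows the port is reachable; it does not prove that S2S itself is configured correctly on the remote side.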
						
					
08-29-2016 09:01 PM

@Saikrishna Tarapareddy Your regex above says the CSV file content must start with Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood

So it should not route to "Header" unless the CSV starts with that. What is found later in the CSV file should not matter. I tried this and it seems to work as expected. When I removed the '^', all files matched.

Your processor is also loading 1 MB worth of the CSV content for evaluation; however, the string you are searching for is far fewer bytes. If you only want to match against the first line, reduce the size of the buffer from '1 MB' to maybe '60 b'. When I changed the buffer to '60 b' and removed the '^' from the regex above, only the files with the matching header were routed to "Header".

Thanks,
Matt
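
The interaction between the '^' anchor and the buffer size can be reproduced outside NiFi. This is a rough sketch with Python's `re` module, assuming the processor evaluates the regex against only the first N bytes of content; the `routes_to_header` helper and the sample rows are made up for illustration.

```python
import re

HEADER = "Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood"


def routes_to_header(content: bytes, pattern: str, buffer_size: int) -> bool:
    """Search only the first buffer_size bytes, like a bounded content buffer."""
    window = content[:buffer_size].decode("utf-8", errors="replace")
    return re.search(pattern, window) is not None


row = "T1,2016-08-29,1.0,Good,OK,100\n"
with_header = (HEADER + "\n" + row).encode()        # header on the first line
header_later = (row * 3 + HEADER + "\n").encode()   # header appears after 3 rows

ONE_MB = 1024 * 1024
anchored = "^" + HEADER

# Anchored regex, 1 MB buffer: only the file that starts with the header matches.
anchored_results = (routes_to_header(with_header, anchored, ONE_MB),
                    routes_to_header(header_later, anchored, ONE_MB))

# Unanchored regex, 1 MB buffer: the header is found anywhere in the buffer.
unanchored_big = (routes_to_header(with_header, HEADER, ONE_MB),
                  routes_to_header(header_later, HEADER, ONE_MB))

# Unanchored regex, 60-byte buffer: the later header falls outside the buffer.
unanchored_small = (routes_to_header(with_header, HEADER, 60),
                    routes_to_header(header_later, HEADER, 60))
```

The header string is 57 bytes long, which is why a 60-byte buffer is just enough to hold it.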
						
					
08-29-2016 06:47 PM
2 Kudos

@Saikrishna Tarapareddy The MergeContent processor is not designed to look at the content of the FlowFiles it is merging. What you will want to do first is use a RouteOnContent processor to route only those FlowFiles whose content contains the headers you want to merge. The 'unmatched' FlowFiles could then be routed elsewhere or auto-terminated.

Thanks,
Matt
						
					
08-26-2016 12:00 PM
3 Kudos

@kishore sanchina NiFi only supports user-controlled access when it is configured to run securely over HTTPS. The HTTPS configuration of NiFi requires that a keystore and truststore be created or provided. If you don't have a corporate PKI infrastructure that can provide you with TLS certificates for this purpose, you can create your own. The following HCC article walks you through creating them manually:

https://community.hortonworks.com/articles/17293/how-to-create-user-generated-keys-for-securing-nif.html

Once your NiFi is set up securely, you will need to enable user access to the UI. There are two parts to successful access:

1. User authentication. This can be accomplished via TLS certificates, LDAP, or Kerberos. Setting up NiFi to use one of these login identity providers is covered here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user-authentication
2. User authorization. This is handled within NiFi through the authorized-users.xml file. The process is documented here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#controlling-levels-of-access

You will need to manually populate the authorized-users.xml file with your first "Admin" role user. That Admin user can then approve access for other users who have passed the authentication phase and submitted a UI-based authorization request.

Thanks,
Matt
						
					
08-25-2016 08:41 PM
1 Kudo

@INDRANIL ROY NiFi does not distribute processing of a single file across multiple Nodes in a NiFi cluster. Each Node works on its own set of files. The Nodes themselves are not even aware that other Nodes exist. They work on the files they have and report their health and status back to the NiFi Cluster Manager (NCM).

1. What format is this file in?
2. What kind of processing are you trying to do against this file's content?
3. Can the file be split into numerous smaller files? (Depending on the file content, NiFi may be able to do the splitting.)

As an example: a common dataflow involves processing very large log files. The large log file is processed by the SplitText processor to produce many smaller files. These smaller files are then distributed across a cluster of NiFi nodes, where the remainder of the processing is performed. There are a variety of pre-existing "split" type processors.

Thanks,
Matt
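
The split-then-distribute pattern from the log file example can be sketched in plain Python. This is only an illustration of the idea: `split_lines` stands in for what SplitText does, and the round-robin `distribute` function is an assumption made for the sketch, not how NiFi actually assigns files to nodes.

```python
from itertools import cycle


def split_lines(text, lines_per_split):
    """Cut text into chunks of at most lines_per_split lines each."""
    lines = text.splitlines(keepends=True)
    return ["".join(lines[i:i + lines_per_split])
            for i in range(0, len(lines), lines_per_split)]


def distribute(splits, nodes):
    """Assign splits to nodes in round-robin order (illustrative only)."""
    assignment = {node: [] for node in nodes}
    for node, split in zip(cycle(nodes), splits):
        assignment[node].append(split)
    return assignment


log = "".join(f"line {i}\n" for i in range(10))
splits = split_lines(log, 3)                      # chunks of 3, 3, 3 and 1 lines
assignment = distribute(splits, ["node1", "node2"])
```

Each node then processes only the splits it received, which matches the point above that nodes never share work on a single file.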
						
					
08-25-2016 02:55 PM
4 Kudos

@kishore sanchina The simplest answer to your question is to use the ListFile processor to produce a list of the files on your local filesystem, feed that list to a FetchFile processor that picks up the content, and then pass the files to a PutHDFS processor to send them to your HDFS. The ListFile processor maintains state based on the lastModified time of the files to ensure that files are not listed more than once.

If you right-click on any of these NiFi processors, you can select "usage" from the displayed context menu to get more details on how to configure each of them.

Thanks,
Matt
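
The idea of keeping state based on the lastModified time can be sketched like this in Python. It is a simplified illustration only; the `list_new_files` function is hypothetical, and the real ListFile processor handles more cases than this.

```python
import os


def list_new_files(directory, last_seen_mtime):
    """Return (new_paths, new_state) for files modified after last_seen_mtime."""
    new_paths = []
    newest = last_seen_mtime
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        mtime = entry.stat().st_mtime
        if mtime > last_seen_mtime:
            new_paths.append(entry.path)
            newest = max(newest, mtime)
    # The caller stores new_state and passes it in on the next listing.
    return sorted(new_paths), newest
```

Calling it repeatedly with the state returned by the previous call means a file that has not changed is listed only once.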
						
					
08-25-2016 02:00 PM

@INDRANIL ROY Given the massive size of your file, ListSFTP/FetchSFTP may not be the best approach. Let me ask a few questions:

1. Are you picking up numerous files of this multi-TB size, or are we talking about a single file?
2. Are you trying to send the same TB file to every Node in your cluster, or is each Node going to receive a completely different file?
3. Is the directory where these files are originally consumed from a local disk or a network-mounted disk?
						
					
08-24-2016 03:43 PM
1 Kudo

Just to clarify how S2S works when communicating with a target NiFi cluster: the NCM never receives any data, so it cannot act as the load balancer. When the source NiFi communicates with the NCM, the NCM returns to the source NiFi a list of all currently connected nodes and their S2S ports, along with the current load on each node. It is then the job of the source NiFi RPG to use that information to do a smart load-balanced delivery of data to those nodes.
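
As a rough illustration of client-side balancing with the information the NCM returns, here is a small Python sketch. The weighting formula, the `plan_delivery` function, and the peer names are all assumptions made for the example; this is not the algorithm the RPG actually uses.

```python
def plan_delivery(peers, batch_count):
    """Split batch_count batches across peers, favoring lightly loaded nodes.

    peers maps "host:s2s_port" to that node's reported load (e.g. queued count).
    """
    # Weight each peer by the inverse of (load + 1) so idle nodes get more.
    weights = {peer: 1.0 / (load + 1) for peer, load in peers.items()}
    total = sum(weights.values())
    plan = {peer: int(batch_count * w / total) for peer, w in weights.items()}
    # Hand any rounding remainder to the least loaded peer.
    least_loaded = min(peers, key=peers.get)
    plan[least_loaded] += batch_count - sum(plan.values())
    return plan


# Hypothetical peer list, as the source NiFi might receive it from the NCM.
peers = {"node1:10000": 0, "node2:10000": 1, "node3:10000": 3}
plan = plan_delivery(peers, 70)
```

The point is that the decision is made on the sending side, using the node list and load figures, while the NCM itself never touches the data.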
						
					
08-24-2016 03:04 PM

Anything you can do via the browser can be done by making calls to the NiFi API. You could either set up an external process that runs a couple of curl commands to start and then stop the GetTwitter processor in your flow, or you could use a couple of InvokeHTTP processors in your dataflow (configured with the cron scheduling strategy) to start and stop the GetTwitter processor on a given schedule.

Matt
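
As a sketch of what such an API call might look like from a script, the Python below builds (but does not send) a request that changes a processor's run state. It assumes the `PUT /nifi-api/processors/{id}/run-status` endpoint found in later NiFi 1.x releases; the REST API in NiFi 0.x is different, so check the REST API documentation for your version. The base URL, processor id, and revision version shown are placeholders.

```python
import json
import urllib.request


def build_run_status_request(base_url, processor_id, version, state):
    """Build a PUT request that sets a processor's state (RUNNING or STOPPED)."""
    # Assumed body shape: the component's current revision plus the target state.
    body = {"revision": {"version": version}, "state": state}
    return urllib.request.Request(
        url=f"{base_url}/nifi-api/processors/{processor_id}/run-status",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values for illustration only.
start = build_run_status_request("http://localhost:8080", "processor-id", 0, "RUNNING")
stop = build_run_status_request("http://localhost:8080", "processor-id", 1, "STOPPED")
# To actually send: urllib.request.urlopen(start)
```

The revision version has to match the processor's current revision, so a real script would first read the processor and reuse the version it gets back.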
 
						
					