Member since 
    
	
		
		
		03-26-2024
	
	
	
	
	
	
	
	
	
	
	
	
	
	
			
      
                35
            
            
                Posts
            
        
                18
            
            
                Kudos Received
            
        
                0
            
            
                Solutions
            
        
			
    
	
		
		
		04-26-2024
	
		
		03:04 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
							 Here is the configuration for my MergeContent Processor        
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-25-2024
	
		
		11:01 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
I am fetching files from an HDFS path, merging them with a MergeContent processor, and transferring the result to an SFTP server with a PutSFTP processor. The path currently holds 20 files totaling 1.2 GB (in production the total may reach roughly 300 GB).

Initially, MergeContent merged the first 1 GB (14 of the 20 files) and transferred the result to the SFTP server. It then picked up the remaining 0.2 GB (the other 6 files) and transferred a second file.

After I raised the queue size limit on MergeContent's incoming connection to 2 GB, it merged all 20 files and copied a single 1.2 GB file.

In another flow, FetchHDFS-->PutSFTP copies a single file larger than 100 GB to the SFTP server with the Back Pressure Size Threshold set to 1 GB, and that works. I am wondering why the same threshold does not work with the MergeContent processor.

Could you please advise on the appropriate configuration settings for the MergeContent processor? Every day, my total file size may vary from 10 GB to 300 GB.
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
							Labels:
						
						
		
			
	
					
			
		
	
	
	
	
				
		
	
	
						
							
		
			Apache NiFi
			
    
	
		
		
		04-23-2024
	
		
		07:56 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Thank you, @MattWho, for providing timely responses and quick solutions to the queries. You are really helping the community grow. Hats off to you. I appreciate it.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-22-2024
	
		
		11:56 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi @MattWho

Apologies! I realized that the failed FlowFiles have the same 'filename' attribute as another FlowFile that has already been transferred to the target SFTP server.

There are 20 files in my HDFS path. Once a file is fetched, I update its 'filename' attribute using the expression below. However, one or two files end up with the same 'filename', which causes the error 'failed to rename dot-file':

${ExtractName}_${now():format('yyyyMMddHHmmssSSS')}.txt

Thank you for the input that helped identify the root cause of this issue. Could you please suggest the best approach to avoid this scenario, since even with millisecond precision the filenames collide?

As you suggested earlier, can we go with the "Run Duration" setting for the PutSFTP processor?

Thank you
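The collision described above can be reproduced outside NiFi. The following is a minimal Python sketch, not NiFi code: `timestamp_name` mimics the expression from the post, and `unique_name` is a hypothetical variant that appends a per-FlowFile identifier. In NiFi terms, the counterpart might be appending `${UUID()}` or the FlowFile's `uuid` attribute to the expression, but that is an assumption about the fix, not something this post confirms.

```python
import uuid
from datetime import datetime


def timestamp_name(extract_name: str, now: datetime) -> str:
    """Mimic ${ExtractName}_${now():format('yyyyMMddHHmmssSSS')}.txt."""
    millis = now.microsecond // 1000
    return f"{extract_name}_{now:%Y%m%d%H%M%S}{millis:03d}.txt"


def unique_name(extract_name: str, now: datetime, flowfile_id: str) -> str:
    """Hypothetical variant: add a per-FlowFile identifier before the extension."""
    stem = timestamp_name(extract_name, now)[: -len(".txt")]
    return f"{stem}_{flowfile_id}.txt"


# Assume all 20 FlowFiles are renamed within the same millisecond.
same_instant = datetime(2024, 4, 22, 23, 56, 0, 123000)

plain = {timestamp_name("extract", same_instant) for _ in range(20)}
unique = {
    unique_name("extract", same_instant, str(uuid.uuid4())) for _ in range(20)
}

# The timestamp-only names collapse to one value; the suffixed names stay distinct.
print(len(plain), len(unique))
```

A timestamp only separates names when FlowFiles are processed at different instants, so any identifier that is unique per FlowFile avoids depending on timing.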
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-22-2024
	
		
		11:49 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi @MattWho

Thanks again for your suggestions.

The failures always occur during the renaming of dot files.

There is no process consuming the file once it is placed on the SFTP server, and there is no chance that another process is consuming the dot files. None of the queued FlowFiles have the same 'filename' attribute as another FlowFile or a file already present on the target SFTP server.

Unfortunately, we don't have permission to view the SFTP logs on our Linux server. I will connect with the admin team to obtain sample logs.

Please find below my PutSFTP processor configuration.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-18-2024
	
		
		12:01 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi @MattWho

Thank you for the great suggestions and super helpful information. Here are the results of what I tried:

1) I set the Run Schedule to 0 seconds and stopped the PutSFTP processor. After all 20 FlowFiles were queued up, I started it again. Result: 1 of the 20 FlowFiles failed.
2) I set the Run Schedule to 0 seconds and let the flow run with all processors started (here too, all 20 FlowFiles arrived at almost the same time). Result: 2 of the 20 FlowFiles failed.
3) I changed the Run Schedule of the PutSFTP processor from 0 seconds to 30 seconds. Result: no failures, all 20 FlowFiles passed.
4) I changed the "Run Duration" to 500ms. Result: no failures, all 20 FlowFiles passed.

Could you please suggest the best approach to address this scenario: option 3 or option 4?
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-18-2024
	
		
		02:25 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi @TimothySpann

If we introduce a "Run Schedule" delay for the PutSFTP processor, will it help fix this issue? I mean changing the "Run Schedule" from 0 seconds to 30 seconds or so.

There is no network delay. Regarding the RAM size, we are yet to hear back from the platform team.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-15-2024
	
		
		04:40 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi @TimothySpann

Please find below the requested info.

Operating system: Linux

Java version:
    NiFi server:
        openjdk version "11.0.22" 2024-01-16 LTS
        OpenJDK Runtime Environment (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS)
        OpenJDK 64-Bit Server VM (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS, mixed mode, sharing)
    SFTP server:
        openjdk version "1.8.0_402"
        OpenJDK Runtime Environment (build 1.8.0_402-b06)
        OpenJDK 64-Bit Server VM (build 25.402-b06, mixed mode)

NiFi version:
    Cloudera Flow Management (CFM) 2.1.5.1027
    1.18.0.2.1.5.1027-2 built 02/09/2023 22:16:12 CST
    Tagged nifi-1.18.0-RC4
    Powered by Apache NiFi 1.18.0

File system type: HDFS

SFTP server version: OpenSSH_7.4p1, OpenSSL 1.0.2k-fips

Type and size of the HDFS files:
    I am trying to transfer files ranging in size from 300KB to 800KB. Typically, my HDFS path contains a total of 20 files. In production, the file sizes may vary from 300MB to 600MB, and the total file count would still be 20.

I am running on a NiFi cluster.

I am intermittently facing the failure 'Failed to rename dot-file.' for one or two files while performing PutSFTP.

The first PutSFTP processor transfers the actual file, and the second one transfers the stats file corresponding to that file (file name, size, row count, etc.).

I could limit the second PutSFTP processor to a single transfer containing the details of all 20 files, i.e., one stats file covering all 20 files. Can we store this info in a variable line by line and then send it at the end?

FileName~RowCount~FileSize
file1~100~1250
file2~200~3000

The above would also satisfy my requirement, instead of multiple stats files for the second PutSFTP processor.

Could you please provide some inputs on this issue?

Thank you
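As an illustration of the single stats file described above, here is a minimal Python sketch that builds the `~`-delimited report from per-file records. The `FileStats` type and `build_stats_file` function are hypothetical names used only for this example. Inside NiFi, one possible counterpart is merging the per-file stats FlowFiles with MergeContent using a header and a newline demarcator, but that is an assumption about the flow rather than something confirmed in this thread.

```python
from typing import Iterable, NamedTuple


class FileStats(NamedTuple):
    """One row of the stats report (hypothetical record type)."""
    file_name: str
    row_count: int
    file_size: int


def build_stats_file(stats: Iterable[FileStats], delimiter: str = "~") -> str:
    """Return the report text: a header line followed by one line per file."""
    lines = [delimiter.join(["FileName", "RowCount", "FileSize"])]
    for s in stats:
        lines.append(
            delimiter.join([s.file_name, str(s.row_count), str(s.file_size)])
        )
    return "\n".join(lines) + "\n"


report = build_stats_file(
    [FileStats("file1", 100, 1250), FileStats("file2", 200, 3000)]
)
print(report, end="")
```

Collecting the rows into one report means the second PutSFTP transfer happens once per batch instead of once per file, which also reduces the number of concurrent dot-file renames on the server.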
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-10-2024
	
		
		02:48 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
My requirement is to retrieve the total number of files in a given HDFS directory and, based on that number, proceed with the downstream flow.

I cannot use the ListHDFS processor because it does not allow inbound connections. The GetHDFSFileInfo processor generates a FlowFile for each HDFS file, which causes every downstream processor to execute once per file.

I have observed that ExecuteStreamCommand can invoke a script that runs HDFS commands to get the number of files. I would like to know whether the count can be obtained without a script, or whether any other option is available.
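If the ExecuteStreamCommand route is taken after all, the script can stay small. The sketch below assumes the `hdfs` CLI is on the PATH and relies on `hdfs dfs -count <path>`, which prints the columns DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The function names and the `/data/incoming` path are hypothetical, and note that the reported file count is recursive, so it includes files in subdirectories.

```python
import subprocess


def parse_count_output(line: str) -> dict:
    """Parse one line of `hdfs dfs -count` output into named fields."""
    dir_count, file_count, content_size, path = line.split(None, 3)
    return {
        "dir_count": int(dir_count),
        "file_count": int(file_count),
        "content_size": int(content_size),
        "path": path.strip(),
    }


def count_hdfs_files(path: str) -> int:
    """Run `hdfs dfs -count` and return the (recursive) file count."""
    result = subprocess.run(
        ["hdfs", "dfs", "-count", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_count_output(result.stdout.strip())["file_count"]


# Parsing a sample line only; no cluster is contacted here.
sample = "           1           20         1288490188 /data/incoming"
print(parse_count_output(sample)["file_count"])
```

Only `parse_count_output` is exercised in the usage line; `count_hdfs_files` requires a reachable cluster and would be the part invoked from the flow.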
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
							Labels:
						
						
		
			
	
					
			
		
	
	
	
	
				
		
	
	
						
							
		
			Apache NiFi
						
							
		
			HDFS
			
    
	
		
		
		04-09-2024
	
		
		04:33 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
1) Initially, I faced the "NiFi PutSFTP failed to rename dot file" issue only when the child process group was configured with "Outbound Policy = Batch Output". It worked without the child process group.
2) I set the PutSFTP failure retry attempts to 3, and that fixed the issue.
3) Later, I introduced a RouteOnAttribute after the FetchHDFS processor for some internal logic, and the PutSFTP error started again.
4) This time, I changed the "Run Schedule" of the PutSFTP processor from 0 sec to 3 sec, and that fixed the issue again.
5) I have a requirement to transfer stats for each file (file name, row count, file size, etc.), so I introduced one more PutSFTP processor, and the issue reappeared.
6) Finally, I made the following changes to my PutSFTP processors:
    a) Set the PutSFTP failure retry attempts to 3 on both processors.
    b) Changed the "Run Schedule" of the first PutSFTP processor to 7 sec.
    c) Changed the "Run Schedule" of the second PutSFTP processor to 10 sec.

Now it is working fine. Are we getting this issue because 20 FlowFiles are processed at the same time? Could you please advise whether this is the right way to fix the "NiFi PutSFTP failed to rename dot file" issue?
						
					
				
			
			
			
			
			
			
			
			
			
		 
        






