Member since 07-17-2016

23 Posts
1 Kudos Received
1 Solution
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 4747 | 05-08-2018 06:11 AM |
			
    
	
		
		
05-12-2018 01:10 PM
@adam chui
We are not able to execute piped commands in the ExecuteStreamCommand processor, such as `cat <filename> | grep <search string>` or `ls | wc -l`. But you can use the QueryRecord processor and write a SQL query (to filter, count, etc.); the query will be executed against the contents of the flowfile. Take a look at this link for more details on the QueryRecord processor.
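As a rough sketch of the QueryRecord alternative (the `message` column name and the configured record reader/writer are assumptions, not from the post), the two piped commands above would map to dynamic properties on QueryRecord holding SQL like:

```sql
-- grep equivalent: keep only rows whose (assumed) "message" column
-- contains the search string
SELECT * FROM FLOWFILE WHERE message LIKE '%search string%'

-- "ls | wc -l"-style count: number of records in the flowfile
SELECT COUNT(*) AS record_count FROM FLOWFILE
```

Each query goes into its own dynamic property, and the property name becomes an outbound relationship of the processor.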
						
					
			
    
	
		
		
05-10-2018 11:40 PM
@adam chui
Sure. I have created a directory called nifi_test in the /tmp directory:

[bash tmp]$ mkdir nifi_test
[bash tmp]$ cd nifi_test/
[bash nifi_test]$ touch test.txt
[bash nifi_test]$ touch test1.txt
[bash nifi_test]$ touch test2.txt
[bash nifi_test]$ ll
total 0
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test1.txt
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test2.txt
-rw-r--r-- 1 nifi nifi 0 May 10 19:16 test.txt

Make sure NiFi has access to pull the files in the directory. Let's assume you have a dynamically generated directory attribute with the value /tmp/nifi_test/ in the middle of the flow, and we need to fetch all the files that are in /tmp/nifi_test.

Flow:

GenerateFlowFile configs:
I have added a new property:
directory
/tmp/nifi_test
Now I have a flowfile with a directory attribute whose value is /tmp/nifi_test.

ExecuteStreamCommand configs:
Now I'm passing the directory attribute as a command argument and listing all the files in the directory (/tmp/nifi_test).

SplitText configs:
When you have more than one file in the directory, use this processor to split the listing into individual flowfiles. Change the below property value:
Line Split Count
1

ExtractText configs:
We need to dynamically pull all the files from the directory, so in the ExtractText processor add a new property:
filename
(.*)
In this processor we extract the flowfile content and keep it in the filename attribute. Now we have the directory and the filenames in that directory as attributes.

FetchFile configs:
In the File to Fetch property we use the directory and filename attributes to fetch the file(s) from the directory; in the flow screenshot at the end you can see the 3 files got fetched from the directory.

By following this approach we are able to pull files in the middle of the flow. I have attached my flow.xml; save/upload the xml to your NiFi instance and test it out.
fetch-files-189935.xml
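The ExecuteStreamCommand → SplitText → ExtractText → FetchFile chain above is essentially a list-then-fetch loop; a rough shell equivalent (a sketch only, outside NiFi, reusing the /tmp/nifi_test example directory) would be:

```shell
#!/bin/sh
# Sketch: mimics the flow's list-then-fetch pattern outside NiFi.
dir=/tmp/nifi_test
mkdir -p "$dir"
touch "$dir/test.txt" "$dir/test1.txt" "$dir/test2.txt"

# ExecuteStreamCommand step: list all files in the directory.
# SplitText step: the while-read loop consumes one filename per line.
ls "$dir" | while read -r filename; do
  # FetchFile step: resolve ${directory}/${filename} and read the file.
  wc -c < "$dir/$filename"   # prints 0 for each empty file
done
```

The directory attribute plays the role of `$dir` and the extracted filename attribute plays the role of `$filename` in FetchFile's File to Fetch property.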
						
					
    
	
		
		
05-08-2018 06:11 AM
		
					
This is fixed by doing the below:

Set the default encoding using the JAVA_TOOL_OPTIONS environment variable in nifi-env.sh (currently this one is not implemented):

export JAVA_TOOL_OPTIONS=-Dfile.encoding=utf8

Add the default encoding parameter to NiFi's bootstrap.conf file:

java.arg.8=-Dfile.encoding=UTF8

Of course, adjust the argument's number according to your configuration. And in nifi-env.sh:

export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
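For context, the argument number simply continues the existing java.arg.N sequence in conf/bootstrap.conf; a sketch of where the line lands (the surrounding heap arguments are illustrative defaults, not taken from the post):

```
# conf/bootstrap.conf (excerpt)
java.arg.2=-Xms512m
java.arg.3=-Xmx512m
# ...existing java.arg.N entries...
java.arg.8=-Dfile.encoding=UTF8
```

If java.arg.8 is already taken in your file, pick the next unused number.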
						
					
    
	
		
		
05-06-2018 05:54 PM
1 Kudo
					
@adam chui As per the documentation, the SelectHiveQL processor states that if it is triggered by an incoming FlowFile, the attributes of that FlowFile will be available when evaluating the select query, and the FlowFile attribute 'selecthiveql.row.count' indicates how many rows were selected. You can file a jira addressing the issue you found.

As a workaround, you can use the PutDistributedMapCache processor to keep all your attributes in a cache server and fetch them back with the FetchDistributedMapCache processor.

Sample Flow Example:

GenerateFlowFile:
I'm using a GenerateFlowFile processor and adding 3 attributes to the flowfile:
attr1 = 56
attr2 = 67
attr3 = 89

ReplaceText:
Search Value: (?s)(^.*$)
Replacement Value: ${allAttributes("attr1","attr2","attr3"):join("|")} //joins all the attributes with a "|" pipe delimiter (the output flowfile content will be 56|67|89)
Maximum Buffer Size: 1 MB //increase the size if the content is more than 1 MB
Replacement Strategy: Always Replace
Evaluation Mode: Entire text
Use this link to evaluate multiple attributes.

UpdateAttribute:
We use this processor to change the filename of the flowfile to its UUID, because we cannot use the UUID itself as the cache identifier: the output of the SelectHiveQL processor has the same filename but a different UUID (i.e. the flowfile has one unique id until the SelectHiveQL processor and a different uuid after it). Add a new property:
filename = ${UUID()}

PutDistributedMapCache:
Configure DistributedMapCacheServer and DistributedMapCacheClientService and enable them (change the cache's maximum number of entries as per your needs; if no persistence directory is mentioned, all the entries will be stored in memory).
Cache Entry Identifier: $(unknown)

SelectHiveQL:
Feed the success relation to the SelectHiveQL processor; once the processor outputs a flowfile with the query results as its content, feed the success relationship to FetchDistributedMapCache.

FetchDistributedMapCache configs:
Cache Entry Identifier: ${filename:substringBefore('.')} //because based on the output flowfile format we are going to have .avro/.csv extensions
Distributed Cache Service: DistributedMapCacheClientService
Put Cache Value In Attribute: cache_value //the cached content will be put into this attribute instead of the flowfile content
Max Length To Put In Attribute: 256 //increase the value if the cached content is longer than this

Output:
The flowfile will have an attribute called cache_value, and you can rebuild all your attributes by using the getDelimitedField function:
${cache_value:getDelimitedField(1, '|')} //will give the attr1 field (56)
Even without rebuilding all the attributes, you can use the above expression language to pull just the required attribute value directly and use it in your flow.

I have attached my sample flow.xml below; save and upload the xml to your instance and change it as per your needs.
selecthiveql-attributes-188338.xml

If the answer helped to resolve your issue, click on the Accept button below to accept the answer; that would be a great help to community users to find solutions quickly for these kinds of issues.
						
					