Member since: 04-29-2016

- 192 Posts
- 20 Kudos Received
- 2 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 2224 | 07-14-2017 05:01 PM |
|  | 3892 | 06-28-2017 05:20 PM |

09-25-2019 10:23 AM

The question posted is not a hypothetical one; it is a real use case. FYI, here is another thread related to partial file consumption: https://stackoverflow.com/questions/45379729/nifi-how-to-avoid-copying-file-that-are-partially-written. That thread does not suggest the OS automatically takes care of this. The solution proposed there is to add a wait time between ListFile and FetchFile, but in our case the requirement is to wait for an indicator file before we start file ingestion.

09-25-2019 09:21 AM

Hello All,

We're using Apache NiFi 1.0.1 (I know we're way behind in upgrading). Our use case is to get files from a local NiFi server mount and write them to HDFS; we're using ListFile and FetchFile to achieve this. Some files are huge, so the concern is that NiFi might start to fetch files before they're completely written to the mount, which would cause partial file loads in HDFS. The proposed solution is that the source system sends us an indicator file (located in a different directory) with a specific name; once we receive that file, we should start fetching the files with the FetchFile processor.

So the question is: how do we build the NiFi dataflow so that FetchFile only starts after the indicator file is received? Do you have any suggestions on how to achieve this?

Thanks in advance.

Labels: Apache NiFi
			
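For illustration, here is a minimal standalone Python sketch of the indicator-file gating pattern described in the post above. It is not a NiFi flow, and the directory paths and indicator filename are hypothetical. The idea is to have ListFile watch a staging directory that completed files are only moved into after the indicator appears, so FetchFile never sees a half-written file.

```python
import os
import shutil
import time

DATA_DIR = "/mnt/source/data"          # hypothetical: where the source system writes files
INDICATOR = "/mnt/source/flags/_DONE"  # hypothetical: indicator dropped when the batch is complete
STAGING_DIR = "/mnt/source/ready"      # hypothetical: the directory ListFile would actually watch

def wait_for_indicator(poll_seconds=30):
    """Block until the source system drops the indicator file."""
    while not os.path.exists(INDICATOR):
        time.sleep(poll_seconds)

def release_batch():
    """Move the completed batch into the watched directory, then reset the indicator."""
    wait_for_indicator()
    for name in os.listdir(DATA_DIR):
        shutil.move(os.path.join(DATA_DIR, name), os.path.join(STAGING_DIR, name))
    os.remove(INDICATOR)  # reset for the next batch

if __name__ == "__main__":
    release_batch()
```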

03-14-2018 02:10 AM

					
@Pranay Vyas The Hive Export/Import worked well for us. Thanks.

02-21-2018 04:00 PM

Hello,

We have the following Hive migration scenario where there are several variables/changes; we need to migrate Hive data from the source cluster to the target cluster:

| Source | Target |
|---|---|
| Cluster A | Cluster B |
| HDP 2.5.3 | HDP 2.6.2 |
| Hive metastore DB - MySQL | Hive metastore DB - Oracle |
| Has 7 databases to migrate | No existing data to preserve |

Both clusters are on the same network, and both have HDP running. What's the most efficient way to migrate the existing Hive data to the new cluster?

Thanks.

Labels: Apache Hive
			
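The 03-14-2018 reply above notes that Hive Export/Import worked for this migration. For illustration, here is a rough Python sketch of that approach, driving beeline and distcp per table. The JDBC URLs, staging path, and database/table names are hypothetical, and the loop over all seven databases is left out.

```python
import subprocess

# Hypothetical connection details; adjust to your clusters.
SRC_JDBC = "jdbc:hive2://source-cluster:10000/default"
DST_JDBC = "jdbc:hive2://target-cluster:10000/default"
EXPORT_ROOT = "/tmp/hive_export"  # staging directory on each cluster's HDFS

def beeline(jdbc_url, sql):
    """Run a single HiveQL statement through beeline."""
    subprocess.run(["beeline", "-u", jdbc_url, "-e", sql], check=True)

def migrate_table(db, table):
    # 1. Export table data + metadata to HDFS on the source cluster.
    beeline(SRC_JDBC, f"EXPORT TABLE {db}.{table} TO '{EXPORT_ROOT}/{db}/{table}';")
    # 2. Copy the export between clusters with distcp.
    subprocess.run([
        "hadoop", "distcp",
        f"hdfs://source-cluster{EXPORT_ROOT}/{db}/{table}",
        f"hdfs://target-cluster{EXPORT_ROOT}/{db}/{table}",
    ], check=True)
    # 3. Import on the target cluster; this recreates the table from the metadata.
    beeline(DST_JDBC, f"IMPORT TABLE {db}.{table} FROM '{EXPORT_ROOT}/{db}/{table}';")

if __name__ == "__main__":
    migrate_table("sales_db", "orders")  # example database/table names
```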

10-23-2017 04:35 PM

@mkalyanpur How would this be different for a Kerberized HDP environment? I'm having a lot of trouble connecting to Kerberized HDP 2.5 and 2.6 from NiFi 1.2.0; both PutHiveStreaming and PutHiveQL are failing. For PutHiveQL, here is the detail on the error I get: https://community.hortonworks.com/questions/142110/nifi-processor-puthiveql-cannot-connect-to-kerberi.html. For PutHiveStreaming, I get the error below:

2017-10-23 11:32:23,841 INFO [put-hive-streaming-0] hive.metastore Trying to connect to metastore with URI thrift://server.domain.com:9083
2017-10-23 11:32:23,856 INFO [put-hive-streaming-0] hive.metastore Connected to metastore.
2017-10-23 11:32:23,885 INFO [Timer-Driven Process Thread-7] hive.metastore Trying to connect to metastore with URI thrift://server.domain.com:9083
2017-10-23 11:32:23,895 INFO [Timer-Driven Process Thread-7] hive.metastore Connected to metastore.
2017-10-23 11:32:24,730 WARN [put-hive-streaming-0] o.a.h.h.m.RetryingMetaStoreClient MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Internal error processing open_txns
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_open_txns(ThriftHiveMetastore.java:3834)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.open_txns(ThriftHiveMetastore.java:3821)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openTxns(HiveMetaStoreClient.java:1841)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
	at com.sun.proxy.$Proxy231.openTxns(Unknown Source)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl$1.run(HiveEndPoint.java:525)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.openTxnImpl(HiveEndPoint.java:522)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:504)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:461)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.access$500(HiveEndPoint.java:243)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl$2.run(HiveEndPoint.java:332)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl$2.run(HiveEndPoint.java:329)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:328)
	at org.apache.nifi.util.hive.HiveWriter.lambda$nextTxnBatch$2(HiveWriter.java:259)
	at org.apache.nifi.util.hive.HiveWriter.lambda$null$3(HiveWriter.java:368)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.nifi.util.hive.HiveWriter.lambda$callWithTimeout$4(HiveWriter.java:368)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
The strange thing is, using the same core-site, hdfs-site, and hive-site config files and the same principal and keytab, NiFi can connect to HDFS and HBase without any issues; it's only the Hive connection that errors. Even a sample Java program that connects to Hive using the Kerberos principal and keytab works fine. Thanks for your time.
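The sample Java program mentioned above is not included in the post. For illustration only, a rough Python equivalent of such a connectivity check against a Kerberized HiveServer2 might look like the sketch below; it uses the PyHive library (an assumption, not part of the original post) and assumes a valid Kerberos ticket has already been obtained with kinit. Note the error above is from the metastore/streaming path, whereas this only verifies the HiveServer2 side.

```python
from pyhive import hive  # assumed dependency: PyHive with SASL/Kerberos support

# Hypothetical host; a ticket must exist first, e.g.:
#   kinit -kt /etc/security/keytabs/nifi.keytab nifi@EXAMPLE.COM
conn = hive.Connection(
    host="server.domain.com",
    port=10000,
    auth="KERBEROS",
    kerberos_service_name="hive",  # service part of HiveServer2's principal
)

# Run a trivial query to confirm the Kerberized connection works.
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
for (db,) in cursor.fetchall():
    print(db)
conn.close()
```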

10-20-2017 02:54 PM

@Andrew Lim Thanks for clarifying further.

10-20-2017 02:52 PM

@Abdelkrim Hadjidj Thanks for clarifying. It would have been nicer if controller services were accessible throughout the UI, regardless of where they were created.

10-20-2017 02:44 PM

Hello,

In our NiFi instance (1.2.0), we're finding that controller services created from the Controller Settings menu (top right corner of the UI) are not visible or accessible when you look for them through a processor. For example, after a HiveConnectionPool controller service is created through the Controller Settings menu, it does not show up in PutHiveQL's "Hive Database Connection Pooling Service" dropdown values. Conversely, when a new controller service is created by selecting the "Create new service..." option from that dropdown in the PutHiveQL processor, it does not show up in the controller services listing (accessed through the Controller Settings menu).

It seems like this has something to do with user access permissions in the UI; if so, how can it be corrected? I'm not familiar with the user access permission settings; our admin handles that.

Thanks.

Labels: Apache NiFi

10-16-2017 04:33 PM

Hello,

My requirement is to overwrite (or delete prior to the import) the existing data in an HCatalog table during a Sqoop import. It appears the --hive-overwrite and --delete-target-dir arguments don't work for this purpose.

Any suggestions on how to do this?

Thanks.

Labels: Apache HCatalog, Apache Sqoop
			
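For illustration, one common workaround is to clear the table yourself before the import, since the Hive-import overwrite options don't take effect for HCatalog jobs (consistent with what the post observes). Below is a rough Python sketch of that two-step approach; the JDBC URL, credentials path, and database/table names are hypothetical, and it assumes a managed (non-external) Hive table.

```python
import subprocess

# Hypothetical connection details and table names.
JDBC_URL = "jdbc:mysql://dbhost:3306/sales"
HCAT_DB = "warehouse"
HCAT_TABLE = "orders"

def truncate_then_import():
    # Step 1: clear existing data, since --hive-overwrite doesn't apply here.
    # TRUNCATE only works on managed tables.
    subprocess.run(
        ["hive", "-e", f"TRUNCATE TABLE {HCAT_DB}.{HCAT_TABLE};"],
        check=True,
    )
    # Step 2: run the Sqoop import into the now-empty HCatalog table.
    subprocess.run(
        [
            "sqoop", "import",
            "--connect", JDBC_URL,
            "--username", "etl_user",
            "--password-file", "/user/etl/.sqoop.pw",
            "--table", "orders",
            "--hcatalog-database", HCAT_DB,
            "--hcatalog-table", HCAT_TABLE,
        ],
        check=True,
    )

if __name__ == "__main__":
    truncate_then_import()
```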

08-09-2017 04:46 PM

@mel mendoza In my case, after splitting the files, I was doing further processing on the split files; but if your requirement is to store/write the split files, you could use PutFile or PutHDFS to write them to the local file system or HDFS.