Member since 02-08-2019

28 Posts
2 Kudos Received
1 Solution

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3889 | 06-13-2019 12:19 AM |
09-24-2019 01:46 AM (1 Kudo)
@eMazarakis, later releases do not support the asterisk either; it is treated as a literal. The expressions that are available are listed in the chapter 'To drop or alter multiple partitions'.

Previously, I was referring to the intention behind "part_col='201801*'": it suggests that the desired outcome of this expression is to remove all data from January 2018 in one operation. Since that is not possible in CDH 5.9, I was proposing to choose a different partition strategy if multiple partitions have to be dropped frequently and the size of the data allows it. For example, if only one analytic query is executed on the data after ingestion, the days have to be dropped one by one, which is 32 operations (31 drops plus the query). With the table partitioned by month instead, the number of operations could be reduced to 2.
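To make the comparison concrete, here is a minimal sketch, assuming a hypothetical day-partitioned table events (and a month-partitioned counterpart events_by_month) with string partition keys; the comparison-expression form assumes a Hive release recent enough to support it:

# CDH 5.9: each day of January 2018 must be dropped separately (31 statements).
hive -e "ALTER TABLE events DROP PARTITION (part_col='20180101');"
# ...repeat for '20180102' through '20180131'.

# Later Hive releases accept comparison expressions, covering the month in one go:
hive -e "ALTER TABLE events DROP PARTITION (part_col >= '20180101', part_col <= '20180131');"

# With a month-partitioned table, the cleanup is a single statement even on CDH 5.9:
hive -e "ALTER TABLE events_by_month DROP PARTITION (part_col='201801');"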
09-20-2019 10:11 AM
@hores, regarding "So it looks like column specific is only on a table without partitions (non-incremental)": that's incorrect. Non-incremental COMPUTE STATS works on partitioned tables and is generally the preferred method for collecting stats on partitioned tables. We've generally tried to steer people away from incremental stats because of the size issues on large tables. Column-specific incremental stats would also be error-prone to use correctly and complex to implement: what happens if you compute incremental stats with different subsets of the columns? You can end up with different subsets of the columns on different partitions and then you have to somehow reconcile it all each time.
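For illustration, a hedged sketch with a made-up partitioned table sales (partitioned by year); the column list on COMPUTE STATS assumes an Impala release recent enough to support it:

# Non-incremental stats work fine on a partitioned table and can be
# restricted to specific columns:
impala-shell -q "COMPUTE STATS sales (customer_id, amount);"

# Incremental stats are stored per partition, which is the source of the
# size issues on large tables:
impala-shell -q "COMPUTE INCREMENTAL STATS sales PARTITION (year=2019);"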
08-14-2019 05:27 PM
For number 2: ANY change made outside of Impala requires INVALIDATE METADATA, except when only new data was added, in which case REFRESH will do.

Work is underway to improve this: https://issues.apache.org/jira/browse/IMPALA-3124

Cheers,
Eric
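As a quick sketch (the database and table names are placeholders):

# After any change made outside Impala (new table, schema change, ...):
impala-shell -q "INVALIDATE METADATA db_name.tablename;"

# If only new data files were added to an existing table, the cheaper
# REFRESH is enough:
impala-shell -q "REFRESH db_name.tablename;"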
08-09-2019 07:09 PM (1 Kudo)
Hi,

"HiveServer2 Enable Impersonation is set to TRUE" is probably the reason. When impersonation is on, Hive impersonates the end user who runs the query when submitting jobs. Your ACL output showed that the directory is owned by "hive:hive" and, as @Tomas79 found out, you have the sticky bit set. So when Hive impersonates the end user, that user will not be able to delete the path, as he/she is not the owner. If impersonation is OFF, HS2 runs the query as the "hive" user (the user that runs the HS2 process), and you should not see this issue.

I assume you have no Sentry? Sentry requires impersonation to be OFF on the HS2 side, so that all queries run under the "hive" user.

To test the theory, remove the sticky bit on this path and drop again in Hive.

Cheers,
Eric
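A sketch of that test, assuming a hypothetical warehouse path and that HDFS chmod accepts the symbolic "t" mode (a trailing "t" in the listed permissions is the sticky bit):

# Inspect the directory; a mode like drwxrwxr-t indicates the sticky bit:
hdfs dfs -ls -d /user/hive/warehouse/mydb.db/mytable

# Remove the sticky bit, then retry the DROP TABLE in Hive:
hdfs dfs -chmod o-t /user/hive/warehouse/mydb.db/mytable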
06-13-2019 12:19 AM
@Consult I found the solution. The sqoop command creates a YARN application of type MAPREDUCE. So if we only kill the processes through the unix shell, this YARN application will continue to run in the background. Instead, from Cloudera Manager, we go to YARN --> Applications and kill the YARN application there.
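For reference, the same can be done from the command line; the application id below is illustrative and would come from the -list output:

# List running applications to find the MAPREDUCE job started by sqoop:
yarn application -list
# Kill it by its application id:
yarn application -kill application_1560000000000_0001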
03-15-2019 04:56 AM
Dear @AnisurRehman,

You can import data from an RDBMS to HDFS with Sqoop. If you then want to work with this table through impala-shell, you only need to run the following command from a machine where Impala is installed:

impala-shell -d db_name -q "INVALIDATE METADATA tablename";

You have to run INVALIDATE METADATA because your table is new to the Impala daemon's metadata. If you later append new data files to the existing tablename table, you only need a refresh:

impala-shell -d db_name -q "REFRESH tablename";

REFRESH is enough because you do not need to reload the whole metadata for the table, only the block locations of the new data files. After that you can query the table through impala-shell and the Impala query editor.
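For completeness, a minimal Sqoop import sketch; the JDBC URL, credentials, and paths are hypothetical placeholders:

# Import one table from a relational database into HDFS:
sqoop import \
  --connect jdbc:mysql://dbhost/db_name \
  --username etl \
  --password-file /user/etl/.db_password \
  --table tablename \
  --target-dir /user/db_name/tablename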