Member since 10-16-2013
307 Posts, 77 Kudos Received, 59 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 12382 | 04-17-2018 04:59 PM |
| | 7626 | 04-11-2018 10:07 PM |
| | 4378 | 03-02-2018 09:13 AM |
| | 24552 | 03-01-2018 09:22 AM |
| | 3379 | 02-27-2018 08:06 AM |
			
    
	
		
		
10-06-2017 04:19 PM (1 Kudo)

Thanks for following up with the solution.

Sorry for the pain; I understand this behavior is somewhat user-unfriendly. The explanation for it goes like this:

Column names are generally case-insensitive from the Impala SQL perspective, but HDFS file paths are case-sensitive. So it could cause confusion if you had paths like these in HDFS:

YEAR=2000/MONTH=1
year=2000/month=1
Year=2000/Month=1

Are they different partitions? All the same partition? Can one partition point to multiple directories? You see where I am going :). It's just easier to accept one canonical casing.
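
To make the ambiguity concrete, here is a minimal Python sketch that groups partition directory paths by their lower-cased key names and reports the groups that would collide. This is only an illustration, not something Impala itself runs: the helper names `canonical_partition_path` and `find_case_collisions` are made up, the paths are assumed to be relative to the table directory, and lower case is assumed as the canonical casing.

```python
from collections import defaultdict


def canonical_partition_path(path: str) -> str:
    """Lower-case the key of every key=value segment; values are left untouched."""
    parts = []
    for segment in path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        # Segments without '=' are not partition directories, so keep them as-is.
        parts.append(key.lower() + sep + value if sep else segment)
    return "/".join(parts)


def find_case_collisions(paths):
    """Return {canonical path: [original paths]} for paths that differ only in key casing."""
    groups = defaultdict(list)
    for path in paths:
        groups[canonical_partition_path(path)].append(path)
    return {canon: originals for canon, originals in groups.items() if len(originals) > 1}


if __name__ == "__main__":
    listing = [
        "YEAR=2000/MONTH=1",
        "year=2000/month=1",
        "Year=2000/Month=1",
        "year=2001/month=1",
    ]
    for canon, originals in find_case_collisions(listing).items():
        print(canon, "<-", originals)
```

Running something like this over a directory listing before loading data would flag the three differently cased directories above as candidates for the same partition.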
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-06-2017 10:04 AM

Not sure if this is the problem, but you might try using lower case names in the HDFS path, i.e.:

year=2017/month=8/day=2

instead of

YEAR=2017/MONTH=8/DAY=2
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-28-2017 12:45 PM (1 Kudo)

You can do this:

insert overwrite table1 partition(partition_key=1) select * from table1 where partition_key=1;

This process should mostly work as you'd expect.

However, there are a few situations where this may cause problems:
- If you run "refresh" or "invalidate metadata" against that table/partition while the insert is still running, some queries may see missing or duplicate data from that partition (fix by running refresh after the insert).
- Do not run concurrent "insert overwrite" statements against the same partition. You may end up with missing or duplicate data in that partition.

If you can guarantee that the above two situations are not a problem for you, then insert overwrite should work just fine.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-07-2017 09:22 PM (1 Kudo)

Nad1998, that's a different error. It means your 'products' table does not exist or is not visible to Impala (try running 'invalidate metadata products', then retry the query).
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-24-2017 04:01 PM

I'm afraid Impala is not yet able to recognize that only two partitions need to be scanned. We're aware of the gap, and that specific optimization is tracked by:
https://issues.apache.org/jira/browse/IMPALA-2108

For now, you can manually rewrite your query as suggested in the JIRA, as follows:

select id, yyyymmdd, group_id, test from dwh.table where
((id='1a' and yyyymmdd=20170815 and group_id=1) OR (id='2b' and yyyymmdd=20170811 and group_id=2))
AND
((yyyymmdd=20170811 and group_id=2) OR (yyyymmdd=20170815 and group_id=1))

Or, alternatively, use a union:

select id, yyyymmdd, group_id, test from dwh.table where
id='1a' and yyyymmdd=20170815 and group_id=1
union all
select id, yyyymmdd, group_id, test from dwh.table where
id='2b' and yyyymmdd=20170811 and group_id=2
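
In case it is not obvious why the first rewrite returns the same rows: the extra AND clause is implied by the original OR predicate, so it cannot filter out anything the original would have kept. It only restates the yyyymmdd/group_id conditions on their own. Here is a small brute-force check in Python, purely as an illustration. It assumes yyyymmdd and group_id are the partition columns, and the sample values and function names are made up.

```python
from itertools import product


def original(row):
    """The predicate as it would be written without the manual rewrite."""
    id_, yyyymmdd, group_id = row
    return ((id_ == "1a" and yyyymmdd == 20170815 and group_id == 1)
            or (id_ == "2b" and yyyymmdd == 20170811 and group_id == 2))


def partition_only(row):
    """The added clause: references only the (assumed) partition columns."""
    _, yyyymmdd, group_id = row
    return ((yyyymmdd == 20170811 and group_id == 2)
            or (yyyymmdd == 20170815 and group_id == 1))


def rewritten(row):
    """The manually rewritten predicate: original AND partition-only clause."""
    return original(row) and partition_only(row)


# Every combination of a few sample ids, dates and group ids.
rows = list(product(["1a", "2b", "3c"],
                    [20170811, 20170815, 20170816],
                    [1, 2, 3]))

# Both predicates select exactly the same rows.
assert all(original(r) == rewritten(r) for r in rows)

matching = [r for r in rows if rewritten(r)]
# Distinct (yyyymmdd, group_id) pairs that pass the partition-only clause.
partitions = {(r[1], r[2]) for r in rows if partition_only(r)}

if __name__ == "__main__":
    print(matching)
    print(len(partitions), "partitions pass the partition-only clause")
```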
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-23-2017 08:59 PM (2 Kudos)

https://issues.apache.org/jira/browse/IMPALA-1570

That feature has been available since Impala 2.8 (CDH 5.11).
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-19-2017 09:22 AM

HDFS does not know about partitions. That information is stored in the Hive Metastore along with the rest of the table metadata.

A partition of an Impala/Hive table points to a directory in HDFS. The values of partition columns are not stored in the data files; they are "stored" in the HDFS directory structure, e.g.

hdfs://warehouse/mytable/year=2017/month=6

might be a directory of a partitioned table "mytable" with partition columns year and month.
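
As a rough illustration of how partition values live in the directory names, here is a small Python sketch that recovers them from a path like the one above. The function name `partition_values` is made up for this example, and it is not how Impala or the Hive Metastore actually resolve partitions (they read the mapping from the metastore). Values are returned as plain strings.

```python
def partition_values(table_location: str, partition_dir: str) -> dict:
    """Read partition column values from the key=value directories below the table location."""
    if not partition_dir.startswith(table_location):
        raise ValueError("partition directory is not under the table location")
    relative = partition_dir[len(table_location):].strip("/")
    values = {}
    for segment in relative.split("/"):
        key, sep, value = segment.partition("=")
        if sep:  # ignore anything that is not a key=value directory
            values[key] = value
    return values


if __name__ == "__main__":
    print(partition_values("hdfs://warehouse/mytable",
                           "hdfs://warehouse/mytable/year=2017/month=6"))
```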
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-14-2017 06:04 PM

Yes, very likely there will be a performance difference, but it's hard to say which one will be better without concrete examples.
						
					