Member since 07-29-2015

535 Posts
141 Kudos Received
103 Solutions

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 7709 | 12-18-2020 01:46 PM |
| | 5038 | 12-16-2020 12:11 PM |
| | 3840 | 12-07-2020 01:47 PM |
| | 2503 | 12-07-2020 09:21 AM |
| | 1633 | 10-14-2020 11:15 AM |

01-09-2019 12:57 PM

They're used for spill to disk. See https://www.cloudera.com/documentation/enterprise/latest/topics/impala_scalability.html#spill_to_disk
						
					
01-02-2019 10:57 AM

UDFs can do a lot of things because they run with the same privileges as the Impala process. However, doing anything in a UDF beyond the usual computations, like accessing filesystems or external services, can compromise the performance and stability of your system, so you do this at your own risk. In the future we may lock down UDFs more and prevent them from doing things like accessing HDFS.
						
					
12-31-2018 01:54 PM

On CDH 5.15, in most cases they won't hold onto resources in admission control, unless the query isn't cancelled and the client (e.g. Hue) doesn't fetch all of the results.

Enabling the timeouts suggested by Eric helps ensure that queries get cancelled in a timely manner.
						
					
12-28-2018 07:54 AM

							 It's unlikely that the query is executing that long. Most likely the client you are using is delayed in closing the query. 
						
					
12-17-2018 01:04 PM

							I don't know too much about that unfortunately.
						
					
12-17-2018 11:35 AM

Hi @Big

I checked on our latest build and it works for me, see below. Are you sure that you're not trying to query a table with a DATE type column?

[localhost:21000] default> create table foo2 (`date` int);
Query: create table foo2 (`date` int)
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 1.19s
[localhost:21000] default> select distinct `date` from foo2;
Query: select distinct `date` from foo2
Fetched 0 row(s) in 0.12s
						
					
12-08-2018 10:35 AM

1 Kudo

I took a quick look at the Impyla code: rowcount() always returns -1, and the other two methods you mention are not implemented. See https://github.com/cloudera/impyla/

At the moment Impyla isn't officially part of CDH. It was developed by one of our data scientists and open sourced for the benefit of the community, so all of the documentation and so on is just in that GitHub repo.
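
If you need a row count anyway, one workaround is to count the rows as you fetch them. Below is a minimal sketch, not anything that ships with Impyla: `count_rows` is a hypothetical helper that only relies on the standard DB-API `fetchmany` call, and the demo uses `sqlite3` as a stand-in driver so it runs without a cluster. With Impyla you would pass in a cursor obtained from `impala.dbapi.connect(...)` instead.

```python
import sqlite3


def count_rows(cursor, batch_size=1000):
    """Count the rows left in an already-executed DB-API cursor.

    Hypothetical helper: it consumes the result set, so only use it
    when the rows themselves are not needed afterwards.
    """
    total = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        total += len(batch)
    return total


# Stand-in demo with sqlite3 (assumption: any DB-API cursor behaves the
# same way for fetchmany). The table mirrors the foo2 example name only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE foo2 (date INTEGER)")
cur.executemany("INSERT INTO foo2 VALUES (?)", [(1,), (2,), (2,)])
cur.execute("SELECT DISTINCT date FROM foo2")
n_rows = count_rows(cur)
cur.close()
conn.close()
```

If only the number is needed, running a separate `SELECT COUNT(*)` query is usually cheaper than fetching every row to the client.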
						
					
12-08-2018 10:25 AM

Actually, scratch what I just said. That advice applies if the query is stuck in the FINISHED state. If it's stuck in the RUNNING state, it means the query is just taking a long time to produce any results. So you're probably getting a bad query plan on one cluster that is extremely slow to execute, e.g. the join order chosen by the planner is inefficient. Usually computing stats on all the tables will improve the query plan.
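
To make the "compute stats on all the tables" step concrete, here is a rough sketch that issues Impala's `COMPUTE STATS` statement for each table a query touches. The helper `compute_stats_for`, the `RecordingCursor` stand-in, and the table name `sales.orders` are all made up for illustration; in practice you would pass a real DB-API cursor (for example one from Impyla) and your own table names.

```python
def compute_stats_for(cursor, tables):
    """Run COMPUTE STATS for each table name; return the statements issued."""
    issued = []
    for table in tables:
        # Backtick-quote each part so names such as `date` are safe to use.
        quoted = ".".join("`{}`".format(part) for part in table.split("."))
        statement = "COMPUTE STATS {}".format(quoted)
        cursor.execute(statement)
        issued.append(statement)
    return issued


class RecordingCursor:
    """Hypothetical stand-in that records statements instead of running them."""

    def __init__(self):
        self.statements = []

    def execute(self, statement):
        self.statements.append(statement)


# Demo with the stand-in cursor; the table names are placeholders.
cursor = RecordingCursor()
issued = compute_stats_for(cursor, ["default.foo2", "sales.orders"])
```

After the stats are in place, comparing the plans on the two clusters should show whether the join order changed.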
						
					
12-08-2018 10:23 AM

There's no relation between resource reservations and query states.

There are probably two things going on:

1. You're getting different query plans on the two clusters: either the data or the table definitions are different, or the stats are missing or out of date on one cluster.
2. The query is being kept running because the output rows are not all fetched by Hue. This can happen if the query returns more than a page of rows and the user does not scroll through the whole result set. The issue is that Hue only fetches results on demand, and Impala keeps the query running until the last row is fetched by Hue.

How many rows are being returned from the query? We're looking at making this more robust; the scenario is avoidable.

As a mitigation we usually suggest setting an "idle query timeout" in Cloudera Manager to automatically cancel queries that have been hanging around for a while with no client activity.

Edit: second observation was wrong. See my next post.
						
					
11-28-2018 10:31 AM

@alexmc6 I think there's (understandably) some misunderstanding of what the different mechanisms there do.

Memory estimates only play a role if you set "Max Memory" and leave "Default Query Memory Limit" unset or set to 0. I always recommend against that mode for exactly the reason you mentioned.
						
					