Member since 05-02-2017
360 Posts
65 Kudos Received
22 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 15704 | 02-20-2018 12:33 PM |
| | 2045 | 02-19-2018 05:12 AM |
| | 2380 | 12-28-2017 06:13 AM |
| | 7925 | 09-28-2017 09:25 AM |
| | 13508 | 09-25-2017 11:19 AM |
			
    
	
		
		
12-19-2017 11:21 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
I have an ORC table in Hive and I'm using Spark SQL to query it. The table is partitioned and has two partitions: one contains data and the other is empty. I understand there is a known bug in Spark when handling zero-byte files in Hive tables stored as ORC. Is there any workaround for this issue? Upgrading Spark is not an option.
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, Apache Hive, Apache Spark
			
    
	
		
		
12-15-2017 05:06 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@Ravi teja In my experience, GROUP BY is faster than DISTINCT. GROUP BY essentially partitions the rows by key, which MapReduce handles with ease. I would recommend going with GROUP BY.
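To illustrate why the swap is safe: the two forms return the same set of rows. Below is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for Hive. This is an assumption made purely for illustration: the table name `events` and the column `user_id` are hypothetical, and SQLite says nothing about how Hive or MapReduce would perform.

```python
import sqlite3


def distinct_and_group_by(values):
    """Run the DISTINCT and GROUP BY forms of the same query on one column.

    Returns a tuple (distinct_rows, grouped_rows). The table `events` and
    column `user_id` are hypothetical names used only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (user_id TEXT)")
        conn.executemany(
            "INSERT INTO events VALUES (?)", [(v,) for v in values]
        )
        # Form 1: deduplicate with DISTINCT.
        distinct_rows = conn.execute(
            "SELECT DISTINCT user_id FROM events ORDER BY user_id"
        ).fetchall()
        # Form 2: deduplicate by grouping on the same column.
        grouped_rows = conn.execute(
            "SELECT user_id FROM events GROUP BY user_id ORDER BY user_id"
        ).fetchall()
    finally:
        conn.close()
    return distinct_rows, grouped_rows
```

Calling `distinct_and_group_by(["a", "b", "a"])` gives two identical lists, so the choice between the forms comes down to the execution plan on your cluster, which is worth checking with EXPLAIN on your own data.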
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
11-16-2017 11:23 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
What does hadoop fs -test do? What other options can be used with -test, such as hadoop fs -test -d?
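For reference, here is a minimal sketch of how the -test checks could be scripted from Python. It assumes the hadoop client is on the PATH. The helper name `hdfs_test` and its injectable `run` parameter are hypothetical, added so the logic can be exercised without a cluster. The flags listed are the ones described in the Hadoop FileSystem shell documentation; which of them are available depends on your Hadoop version (newer releases also document -r and -w).

```python
import subprocess

# Checks described for `hadoop fs -test`; the command exits with 0 when
# the check holds and with a non-zero status otherwise.
TEST_FLAGS = {
    "-d": "path is a directory",
    "-e": "path exists",
    "-f": "path is a file",
    "-s": "path is not empty",
    "-z": "file is zero length",
}


def hdfs_test(flag, path, run=subprocess.run):
    """Return True if `hadoop fs -test <flag> <path>` exits with status 0.

    `run` defaults to subprocess.run; it is a parameter only so that a
    fake runner can be substituted (hypothetical convenience, not part
    of any Hadoop API).
    """
    if flag not in TEST_FLAGS:
        raise ValueError(f"unsupported -test flag: {flag}")
    result = run(["hadoop", "fs", "-test", flag, path])
    return result.returncode == 0
```

On a machine with a configured client, `hdfs_test("-d", "/some/path")` would report whether that path is a directory; in a shell script the same check is usually read from `$?` right after the command.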
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop
			
    
	
		
		
11-13-2017 06:06 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Thanks @Chris Cotter. I understand that for an external table the data is not managed by HCatalog. But the folder mapping is done properly, and I can even see the new partition value when I query the table. The only issue is that the partition folder name has not changed, and I understand the reason for that. How is it that querying the table shows the new partition value when the name of the underlying folder has not changed?
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
11-10-2017 10:03 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Thanks. I googled that already. I'm looking for a more detailed explanation.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
11-10-2017 09:39 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Could someone explain this parameter?
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, Apache Hive
			
    
	
		
		
11-10-2017 09:01 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
I have an external table stored as TEXTFILE and partitioned on load_date. I have inserted data for one partition, for example (load_date='2017-11-09'). I then renamed the partition using:
ALTER TABLE tbl_name PARTITION (load_date='2017-11-09') RENAME TO PARTITION (load_date='2017-11-10');
After this operation, querying the table shows the new partition value. However, HDFS still shows the old partition path: the directory is still /hive/warehouse/default/tbl_name/load_date=2017-11-09. Is this a known issue?
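One way to picture the behaviour described here: the partition value a query returns comes from the catalog entry, while the directory is just a location stored next to that entry. The toy model below is a sketch under that assumption, not Hive code. The class `ToyTable` and its methods are hypothetical, and the managed-table branch reflects the usual expectation that a managed table's directory follows the rename.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a metastore table entry (not a Hive API)."""

    root: str
    external: bool
    # Maps partition value -> directory recorded for that partition.
    partitions: dict = field(default_factory=dict)

    def add_partition(self, value):
        self.partitions[value] = f"{self.root}/load_date={value}"

    def rename_partition(self, old, new):
        location = self.partitions.pop(old)
        if not self.external:
            # Assumed managed-table behaviour: directory follows the new value.
            location = f"{self.root}/load_date={new}"
        # For an external table only the catalog key changes.
        self.partitions[new] = location

    def query(self):
        # load_date is read from the catalog key, never from the path name.
        return sorted(self.partitions.items())
```

With `external=True`, renaming '2017-11-09' to '2017-11-10' leaves `query()` reporting the new value paired with the old `load_date=2017-11-09` directory, which matches what is seen above. On a real cluster, DESCRIBE FORMATTED on the partition shows the location the metastore has recorded.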
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hive
			
    
	
		
		
11-08-2017 09:23 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@PremKumar Karunakaran In Spark you will not be able to modify the data. Its data is immutable and cannot be altered or modified. If you need to modify the DDL, that is also not supported in Spark, at least as of now. You will have to do it through the Hive CLI, not through Spark. Hope it helps!
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
11-07-2017 12:35 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
I have an external table created with partitions and buckets (256). Now I have to reduce the number of partitions. Since it is an external table, I can drop and recreate it without affecting the data. The underlying HDFS location will still have 256 files under each partition. Is there any way to make the number of files match the number of buckets used in the new DDL? I know I can achieve this by reprocessing the data, but I wanted to know whether it can be done by enabling a property.
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, Apache Hive
 
        













