06-13-2018 12:41 PM
I ran some experiments on Hive. It looks like no matter what value I set for the block size, Hive always produces Parquet files of the same size, and there are a lot of small files. Here are the table properties. Can anyone help me? Thanks in advance!

SET hive.exec.dynamic.partition.mode=nonstrict;
SET parquet.column.index.access=true;
SET hive.merge.mapredfiles=true;
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET parquet.compression=SNAPPY;
SET dfs.block.size=445644800;
SET parquet.block.size=445644800;
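A side note, not from the original post: parquet.block.size controls the Parquet row-group size within a file, and dfs.block.size the HDFS block size; neither caps how many files Hive writes, which is driven by the number of final map/reduce tasks. A minimal sketch of the merge settings that usually govern the small-file merge pass (the threshold values below are illustrative assumptions, not recommendations from this thread):

SET hive.merge.mapfiles=true;                 -- merge outputs of map-only jobs
SET hive.merge.mapredfiles=true;              -- merge outputs of map-reduce jobs
SET hive.merge.smallfiles.avgsize=134217728;  -- run a merge pass when the average output file is under 128 MB
SET hive.merge.size.per.task=445644800;       -- target size of each merged file (~425 MB)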
						
					
Labels:
- Apache Hive
- Apache Spark