Member since 03-16-2017

3 Posts
1 Kudos Received
1 Solution

My Accepted Solutions

| Title | Views | Posted | 
|---|---|---|
| | 2150 | 03-30-2017 05:23 AM |

04-01-2017 04:23 PM

@Constantin Stanca Any thoughts on this?
    
	
		
		
03-30-2017 05:23 AM

1) Please define the actual size and performance numbers that you encountered.

Ans.

| Data Volume | Time elapsed for TEZ | Average Time for TEZ | Time elapsed for MR | Average Time for MR |
|---|---|---|---|---|
| 1900 records | 46.350 secs | 41.626 secs | 63.666 secs | 56.176 secs |
| | 40.341 secs | | 55.633 secs | |
| | 38.189 secs | | 49.230 secs | |
| 91914 records | 32.049 secs | 32.097 secs | 52.920 secs | 51.236 secs |
| | 32.088 secs | | 49.030 secs | |
| | 32.156 secs | | 51.760 secs | |
| 993168 records | 850.01 secs | 861.781 secs | 611.625 secs | 635.781 secs |
| | 865.230 secs | | 691.751 secs | |
| | 872.110 secs | | 672.285 secs | |
| | 868.995 secs | | 567.466 secs | |
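
To make the comparison easier to read, here is a minimal Python sketch that recomputes the per-volume averages and the TEZ/MR ratio from the individual run times listed above. The dictionary layout and the `summarize` helper are my own illustration, not part of any tool. The means are recomputed from the listed runs, so they can differ slightly from the averages posted in the table.

```python
from statistics import mean

# Elapsed times in seconds, copied from the individual runs in the table.
runs = {
    1900: {"tez": [46.350, 40.341, 38.189], "mr": [63.666, 55.633, 49.230]},
    91914: {"tez": [32.049, 32.088, 32.156], "mr": [52.920, 49.030, 51.760]},
    993168: {
        "tez": [850.01, 865.230, 872.110, 868.995],
        "mr": [611.625, 691.751, 672.285, 567.466],
    },
}


def summarize(runs):
    """Return {records: (tez_avg, mr_avg, tez_over_mr)} for each data volume."""
    summary = {}
    for records, times in runs.items():
        tez_avg = mean(times["tez"])
        mr_avg = mean(times["mr"])
        summary[records] = (tez_avg, mr_avg, tez_avg / mr_avg)
    return summary


if __name__ == "__main__":
    for records, (tez_avg, mr_avg, ratio) in summarize(runs).items():
        # A ratio below 1 means TEZ finished faster on average.
        faster = "TEZ" if ratio < 1 else "MR"
        print(f"{records:>7} records: TEZ {tez_avg:8.3f}s  MR {mr_avg:8.3f}s  "
              f"TEZ/MR {ratio:.2f}  ({faster} faster)")
```

A TEZ/MR ratio below 1 on the two smaller volumes and above 1 on the largest one is the pattern this thread is asking about.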
   
  
2) Clarify what test beds you are referring to and how you used them.

Ans. In the statistics table above:
- Operation 1 (1900 records) creates a lateral view on a small data set.
- Operation 2 (91914 records) joins 3 tables of intermediate data volume.
- Operation 3 (993168 records) joins 4 tables of large data volume in an inner query, with an aggregation on top of that.

3) Clarify what type of test case you execute. It is important to clarify because some tests can be disk I/O intensive, others can be memory intensive.

Ans.
1. The above jobs ran in parallel, i.e. 10 jobs in parallel in TEZ mode and 10 jobs in parallel in MR mode.
2. The above results are the output of multiple test iterations, performed on different test beds.
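
For anyone who wants to reproduce the "10 jobs in parallel" setup, a rough Python sketch of such a timing harness might look like the following. This is only an illustration under assumptions: `timed`, `run_parallel` and `fake_query` are hypothetical names, and `fake_query` is a placeholder that sleeps instead of submitting a real query.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(job):
    """Run one job (any zero-argument callable) and return its elapsed seconds."""
    start = time.perf_counter()
    job()
    return time.perf_counter() - start


def run_parallel(job, copies=10):
    """Start `copies` instances of `job` together and return each elapsed time."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        futures = [pool.submit(timed, job) for _ in range(copies)]
        return [f.result() for f in futures]


def fake_query():
    # Hypothetical placeholder: replace with a call that submits the real query.
    time.sleep(0.05)


if __name__ == "__main__":
    elapsed = run_parallel(fake_query, copies=10)
    print(f"avg {sum(elapsed) / len(elapsed):.3f}s over {len(elapsed)} jobs")
```

Running the same harness once per engine keeps the measurement method identical, so the only difference between the two sets of numbers is the execution mode.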
						
					
03-16-2017 08:45 AM
1 Kudo

We are doing some analysis on MR vs TEZ. TEZ does better than MR on small and medium data volumes, but MR beats TEZ on large volumes. We have seen this multiple times on different test beds. Please advise.
						
					
- Labels:
 - Apache Hadoop
 - Apache Tez