Member since 05-23-2017
28 Posts
10 Kudos Received
1 Solution

My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
|  | 31251 | 06-16-2017 12:14 PM |

06-19-2017 07:20 AM

@Sonny Heer Either way, you will need a subquery, whether you use GROUP BY or a windowing function. And yes, windowing should be faster than GROUP BY here. The simple reasoning: say you have 1 million rows. GROUP BY will first sort the data and then aggregate by the key, whereas windowing will just sort within each key and give you the first entry.

However, if your dataset is not that large, you can live with GROUP BY. It will hardly make any difference.

Can you please run both queries (windowing and GROUP BY) and check a couple of things:

1. The number of map tasks and reduce tasks for each query.
2. Whether the time difference between the two queries is more than 2 minutes, or whether they take almost the same time.
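For comparison, here is a minimal sketch of the two query shapes being discussed. It uses Python's built-in sqlite3 module rather than Hive, purely so it can be run anywhere; the table `events` and the columns `join_key` and `val` are hypothetical names, and window functions need SQLite 3.25 or newer. It only shows that both forms return one row per key. It says nothing about which is faster on Hive, which still has to be measured on the real data.

```python
import sqlite3

# In-memory database with a few duplicate keys (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (join_key INTEGER, val TEXT);
    INSERT INTO events VALUES (1, 'b'), (1, 'a'), (2, 'c'), (2, 'd'), (3, 'e');
""")

# GROUP BY form: one row per key, picking the smallest val.
group_by_rows = conn.execute("""
    SELECT join_key, MIN(val) AS val
    FROM events
    GROUP BY join_key
    ORDER BY join_key
""").fetchall()

# Windowing form: number the rows within each key and keep the first one.
window_rows = conn.execute("""
    SELECT join_key, val
    FROM (
        SELECT join_key, val,
               ROW_NUMBER() OVER (PARTITION BY join_key ORDER BY val) AS row_num
        FROM events
    ) AS ranked
    WHERE row_num = 1
    ORDER BY join_key
""").fetchall()

print(group_by_rows)
print(window_rows)
```

Note that the windowing form can return any column of the chosen row, while the GROUP BY form needs an aggregate per column.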
						
					

06-16-2017 12:14 PM

Hi @Sonny Heer,

As I understand your question, you have multiple tables, say A, B, C, D, etc., and you are running a query that joins A left join B left join C, etc., where tables B, C, D contain multiple entries for a key that matches A.

If this is the case, I would suggest using a windowing function:

select A.a, B.b, C.c
from A left join
(select * from
  (select B.b, B.key, ROW_NUMBER() OVER (partition by key) as row_num from B) t
 where row_num = 1) B
on A.key = B.key
-- and so on for C, D, etc.

Try this out and let me know if it was helpful.

Cheers,
Sagar
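
To illustrate why the row_num = 1 filter matters, here is a small runnable sketch using Python's built-in sqlite3 module instead of Hive. The sample rows, the column name `join_key`, and the `ORDER BY b` inside the window are assumptions added for the example (the ordering only makes the chosen row deterministic), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (join_key INTEGER, a TEXT);
    CREATE TABLE B (join_key INTEGER, b TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    -- Key 1 appears twice in B, key 3 not at all.
    INSERT INTO B VALUES (1, 'b1'), (1, 'b1-dup'), (2, 'b2');
""")

# A plain left join repeats the A row for every matching B row.
plain_join = conn.execute(
    "SELECT A.join_key, A.a, B.b FROM A LEFT JOIN B ON A.join_key = B.join_key"
).fetchall()

# Keeping only row_num = 1 per key in B gives at most one match per A row.
dedup_join = conn.execute("""
    SELECT A.join_key, A.a, B1.b
    FROM A
    LEFT JOIN (
        SELECT join_key, b
        FROM (
            SELECT join_key, b,
                   ROW_NUMBER() OVER (PARTITION BY join_key ORDER BY b) AS row_num
            FROM B
        ) AS ranked
        WHERE row_num = 1
    ) AS B1
    ON A.join_key = B1.join_key
    ORDER BY A.join_key
""").fetchall()

print(len(plain_join), "rows from the plain join")
print(dedup_join)
```

The plain join returns more rows than A contains, while the deduplicated join returns exactly one row per row of A.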
						
					

06-16-2017 08:07 AM

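Hi @Simran Kaur

If you want to use this within the script, you can do the following:

set hivevar:DATE=current_date;
INSERT OVERWRITE DIRECTORY '/user/xyz/reports/oos_table_sales/${DATE}' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM outputs.oos_table_sale;

Note that Hive substitutes variables as plain text, so ${DATE} above expands to the literal string current_date rather than today's date. To get an actual date in the path, pass the value in from the shell when you launch the script, for example (your_script.hql is a placeholder name):

hive --hivevar DATE=$(date +%Y-%m-%d) -f your_script.hql

Cheers,
Sagar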
						
					

06-12-2017 09:35 PM

Hi @rama

The query below should run faster. It selects only the columns of dropme_master_5 (a.*), so the column count matches the target table:

insert into table dropme_master_6
select a.* from dropme_master_5 a
left outer join dropme_master_6 b
on a.consumer_sequence_id = b.consumer_sequence_id
where b.consumer_sequence_id is null;
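
As a rough illustration of what this left join plus IS NULL filter (an anti-join) does, here is a small sketch using Python's built-in sqlite3 module rather than Hive. The table names `master_5` and `master_6`, the `name` column, and the sample rows are hypothetical stand-ins for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_5 (consumer_sequence_id INTEGER, name TEXT);
    CREATE TABLE master_6 (consumer_sequence_id INTEGER, name TEXT);
    INSERT INTO master_5 VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO master_6 VALUES (1, 'x');
""")

def copy_missing_rows(connection):
    """Insert rows of master_5 whose id is not yet present in master_6."""
    connection.execute("""
        INSERT INTO master_6
        SELECT a.consumer_sequence_id, a.name
        FROM master_5 a
        LEFT OUTER JOIN master_6 b
          ON a.consumer_sequence_id = b.consumer_sequence_id
        WHERE b.consumer_sequence_id IS NULL
    """)

def fetch_target(connection):
    return connection.execute(
        "SELECT consumer_sequence_id, name FROM master_6 "
        "ORDER BY consumer_sequence_id"
    ).fetchall()

copy_missing_rows(conn)
after_first_run = fetch_target(conn)

# Running it again finds nothing left to copy, so no duplicates appear.
copy_missing_rows(conn)
after_second_run = fetch_target(conn)

print(after_first_run)
print(after_second_run)
```

Only the rows missing from the target are inserted, and a second run leaves the target unchanged.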
						
					

06-12-2017 01:48 PM

Hi @rama,

Please change your query to this:

insert into table dropme_master_6
select a.* from dropme_master_5 a
left outer join dropme_master_6 b
on a.consumer_sequence_id = b.consumer_sequence_id
where b.consumer_sequence_id is null;

I am pretty confident this will improve your performance.

Please let me know if it works and give me a thumbs up. 🙂

Regards,
Sagar Morakhia
						
					

06-12-2017 01:48 PM

@Ravi Chinni

insert into some_test_table
select
  'c1_val',
  named_struct('c2_a', array(cast(null as string)), 'c2_c', cast(null as string)),
  array(named_struct('c3_a', cast(null as string), 'c3_b', cast(null as string)))
from z_dummy;

This will work for you.

Needless to say, please upvote if the answer was useful. 🙂
						
					

06-11-2017 08:48 AM

Can you provide a sample input entry?
						
					

06-10-2017 09:53 AM

Hi @rama,

First of all, I would suggest killing such jobs after 2 to 2.5 hours, especially when your job finishes in half an hour on a normal day.

One probable cause is that some other job is using 90% or more of the CPU and slowing your job down.

If you can share your entire query, I may be able to suggest a few set parameters that will help the query run faster.

Cheers,
Sagar
						
					

06-10-2017 09:48 AM

							 Perfect answer! 
						
					

05-25-2017 12:49 PM

Hi,

Nice article. I found a faster way of doing the same thing in the official Oozie documentation:

https://oozie.apache.org/docs/4.1.0/DG_WorkflowReRun.html

Rerunning Oozie jobs is generally an ad hoc task, and you may not want to create an XML file just to rerun a job.

The command line goes as below:

oozie job -oozie http://localhost:11000/oozie -rerun 14-20090525161321-oozie-joe -Doozie.wf.rerun.skip.nodes=<>

For example:

oozie job -oozie http://localhost:11000/oozie -rerun 14-20090525161321-oozie-joe -Doozie.wf.rerun.skip.nodes=action1,action2,action3

where

http://localhost:11000/oozie --> the URL where Oozie is running
14-20090525161321-oozie-joe --> your Oozie job ID
action1,action2,action3 --> the steps that you want to skip

It eventually does the same thing as described in the article, but this way we don't have to create the config file.

Cheers,
Sagar
						
					