Support Questions
Find answers, ask questions, and share your expertise

Spark Sql Optimization

Explorer

Hi Team

 

I have the below teradata code where it processes trillions of records and i would ideally want to break this down to a more simpler query or number of queries that will allow the repetetive condition checks in case statements to be simplified and helps in performance.Any advise will be really helpful

Here if you can see the date - interval condition for a similar let of columns are repeated while the sum of the computations is carried out.What would be the best way to tune the query so that the aaggregations are done in an optimal manner

 

 

select a.cust_id,a.cust_dt,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('6') then c.some_value else 0 end) column 1,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('7') then c.some_value else 0 end) column 2,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '14' day) ) and c.ea_flg in ('8') then c.some_value else 0 end) column 3,






sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('5') then c.some_value else 0 end) column 4,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('34') then c.some_value else 0 end) column 5,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '7' day) ) and c.ea_flg in ('56') then c.some_value else 0 end) column 6,






sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('76') then c.some_value else 0 end) column 7,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('23') then c.some_value else 0 end) column 8,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '30' day) ) and c.ea_flg in ('54') then c.some_value else 0 end) column 9,










sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('78') then c.some_value else 0 end) column 10,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('99') then c.some_value else 0 end) column 11,


sum(case when a.cust_dt < b.cust_dt and a.cust_dt >= (b.cust_dt - interval '60' day) ) and c.ea_flg in ('21') then c.some_value else 0 end) column 12






from 


table_1 a,table_2 b,table_3 c

join conditions on a,b,c  
where a.cust_dt = sysdate -1





group by 1,2;

 

0 REPLIES 0