Support Questions
Find answers, ask questions, and share your expertise

Filtering my data when write a hive code

New Member

I have a hive table that has 463M records.  I am joining this table with a smaller one using the left join(left is the big table) When filtering using the where clause the it drops down to 4M which is wrong.  When I use the AND instead of where I get 463M which is wrong.  Is there a different way on doing it?  The correct answer should be 171M.

here is my code:

drop table if exists step_1;
CREATE TABLE STEP_1 AS
   SELECT t1.Sales_Account_ID_INT,
          t2.SALES_ACCOUNT_GROUP_NAME,
          t1.CR_Site_ID,
          t1.Technology,
          t1.VMS_TOP_CD,
          t1.Primary_SAV_Flag,
          t2.L1_SALES_TERRITORY_DESCR,
          t2.L2_SALES_TERRITORY_DESCR,
          t2.L3_SALES_TERRITORY_DESCR,
          t2.L4_SALES_TERRITORY_DESCR,
          t2.L5_SALES_TERRITORY_DESCR,
          t2.L6_SALES_TERRITORY_DESCR,
          t1.AdjWallet,
          cast(FY as int) AS FISCAL_YEAR,
         
            (CASE 
               WHEN  t1.Bookings< 0   or   t1.Bookings is null
               THEN 0
               ELSE t1.Bookings
            END) AS Bookings
      FROM UNIF_WALL_SALESVIEW_Q4FY16 t1
           LEFT JOIN SAV_ATTRIBUTES_Q4FY16 t2 ON (t1.Sales_Account_ID_INT = t2.SALES_ACCOUNT_ID_INT)
      (Here I tried the Where and AND)  t1.Primary_SAV_Flag <> 'N' AND t1.Sales_Account_ID_INT IS NOT NULL AND t1.FY IN
           (
           '2012',
           '2013',
           '2014',
           '2015',
           '2016'
           ) AND t1.Bookings <= t1.AdjWallet and t1.AdjWallet >=0
      ORDER BY t1.Sales_Account_ID_INT,
               Fiscal_Year;