Created 09-29-2015 04:44 PM
Can anyone help converting this Teradata SQL to something that will run in Hive? Currently it fails with the error "SemanticException Line 0:-1 Both left and right aliases encountered in JOIN 'Week_End'" and my understanding of SQL is too limited to understand how to refactor the query.
(SELECT S.Calendar_Date Calendar_Date ,S.Foll_Monday Foll_Monday ,W.Week_Id Snapshot_Id FROM ( SELECT calendar_date ,CASE WHEN day_of_week = 1 THEN calendar_date + 1 WHEN day_of_week = 3 THEN calendar_date + 6 WHEN day_of_week = 4 THEN calendar_date + 5 WHEN day_of_week = 5 THEN calendar_date + 4 WHEN day_of_week = 6 THEN calendar_date + 3 WHEN day_of_week = 7 THEN calendar_date + 2 ELSE calendar_date END AS Foll_Monday FROM td_etl.calendar ) S LEFT JOIN td_etl.dim_cal W ON S.Foll_Monday BETWEEN W.Week_Start AND W.Week_End) FM
Created 09-29-2015 05:48 PM
Without knowing your schema, here is a rough take:
(SELECT S.Calendar_Date Calendar_Date ,S.Foll_Monday Foll_Monday ,W.Week_Id Snapshot_Id FROM ( SELECT calendar_date, ( CASE day_of_week WHEN 1 THEN calendar_date + 1 WHEN 3 THEN calendar_date + 6 WHEN 4 THEN calendar_date + 5 WHEN 5 THEN calendar_date + 4 WHEN 6 THEN calendar_date + 3 WHEN 7 THEN calendar_date + 2 ELSE calendar_date END ) Foll_Monday FROM td_etl.calendar ) S LEFT JOIN td_etl.dim_cal W ON S.Foll_Monday BETWEEN W.Week_Start AND W.Week_End) FM
Created 09-29-2015 05:48 PM
Without knowing your schema, here is a rough take:
(SELECT S.Calendar_Date Calendar_Date ,S.Foll_Monday Foll_Monday ,W.Week_Id Snapshot_Id FROM ( SELECT calendar_date, ( CASE day_of_week WHEN 1 THEN calendar_date + 1 WHEN 3 THEN calendar_date + 6 WHEN 4 THEN calendar_date + 5 WHEN 5 THEN calendar_date + 4 WHEN 6 THEN calendar_date + 3 WHEN 7 THEN calendar_date + 2 ELSE calendar_date END ) Foll_Monday FROM td_etl.calendar ) S LEFT JOIN td_etl.dim_cal W ON S.Foll_Monday BETWEEN W.Week_Start AND W.Week_End) FM
Created 09-30-2015 09:48 AM
I don't think this left join with a non equi condition will work with Hive.
Check the documentation I wrote for a workaround:
https://wiki.hortonworks.com/display/IK/Hive+non+equi+left+join
Created 09-30-2015 11:31 AM
agree with Sourygna. Deepesh, did you try this.? I dont think hive currently supports non equi joins.
Sourygna workaround should work here.
Created 09-30-2015 03:15 PM
No, I did not test this. I basically just converted the case statement. Sourygna's point is right, the non-equi join will have to be transformed in order for this to work in Hive.
Created 10-01-2015 05:18 PM
In this case, if we look at the <intent> of the query it is possible that it can be rewritten to not need a non-equi-join.
If you can make this Assumption about the Date Dimension:
A Date Value in the date dimension (such as 'Foll_Monday' in the query) will always be between the Week_Start and the Week_End values.
Then you should be able to rewrite the query to not use a non-equi join. The second table in the join would become:
(SELECT S.Calendar_DateCalendar_Date ,S.Foll_MondayFoll_Monday ,W.Week_Id, Snapshot_Id FROM ( SELECT calendar_date, ( CASE day_of_week WHEN 1 THEN calendar_date +1 WHEN 3 THEN calendar_date +6 WHEN 4 THEN calendar_date +5 WHEN 5 THEN calendar_date +4 WHEN 6 THEN calendar_date +3 WHEN 7 THEN calendar_date +2 ELSE calendar_date END ) Foll_Monday FROM td_etl.calendar) S inner join ( select <NaturalDateValue>, Week_Id, week_Start, Week_End from td_etl.dim_cal ) W on S.foll_monday = W.<NaturalDateValue>
Give that a try.