SELECT dim.Date1, * from fact join Dim on Dim.Date1 between fact.Begin_date and fact.End_date
I have this sql query and I need to convert it to hive query. I know that non equi joins does not work in hive and tried this but it is not working.
SELECT dim.Date1, * from Dim,fact where dim.date1 between fact.Begin_date and fact.End_date
As @Sunile Manjee also suggested, you could give cross-join a shot. Your query would look something like this:
SELECT dim.Date1, fact.* FROM fact join dim WHERE dim.Date1 >= fact.Begin_date and dim.Date1 <= fact.End_date;
This is a cartesian product and it could take serious resources. You need to understand the data in both tables and estimate some parameters that will help your query parallelism. Look here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ref-ffe....
Before 0.10 when cross-join was introduced, I used a different workaround. Here is an example. I did not write it to your case exactly. Feel free to adapt to your needs. You could also create temporary tables. _serial and _inner tables below will be out of your dim table which supposedly is small when compared with fact table. You get the point.
1. Create a copy of the left table with added serial row numbers:
CREATE TABLE OrderLineItem_serial AS SELECT ROW_NUMBER() OVER() AS serial, * FROM OrderLineItem;
2. Do an inner join:
CREATE TABLE OrderLineItem_inner AS SELECT * FROM OrderLineItem_serial li JOIN ProductPrice p on p.ProductID = li.ProductID WHERE OrderDate BETWEEN startDate AND EndDate;
3. Left join by serial:
SELECT * FROM OrderLineItem_serial li LEFT OUTER JOIN OrderLineItem_inner i on li.serial = i.serial;
If you find your solution in responses provided, please vote and accept the best answer.