Member since
02-13-2020
2
Posts
0
Kudos Received
0
Solutions
02-13-2020
11:38 PM
The table-b has only one row with a value 1. My intention is to pass this value in TABLE-b.value, and subtract the number of days in the query. If I pass 1 then it should subtract one day from the TABLE-a.date_col. If I pass 2 then it should give me records that are 2 days lesser than TABLE-a.date_col. SELECT * FROM TABLE-a, TABLE-b WHERE 1=1 AND TABLE-a.date_col <= from_unixtime(unix_timestamp()-(TABLE-b.value*24*60*60)))
... View more
02-13-2020
05:17 AM
Hi,
I have 2 tables in Hive. Table-a has millions of rows and many columns. Table-b has just few rows with two columns (key as tablename, value as 1).
Now I do the following,
1) Cross join table a and b. 2) Based on table A's on of the date column, we need to pass the table b's column "value" as "1" or "2" and bring the table A's date column less than one or two days.
e.g.
select * from table-a, table-b where 1=1 and table-a.date_col <= from_unixtime(unix_timestamp()-(table-b.value*24*60*60))) - while the table-b.value is 1 then it subtracts one day or any number of hours.
Basically, we need to pass the table-b value as parameter according to our requirements. This cross join works for smaller tables. When the data of table-a is huge, it creates performance or memory issue.
Please advise on any solution to fine tune this approach or achieve the result using any alternate approach and overcome this issue in HIVE.
Thanks.
... View more
Labels:
- Labels:
-
Apache Hive