Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to append the preceding rows for the same key in hive for different dates in hive

How to append the preceding rows for the same key in hive for different dates in hive

Explorer

I have a scenario like below. can anyone please provide some info.

I have the below table. i tried doing this using join with lag and lead function but its not giving the expected result. I'm just not getting any clue how to implement the solution for this.

name date amount
abc 04/06/2018 100
abc 04/06/2018 200
abc 04/13/2018 300
now the output i needed is below.

name date amount
abc 04/06/2018 100
abc 04/06/2018 200
abc 04/13/2018 100
abc 04/13/2018 200
abc 04/13/2018 300
so here 300 is the new value for 04/13/2018 and 100,200 from 04/06/2018 will also shown for 04/13/2018. Do we have any way to do this in hive. Any help will be greatly appreciated.

4 REPLIES 4
Highlighted

Re: How to append the preceding rows for the same key in hive for different dates in hive

Hi @johny gate

Below query works but its kind of dirty. Hope it Helps!

Select a.*,tblb.col3 from a  left join  (select*,lag(col3)over (partition by col1 order by col2)as lag_val from a) tblb  on tbl b.col1=a.col1 and a.col2=tblb.lag_val
Highlighted

Re: How to append the preceding rows for the same key in hive for different dates in hive

Explorer

Hi @Bala Vignesh N V

Thanks for your inputs. can you please explain what are these cols... col1,col2,col3.. are you representing name,date and amount as col1,col2,col3 here. Also will this work for all the next dates because for any next date all the amount value from prior dates should present as one row for one amount for the next date.

Highlighted

Re: How to append the preceding rows for the same key in hive for different dates in hive

@johny gate

yes col1 are column names. in your case name,date & amount. Yes it should work for the scenario.

Highlighted

Re: How to append the preceding rows for the same key in hive for different dates in hive

Explorer

Its still not giving me the result as expected. It has to carry over the data from the last dates to the next date with one row per amount. Also your join condition a.col2=tblb.lag_val here col2 is date col and lag_val is calculated on amount col how this join condition will match a.date = tblb.lag_val(calculated on amount)

Don't have an account?
Coming from Hortonworks? Activate your account here