Created 05-15-2018 03:57 PM
Hi Folks out there,
I have a scenario where I need to get only the latest record per id from a Hive table, based on a timestamp column.
I am looking for the best approach to do this. My data is in a Hive internal table stored as Parquet files. Similar to this
I am using Spark with Hive.
Thanks,
Created 05-15-2018 04:09 PM
You can use an analytic (window) function such as row_number() to get the latest record per id from the Hive table: partition by id, order by the timestamp descending, and keep only the rows numbered 1.
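A minimal sketch of the row_number() idea, in case it helps. The table name `events` and the columns `id`, `ts`, and `payload` are hypothetical, since the real schema is not shown in the question. To keep the example self-contained it runs the query against an in-memory SQLite database (window functions need SQLite 3.25 or newer); the SELECT itself is plain SQL and should carry over to Hive or Spark SQL with little or no change, but treat that as an assumption to verify on your cluster.

```python
import sqlite3

# Hypothetical table and column names: events(id, ts, payload).
# The query numbers the rows of each id from newest to oldest and keeps row 1.
LATEST_PER_ID_SQL = """
SELECT id, ts, payload
FROM (
    SELECT id, ts, payload,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY ts DESC) AS rn
    FROM events
) ranked
WHERE rn = 1
ORDER BY id
"""


def latest_per_id(rows):
    """Load rows into an in-memory SQLite table and return the newest row per id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (id INTEGER, ts TEXT, payload TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(LATEST_PER_ID_SQL).fetchall()
    finally:
        conn.close()


# Made-up sample rows, only to exercise the query.
sample = [
    (1, "2018-05-14 10:00:00", "old"),
    (1, "2018-05-15 09:30:00", "new"),
    (2, "2018-05-13 08:00:00", "only"),
]
result = latest_per_id(sample)
```

If two rows of the same id share the same timestamp, row_number() still returns exactly one of them, but which one is arbitrary unless you add a tie-breaking column to the ORDER BY.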
Created 05-16-2018 08:16 AM
Thanks for your answer. As I mentioned, there are multiple ways to do this; however, I am looking for the best approach from a performance standpoint.
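One alternative that is often compared against the row_number() approach is to compute MAX(timestamp) per id and join that back to the table. The sketch below uses the same hypothetical names (`events`, `id`, `ts`, `payload`) and again runs on in-memory SQLite only so that it is self-contained. Whether it is actually faster on Hive or Spark is not something this example can show: it scans the table twice, but the GROUP BY can usually be partially aggregated before the shuffle, while the window version has to sort each id's rows. Comparing the EXPLAIN output and timings of both queries on the real table is the safest way to decide.

```python
import sqlite3

# Hypothetical table and column names: events(id, ts, payload).
# Find the newest timestamp per id, then join back to fetch the full row.
MAX_TS_JOIN_SQL = """
SELECT e.id, e.ts, e.payload
FROM events e
JOIN (
    SELECT id, MAX(ts) AS max_ts
    FROM events
    GROUP BY id
) m
  ON e.id = m.id AND e.ts = m.max_ts
ORDER BY e.id
"""


def latest_by_max_join(rows):
    """Return the row(s) carrying the newest timestamp for each id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (id INTEGER, ts TEXT, payload TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(MAX_TS_JOIN_SQL).fetchall()
    finally:
        conn.close()


# Made-up sample rows, only to exercise the query.
sample = [
    (1, "2018-05-14 10:00:00", "old"),
    (1, "2018-05-15 09:30:00", "new"),
    (2, "2018-05-13 08:00:00", "only"),
]
joined = latest_by_max_join(sample)
```

Note the behavioural difference: if several rows of one id share the maximum timestamp, this query returns all of them, whereas row_number() returns exactly one.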