Hi folks,
I have a scenario where I need to get only the latest record per id from a Hive table, based on a timestamp column.
I am looking for the best approach to do it. My data is in a Hive internal table stored as Parquet files. Similar to this
With the help of analytic (window) functions such as ROW_NUMBER, you can get the latest updated values from the Hive table.
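To make the ROW_NUMBER idea concrete, here is a minimal sketch. The table name `events` and the columns `id`, `val`, and `updated_at` are made up for illustration, since the actual schema is not shown in the thread. The query is plain window-function SQL of the kind Hive also accepts, but it is run here against an in-memory SQLite database only so the example can be executed anywhere; it says nothing about how Hive would perform on Parquet data.

```python
import sqlite3

# Hypothetical schema: events(id, val, updated_at). Rank rows within each id
# by timestamp, newest first, then keep only the row ranked 1.
LATEST_PER_ID_SQL = """
SELECT id, val, updated_at
FROM (
    SELECT id, val, updated_at,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM events
) ranked
WHERE rn = 1
ORDER BY id
"""


def latest_per_id(rows):
    """Load rows into an in-memory table and return the newest row per id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (id INTEGER, val TEXT, updated_at TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(LATEST_PER_ID_SQL).fetchall()
    finally:
        conn.close()


# Made-up sample data; ISO-formatted timestamps sort correctly as text.
SAMPLE_ROWS = [
    (1, "a", "2020-01-01 10:00:00"),
    (1, "b", "2020-01-02 09:30:00"),
    (2, "c", "2020-01-01 08:00:00"),
    (2, "d", "2019-12-31 23:59:59"),
]

if __name__ == "__main__":
    for row in latest_per_id(SAMPLE_ROWS):
        print(row)
```

Note that if two rows for the same id share the same timestamp, ROW_NUMBER picks one of them arbitrarily, so you may want to add a tie-breaker column to the ORDER BY.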
Thanks for your answer. As I mentioned, there are multiple ways to do it; however, I am looking for the best way from a performance standpoint.