08-09-2017 05:05 AM - edited 08-09-2017 05:55 AM
We are using Hive with HDFS and often need to know what a table looked like on a particular date.
So we keep directories like these:
Whenever we need to know what a table looked like on a particular date,
we change the table's location to the directory for that date and query the data.
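For reference, the current workflow can be sketched roughly as follows. This is a minimal sketch, not our exact script: the table name `my_table`, the base path `hdfs:///data/snapshots/my_table`, and the helper `relocate_statements` are hypothetical placeholders, and it assumes each snapshot sits in a subdirectory named after its date (yyyymmdd). It only builds the HiveQL text; running it against Hive is left out.

```python
def relocate_statements(table, base_dir, snapshot_date):
    """Build the HiveQL used to point `table` at one day's snapshot directory.

    All arguments are illustrative placeholders; the real table names and
    HDFS paths are not shown here.
    """
    # Assumed layout: <base_dir>/<yyyymmdd>
    location = f"{base_dir.rstrip('/')}/{snapshot_date}"
    return [
        # Repoint the table at the directory for the requested date ...
        f"ALTER TABLE {table} SET LOCATION '{location}';",
        # ... then read the table as it looked on that date.
        f"SELECT * FROM {table};",
    ]


if __name__ == "__main__":
    for stmt in relocate_statements(
        "my_table", "hdfs:///data/snapshots/my_table", "20170808"
    ):
        print(stmt)
```

This shows the limitation: the table can only point at one date at a time, so a report spanning several dates cannot be produced with a single query.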
Now, we need to have reports like:
date | columnFromReportForTheDay
How do we achieve this? Many of the tables we snapshot do not have a date column; they are simply truncated every day and reloaded with new data.
To make the question concrete: say there is a column percentage in this table.
I want to be able to build a final table:
date | percentage
20170808 | 50%
20170809 | 40%
from files that are stored like this: