
snapshots of tables by date

We are using Hive on HDFS and often need to know what a table looked like on a particular date.

So we keep one snapshot directory per table per date, like this:

 

/user/abc/ReportSnaps/tableName/{date}


Whenever we need to know what a table looked like on a particular date, we change the location of the table to the directory for that date and query the data.
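
To make the current workflow concrete, here is a minimal Python sketch that builds the HiveQL statement for repointing a table at one day's snapshot. The helper names are hypothetical, and the base directory and tableName are just the placeholders from the path above. Depending on the Hive version, SET LOCATION may need a full URI (with the hdfs:// scheme) rather than a bare path, so treat the output as illustrative.

```python
def snapshot_location(base_dir: str, table: str, date: str) -> str:
    """Build the snapshot directory for a table and a date (yyyymmdd)."""
    return f"{base_dir.rstrip('/')}/{table}/{date}"


def repoint_statement(table: str, location: str) -> str:
    """Build the HiveQL that points an existing table at another directory."""
    return f"ALTER TABLE {table} SET LOCATION '{location}'"


if __name__ == "__main__":
    # Placeholder values taken from the directory layout described above.
    loc = snapshot_location("/user/abc/ReportSnaps", "tableName", "20170808")
    print(repoint_statement("tableName", loc))
```

This shows why the approach only ever exposes one date at a time: the table has a single location, so a report that spans several dates cannot be produced with it.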

Now, we need to have reports like:

 

date | columnFromReportForTheDay

How do we achieve this? Many of the tables we snapshot do not have a date column; they are simply truncated every day and reloaded with new data.


My question is this: say there is a column named percentage in this table. I want to be able to build a final table like:

 

date           percentage
20170808        50%
20170809        40%

from files that are stored like this:

drwxr-xr-x - hue hue 0 2017-08-08 18:32 /user/hue/oostablecatalogue/20170808
drwxr-xr-x - tarun hue 0 2017-08-09 10:22 /user/hue/oostablecatalogue/20170809
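
One possible direction, offered only as a sketch: Hive lets a partition of an external table point at an arbitrary directory, so each dated directory could be registered as one partition and the date would then be queryable as a column. The Python below only generates the HiveQL. The table name oostablecatalogue_snaps and the partition column snap_date are hypothetical, and the sketch assumes that table was already created as an external table PARTITIONED BY (snap_date STRING) with the same column layout as the snapshot files.

```python
import posixpath


def add_partition_statements(table, snapshot_dirs, partition_col="snap_date"):
    """Build one ALTER TABLE ... ADD PARTITION statement per dated directory.

    Directories whose last path component is not an 8-digit date are skipped.
    """
    statements = []
    for path in sorted(snapshot_dirs):
        date = posixpath.basename(path.rstrip("/"))
        if not (len(date) == 8 and date.isdigit()):
            continue  # not a yyyymmdd snapshot directory
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({partition_col}='{date}') LOCATION '{path}'"
        )
    return statements


def report_query(table, column, partition_col="snap_date"):
    """Build the query for the date | column report across all snapshots."""
    return (
        f"SELECT {partition_col}, {column} "
        f"FROM {table} ORDER BY {partition_col}"
    )


if __name__ == "__main__":
    dirs = [
        "/user/hue/oostablecatalogue/20170808",
        "/user/hue/oostablecatalogue/20170809",
    ]
    # oostablecatalogue_snaps is a hypothetical external, partitioned table.
    for stmt in add_partition_statements("oostablecatalogue_snaps", dirs):
        print(stmt + ";")
    print(report_query("oostablecatalogue_snaps", "percentage") + ";")
```

With this layout the snapshot files stay where they are; only the metastore learns about each date, and a new day's directory needs one more ADD PARTITION statement.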