Support Questions
Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.

How can I delete/truncate data older than 7 days from a hive table.

Rising Star
There is no date field in this table but the "input_file_name" contains date & time.

Ex: p_unicast.binary.ate.dk.20170117223832.947.dat

hive> desc mi_data;
OK
input_file_name  string
p_id             string
r_name           string
i_name           string
o_tag            string
pre_value        bigint
crnt_value       bigint
pre_out_value    bigint

2 REPLIES 2

Expert Contributor

you have to partition the table by date, then you can delete older partitions.

Expert Contributor

Is this an external table? if it is an external table then write a perl/shell script that runs daily based on a pattern it will remove files older than a certain date.

If it is not, and if it is a managed table with ORC/Parquet format, then it depends on how you load the table. While loading the table you could setup logic to populate the date in the table partition by that date and as @Avijeet Dash mentioned drop the partition that is 7 days old.