Support Questions


My table is a normal table. When I run INSERT OVERWRITE, I find that the old data under the table's HDFS directory ends up in a folder such as base_0000003. Why isn't the old data moved to the HDFS Trash instead? I don't understand this behavior.



Expert Contributor


If the table has TBLPROPERTIES ("auto.purge"="true"), the previous data of the table is not moved to Trash when an INSERT OVERWRITE query is run against it. This behavior applies only to managed tables and is turned off when the "auto.purge" property is unset or set to false.
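To check whether this applies to your table, you can inspect and change the property from Hive. This is only a minimal sketch: my_table is a placeholder name, not something taken from the question, and it assumes a managed table.

```sql
-- my_table is a hypothetical table name used for illustration.
-- Show the current value of the auto.purge property, if it is set.
SHOW TBLPROPERTIES my_table("auto.purge");

-- Turn purging off so that data replaced by INSERT OVERWRITE
-- is moved to Trash rather than deleted immediately.
ALTER TABLE my_table SET TBLPROPERTIES ("auto.purge"="false");
```

Note that data can only land in Trash if HDFS trash is enabled on the cluster in the first place.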

Related JIRAs: HIVE-15880, HIVE-9118

Please accept the answer you found most useful.

Expert Contributor

From the directory listing, your table must have the "transactional"="true" property, i.e. it is an ACID table. That means INSERT OVERWRITE creates a new base_x directory and writes the result of the insert (the new data) into it. Any data that existed before remains in the table directory but is not visible to readers that start after the INSERT OVERWRITE has finished. The old data is physically removed once compaction runs over this table or partition.
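If you want to confirm this and clean up the old files sooner, something along these lines should work. It is a sketch under assumptions: my_table is a placeholder name, the table is assumed to be unpartitioned, and compaction is assumed to be enabled on the cluster.

```sql
-- my_table is a hypothetical table name used for illustration.
-- Look for transactional=true among the table parameters.
DESCRIBE FORMATTED my_table;
SHOW TBLPROPERTIES my_table("transactional");

-- Request a major compaction; the request is queued and runs asynchronously.
ALTER TABLE my_table COMPACT 'major';

-- Check the state of queued and running compactions.
SHOW COMPACTIONS;
```

For a partitioned table, the COMPACT request has to name the partition, e.g. ALTER TABLE my_table PARTITION (...) COMPACT 'major'.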
