- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Problem renaming Hive partitioned table and then dropping
- Labels:
-
Apache Hive
Created ‎10-11-2024 06:32 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I renamed a Hive partitioned table DB1.Tab1 to DB1.Tab1_old and then moved another table with the same name but from another database DB2.Tab1 to the first database DB1.Tab1, using the same rename command. It looks like the partitioned directories were merged because when I deleted the first renamed one (DB1.Tab1_old), the moved table partitions (DB1.Tab1) were deleted. I would like to understand why this happened. Regards
Created ‎10-16-2024 12:56 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello @Patriciabqc
It seems this was fixed in CDP-7.1.8 according to the TSB, please see the Knowledge base link below.
Created ‎10-11-2024 08:32 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Are you able to reproduce at will?
If so, provide the Cloudera Runtime version and the associated DDL/DML
Created ‎10-11-2024 08:42 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Patriciabqc
The behavior depends on what was the table type and it's table properties. The CDP version may be also important as there were some fixes around the rename functionality (whether HDFS location is changed) between the versions.
The behavior is different for managed and external tables. If you rename a managed table, it should rename the HDFS location too, so you should not end up in such situations. Because of that I assume these were external tables.
The table property 'TRANSLATED_TO_EXTERNAL'='TRUE' should make an external table behave like a managed table ("legacy CDH managed table" behavior), so when you rename the table the location is also changed, so probably we can rule out this too, but I mention this for clarity that this is also an option - and maybe a way to avoid this in the future.
Last but not least, for external tables if the table property 'external.table.purge'='true' is there, then it will delete the HDFS location when the table is dropped. The symptoms suggest this was the case.
I hope this explains.
Best regards, Miklos
Created ‎10-11-2024 11:05 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
We use CDP 7.1.7. The table is 'external.table.purge'='true'. After the renames, when I dropped the table that was renamed to '_old' it affected the partitions of the one that was moved to the database and renamed to the same name as the other (before being renamed). As if they were both accessing the same directory.
Created ‎10-14-2024 12:51 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, with renaming external tables Hive does not change their locations (to the new desired path, irrespective whether the DB changes or the table name), so once you drop the renamed table, due to the 'external.table.purge'='true' it dropped the original location's data directory.
FYI there were some fixes around the rename functionality, HIVE-24920 fixed in 7.1.7 SP1 and HIVE-26158 fixed in 7.1.7 SP2, not sure if that affects you but it's recommended to apply the latest service pack.
Created ‎10-16-2024 06:13 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Right, thanks for verifying that the partition locations are not renamed. Then it is clearly the HIVE-26158, which has been fixed in 7.1.7 SP2, please plan/test and apply the 7.1.7 SP2 or the even newer 7.1.7 SP3 version (recommended) to avoid this in the future. (one should not need to alter the partition locations when table is renamed)
Best regards
Miklos
Created on ‎10-16-2024 11:06 AM - edited ‎10-16-2024 11:07 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎10-16-2024 11:10 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, it seems to be this issue, but we already have 7.1.7 SP2 installed. Maybe SP3 fixed this issue... Thanks!
Created on ‎10-11-2024 11:02 AM - edited ‎10-11-2024 11:41 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎10-15-2024 06:49 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
When I renamed the table, Hive renamed its HDFS directory, but it seemed that in the metastore database the partitions directories were not renamed. Maybe I needed to run some command to change the partitions locations in the metastore after renaming the table...
