How to properly delete a Hive table through Hue?
- Labels: Apache Hive, Cloudera Hue
Created ‎08-06-2019 12:36 AM
When I have a table in my sqoop schema and want to delete it, I go to the Hive editor (through Hue) and run the following command: DROP TABLE IF EXISTS schemaName.tblName PURGE;
After that, the table disappears from the Hue GUI (sqoop table list, metastore list), but the actual files of the table are not deleted from HDFS.
On the other hand, when I do the same thing in the Impala editor, everything works fine.
Why does this happen, and how can I drop/remove a table from the Hive metastore and HDFS all at once?
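Roughly what I do (schemaName/tblName are placeholders, and the path below assumes the default warehouse location on my cluster):
# what I run in the Hive editor (Hue):
#   DROP TABLE IF EXISTS schemaName.tblName PURGE;
# afterwards the table directory is still there on HDFS:
hdfs dfs -ls /user/hive/warehouse/schemaName.db/tblName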
Thanks in advance.
Created ‎08-06-2019 03:22 AM
Hi @eMazarakis,
as far as I know, Impala does not support HDFS impersonation (for security reasons, I guess), which means you can't delete HDFS files as the Hue user.
Cheers!
Created ‎08-06-2019 06:03 AM
@Bildervic I'm afraid I wasn't clear about what I am saying. I have a table which is owned by the impala user. When I drop that table through the Impala editor (Hue) with the above command, the table and the HDFS files are deleted.
But when I have a table which is owned by the hive user and I try to delete it through the Hive editor (Hue) with the same command, the table disappears from the GUI but the files are not deleted.
And I am asking why this happens.
Created ‎08-06-2019 07:14 AM
Hi,
it looks like a permission issue which is silently ignored. Can you please post the ACLs from the HDFS path: the root folder where the table is stored, and the ACLs of the table path as well? See the example commands below.
Maybe you can also check HiveServer2's log for any kind of permission issue.
And finally I would check the NameNode logs; if the file is not deleted because of missing permissions, there will be a log message about it.
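Something like this should show everything I'm asking for (replace <db> and <table> with your database and table names; the path assumes the default warehouse location):
# ACLs of the table path and of its parent (database) directory
hdfs dfs -getfacl /user/hive/warehouse/<db>.db/<table>
hdfs dfs -getfacl /user/hive/warehouse/<db>.db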
Tomas
Created on ‎08-07-2019 01:52 AM - edited ‎08-07-2019 01:53 AM
@Tomas79 here is the ACL of the HDFS path of the table directory:
# file: /user/hive/warehouse/sqoop.db/sr_tt
# owner: hive
# group: hive
# flags: --t
user::rwx
group::rwx
other::rwx
Now if I go to the Hive editor (Hue) and execute the query DROP TABLE sr_tt PURGE;
the files are not deleted.
Also, do you know where I can find the NameNode and HiveServer2 logs?
Created ‎08-09-2019 01:49 AM
The NN log is under /var/log/hadoop-hdfs and the HS2 log under /var/log/hive by default. If you use Cloudera Manager, you can find the log directories on CM's configuration page for both of them.
To determine why the directory was not removed (see the example commands after this list):
- check if the table is a managed or an external table, as dropping an external table will not delete the HDFS files
- check the HMS server log, also under /var/log/hive on the HMS host, for any permission errors, as it is HMS's responsibility to create and delete HDFS paths
- please also confirm whether you have Hive impersonation enabled (under CM > Hive > Configuration > "HiveServer2 Enable Impersonation" if you are using CM)
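For example, something along these lines (the JDBC URL is a placeholder and the log file names vary by install, so adjust to your environment):
# is the table managed or external?
beeline -u "jdbc:hive2://<hs2-host>:10000" -e "DESCRIBE FORMATTED sqoop.sr_tt;" | grep -i 'Table Type'
# any permission errors around the DROP in the HMS / HS2 logs?
grep -iE 'sr_tt|AccessControlException|Permission denied' /var/log/hive/*.log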
Cheers
Eric
Created ‎08-09-2019 02:04 AM
@eMazarakis I noticed the "t" flag on the directory. That is the sticky bit; whenever you see it in the HDFS ACLs, it tells you that only the owner can delete the directory, nobody else, even if write access is granted.
So in your case only the hive user can remove this directory. Maybe, as @EricL pointed out, you have impersonation enabled, so the query is running under a different user.
Either way, you need to search for this permission issue in the logs.
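You can also see it in a plain listing; the trailing "t" in the permission string is the sticky bit (the path is taken from your getfacl output, the rest of the listing is just an illustration):
hdfs dfs -ls -d /user/hive/warehouse/sqoop.db/sr_tt
# drwxrwxrwt   - hive hive  ...  /user/hive/warehouse/sqoop.db/sr_tt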
Created on ‎08-09-2019 06:38 AM - edited ‎08-09-2019 06:43 AM
@EricL My findings are the following:
- The table is managed.
- Also, HiveServer2 Enable Impersonation is set to TRUE. I do not understand what this option does. Can you explain it to me?
- In the Hive Metastore Server log I can't see anything about permissions.
Created ‎08-09-2019 07:09 PM
"HiveServer2 Enable Impersonation is setting to TRUE" is probably the reason. When Impersonation is true, it means Hive will impersonate as the end user who runs the query to submit jobs. Your ACL output showed that the directory is owned by "hive:hive" and as @Tomas79 found out, you have sticky bit set, so if hive needs to impersonate as the end user, the end user who runs the query will not be able to delete the path as he/she is not the owner. If impersonation is OFF, then HS2 will run query as "hive" user (the user that runs HS2 process), then you should not see such issue.
I assume you have no Sentry? Sentry requires impersonation to be OFF on the HS2 side, so that all queries run as the "hive" user.
To test the theory, try removing the sticky bit on this path and dropping the table again in Hive; see the example below.
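For example, one way to do that (this keeps rwx for everyone and only clears the sticky bit; the path is the one from your getfacl output):
# 0777 = same rwx for user/group/other, sticky bit cleared (currently it is 1777)
hdfs dfs -chmod 0777 /user/hive/warehouse/sqoop.db/sr_tt
# then, in the Hive editor:  DROP TABLE sqoop.sr_tt PURGE;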
Cheers
Eric
