HiveServer2 impersonation must be turned off. HiveServer2 impersonation lets users execute queries and access HDFS files as the connected user rather than as the super user. Access policies are applied at the file level using the HDFS permissions specified in ACLs (access control lists). Enabling HiveServer2 impersonation bypasses Sentry from the end-to-end authorization process. Specifically, although Sentry enforces access control policies on tables and views within the Hive warehouse, it does not control access to the HDFS files that underlie the tables. This means that users without Sentry permissions to tables in the warehouse may nonetheless be able to bypass Sentry authorization checks and execute jobs and queries against tables in the warehouse as long as they have permissions on the HDFS files supporting the table.
The text above is from the documentation. I am wondering why "Enabling HiveServer2 impersonation bypasses Sentry from the end-to-end authorization process". Can anyone give some advice? Thanks.
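One way to picture what the quoted documentation describes is a small toy model. This is only a sketch: the names `SENTRY_GRANTS`, `HDFS_READERS`, and `can_query` are hypothetical and are not Sentry or Hive APIs. It also simplifies the two modes to what the documentation says decides the outcome: with impersonation the work runs as the connected user, so HDFS file permissions decide access, and without it the files are read as the hive service user, so the Sentry grant decides access.

```python
# Toy model only: all names and data are hypothetical, not Sentry or Hive APIs.

SENTRY_GRANTS = {"alice": {"sales"}}        # user -> tables granted through Sentry roles
HDFS_READERS = {"sales": {"hive", "bob"}}   # table -> users allowed by HDFS permissions/ACLs


def can_query(user, table, impersonation):
    """Return True if `user` can read the data behind `table` in this toy model."""
    readers = HDFS_READERS.get(table, set())
    if impersonation:
        # The job runs as the connected user, so in this simplified view
        # only the HDFS permission check on the underlying files matters.
        return user in readers
    # Impersonation off: Sentry must grant the table, and the files are
    # then read as the "hive" service user rather than as the end user.
    return table in SENTRY_GRANTS.get(user, set()) and "hive" in readers


# bob has no Sentry grant, but he does have HDFS permissions on the files.
print(can_query("bob", "sales", impersonation=True))
print(can_query("bob", "sales", impersonation=False))
```

In this model, bob gets through only when impersonation is on, which is the bypass the documentation warns about. alice, who has a Sentry grant but no HDFS permissions of her own, gets through only when impersonation is off.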
Hi Eric,
Thanks for the reply,
(1) In the resource pool, submission access control is set by "groupname". When a user from the group submits a job through Hue, YARN shows the username "hive" as the submitter; only after the job has completed can I see who actually submitted it. Also, if it is a Spark job or another large job, I am unable to alert the user before killing the job, which makes monitoring very difficult. So how can I clearly see who submitted the job when it shows "hive" everywhere?
(2) Hive databases are stored in /user/hive/warehouse/db*, yet users are creating tables as *external tables* in their own HDFS path, /Project/Alpha/Table/.., and within that path users are divided by *dev*sit*prd, etc. Besides external tables, other files are also stored in the same path. So are you suggesting that I leave the setting as hive:hive everywhere and let the Sentry role define who can access what?
I have the same case, where queries run as the hive user. Is there any way to detect which user ran a query from Hue?
We have created a generic user that is shared among a group of analysts, but we are not able to detect who ran a query. Is there any way to identify who ran the query from the Hue sessions or the client's IP address?