Job user name issue
Labels: Apache Ambari, Apache Hive, Apache Ranger
Created ‎06-10-2016 10:16 AM
We have a 6-node Kerberized cluster (HDP 2.4.2). We use the Hive View to query Hive tables. When we executed queries, the jobs were submitted under the logged-in user's name. However, after I installed Ranger, the jobs are being submitted as the user hive.
Are we missing any setup?
Thanks in advance.
Created ‎06-10-2016 10:44 AM
Hi Antony,
By default, impersonation is enabled (hive.server2.enable.doAs is set to true), so the job runs as the user who submitted it.
However, when Ranger is enabled, impersonation is turned off, so queries are submitted as the system user that the HiveServer2 process runs under (which is hive).
You can use either setting depending on how your users access the data.
If the data is accessed only through HiveServer2 and all tables are managed by Hive, you can leave impersonation turned off.
However, if the Hive data is also read or written by other tools (such as Pig or MR jobs), you can turn impersonation back on and use Ranger to configure the correct HDFS permissions as well.
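For reference, here is a minimal sketch of what turning impersonation back on looks like in hive-site.xml. The property name is the one mentioned above. The assumption here is that you edit the file directly; on an Ambari-managed cluster you would normally change the value through the Hive configuration in Ambari instead, and HiveServer2 has to be restarted for the change to take effect.

```xml
<!-- hive-site.xml (fragment, illustrative) -->
<!-- true  = queries run as the end user who connected to HiveServer2 -->
<!-- false = queries run as the user that owns the HiveServer2 process (hive) -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
```

With the value set to false, jobs show up under the hive user, which is the behavior described in the question.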
More about this and the best practice for each use case: http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
Regards,
Alex
Created ‎06-10-2016 12:18 PM
Thanks Alex
