
Ranger implementation - hive security access recommendation

Contributor

Current scenario (no Ranger):

Hive security relies on HDFS file permissions (storage-based authorization). In the Hive warehouse, each database/table folder is owned by the user who created that database/table.

With Ranger, how do we manage Hive security?

Hive impersonation turned on:

In this case, I have to grant the requesting user access to the HDFS folders underlying the Hive table. That means creating both an HDFS policy and a Hive policy to give a user access to a given Hive table.

It is difficult to create HDFS policies for every Hive database and table (there are thousands of tables). Is there a recommended approach?

1 ACCEPTED SOLUTION

Rising Star
4 REPLIES

Rising Star

As an example, here is what we did: we made some assumptions about who will access the data and how, and split users into two groups: Analysts and Power Users.

Analysts (90% of users) access the data ONLY via Hive; they never go through HDFS or use any other tools. Analysts also need column-level security so that they only access data matching their clearance, i.e. public, PII, SPII...

Power Users (10% of users) can access the datasets with any tool, through Hive or HDFS, and have no restrictions on the columns they can see. The service application also counts as a Power User, since it handles ingesting and prepping the data.

To facilitate this we did a few things:

  • Hive owns all the data on HDFS
    • We no longer have to manage policies per user at an HDFS Level (due to the below)
  • Hive doAs=false (no impersonation)
    • This enables Ranger to decide if users can access tables or not
    • Jobs now run as Hive; with Hive owning the data, there is no issue
  • Power Users get Ranger HDFS policies (think overloaded)
    • This lets them access anything from HDFS
    • We also give Power Users Select * policies in Hive so they can query any column
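
As a rough sketch of what the two kinds of Hive policy above might look like, the snippet below builds policy payloads as plain Python dicts. The field names follow the layout of Ranger's public v2 policy REST API as I understand it, so check them against your Ranger version before relying on them. The service name `cluster_hive`, the `sales.orders` table, its columns, and the `analysts` / `power_users` groups are all hypothetical, and nothing here is sent to a server. The no-impersonation setup in the list corresponds to `hive.server2.enable.doAs=false` in hive-site.xml.

```python
import json


def hive_policy(service, name, databases, tables, columns, groups, accesses):
    """Build a Ranger-style Hive policy payload as a plain dict.

    The structure is an assumption based on the Ranger public v2 policy
    format; verify it against your own Ranger installation.
    """
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        "resources": {
            "database": {"values": databases},
            "table": {"values": tables},
            "column": {"values": columns},
        },
        "policyItems": [
            {
                "groups": groups,
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }


# Analysts: select on an explicit list of non-sensitive columns only.
# All names below are hypothetical placeholders.
analyst_policy = hive_policy(
    service="cluster_hive",
    name="analysts_public_columns",
    databases=["sales"],
    tables=["orders"],
    columns=["order_id", "order_date", "amount"],
    groups=["analysts"],
    accesses=["select"],
)

# Power Users: select on every database, table and column.
power_user_policy = hive_policy(
    service="cluster_hive",
    name="power_users_all_columns",
    databases=["*"],
    tables=["*"],
    columns=["*"],
    groups=["power_users"],
    accesses=["select"],
)

print(json.dumps(analyst_policy, indent=2))
```

Granting to groups rather than individual users keeps the number of policies tied to the number of clearance levels, not the number of people.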

Hope this helps

Super Collaborator

Joe, please don't mention customer names, as we discussed. Please reword your response to remove it.

Rising Star

Contributor

Thanks @jniemiec@hortonworks.com and @bganesan@hortonworks.com for your quick responses. I went through the blog; it was really helpful.

For new Ranger implementations:

> Is there a way to bring the existing access/permissions for all Hive databases/tables into Ranger without manually creating them in Ranger?

> If there is no easy way to pull the current access/permissions into Ranger, does it require an overhaul of the security architecture for Hive databases/tables?