PIG is not restricting authorization of HCatalog database and tables


New Contributor

Hi,

We are verifying our product's use cases on a Ranger-enabled HDP environment.

Our product is launched as User A (an LDAP user). User A doesn't have access to any databases or tables.

We have another LDAP user, User B, who has access to marketingDb.saletable.

When we log in to our product, use marketingDb.saletable, and submit a job, the job succeeds and the JobTracker shows User A as the user.

Question: If the job is launched as User A, and User A doesn't have access to any HCatalog table, how did the job complete successfully?

To debug this further, we launched a Pig job with User A's keytab in the session, and that Pig job also completed successfully.

Could you please answer this question? Is this happening because of a wrong configuration?

Please guide us.

1 ACCEPTED SOLUTION


Re: PIG is not restricting authorization of HCatalog database and tables

Rising Star
@Piyush Jhawar

The Ranger Hive plugin protects Hive data when it is accessed via HiveServer2. When you access these tables using HCatalog in Pig, you are not going through HiveServer2. Instead, Pig reads the files directly from HDFS (in this case HCatalog is only used to map the table metadata to the HDFS files), so the Ranger Hive policies are never evaluated.

In order to protect this data, you should also define a Ranger HDFS policy to protect the underlying HDFS directory that is used to store the marketingDb.saletable data.

To clarify:

  • Ranger Hive Plugin - Used to protect Hive data when accessed via HiveServer2 (e.g., a user connecting to Hive via JDBC)
  • Ranger HDFS Plugin - Used to protect HDFS files and directories (suitable if users need to access the data outside of HiveServer2, e.g., from Pig or Spark)
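
As a rough illustration of the HDFS policy mentioned above, here is a minimal Python sketch that builds a policy body in the JSON layout used by Ranger's public v2 policy REST API. It only constructs and prints the JSON; it does not contact Ranger. The service name `cluster_hadoop`, the policy name, the user name `userB`, and the warehouse path are all assumptions for the example; check where marketingDb.saletable is actually stored on your cluster (for instance with `DESCRIBE FORMATTED` in Hive) before using anything like it.

```python
import json


def build_hdfs_policy(service_name, policy_name, table_path, allowed_users):
    """Build a Ranger HDFS policy body granting read/execute on one directory."""
    return {
        "service": service_name,   # name of the HDFS service defined in Ranger
        "name": policy_name,
        "isEnabled": True,
        "resources": {
            # isRecursive covers the files and partitions under the table directory
            "path": {"values": [table_path], "isRecursive": True}
        },
        "policyItems": [
            {
                "users": list(allowed_users),
                "accesses": [
                    {"type": "read", "isAllowed": True},
                    {"type": "execute", "isAllowed": True},
                ],
            }
        ],
    }


# All values below are hypothetical placeholders, not taken from a real cluster.
policy = build_hdfs_policy(
    service_name="cluster_hadoop",
    policy_name="marketingDb_saletable_read",
    table_path="/apps/hive/warehouse/marketingdb.db/saletable",
    allowed_users=["userB"],
)
print(json.dumps(policy, indent=2))
```

A body like this could be created in the Ranger admin UI or posted to the policy REST endpoint. Keep in mind that when no Ranger policy matches a request, HDFS typically falls back to its native file permissions, so the permissions on the table directory itself should also be restrictive enough to keep User A out.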
Re: PIG is not restricting authorization of HCatalog database and tables

New Contributor

@Laurence Da Luz

Thank you very much for the detailed answer. My doubt has been cleared now.

Re: PIG is not restricting authorization of HCatalog database and tables

@Piyush Jhawar

If @Laurence Da Luz answered your question, please accept the answer to help others in the community.

Re: PIG is not restricting authorization of HCatalog database and tables

New Contributor

Yes, I should do it. :)

Done now. Thanks.
