New Contributor
Posts: 2
Registered: ‎07-10-2017

Best way to secure an external table in a separate encryption zone.

I have a Spark job that processes sensitive data, and its output should be restricted to a small number of users. The job is intended to write its output to a text file in HDFS, in a separate encryption zone. I would like to create an external table on top of this file for downstream consumption through Tableau.

However, when I create the external table, the file is owned by the hive group, so every admin on the cluster can see the table and select from it. So my questions are:

(1) What is the appropriate group membership for this file? There is a specific group that corresponds to authorized viewers of this table.

 

(2) Is there a way to let a tool like Tableau view this table in tabular form while keeping the hive group from having access to the data?

Thanks,

Anthony

Posts: 342
Topics: 11
Kudos: 48
Solutions: 28
Registered: ‎09-02-2016

Re: Best way to secure an external table in a separate encryption zone.

@anthonyjgatti

 

You can refer to the link below, where I've described the advantages and limitations of each security method:

https://community.cloudera.com/t5/Security-Apache-Sentry/Hadoop-Security-for-beginners/m-p/48576#M17...

 

 

Posts: 563
Topics: 3
Kudos: 79
Solutions: 51
Registered: ‎08-16-2016

Re: Best way to secure an external table in a separate encryption zone.

Why are your admins in the hive group? Sentry requires the user and group to be set to hive. I think this assumes that only the hive user is in the hive group.

The authorization should be set in Sentry, but yes, under the covers the hive user needs to have control. The HDFS Synchronization plugin can then push the Sentry privileges down to HDFS ACLs.

In short, if you haven't already, I think you should configure HDFS ACLs, Sentry, and the HDFS Synchronization plugin. Set the access controls in Sentry and use them to enforce access for all external tools.
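
As a rough sketch of what setting the access controls in Sentry might look like, the Python below only assembles the SQL statements a Sentry admin would run (for example through beeline). The role, group, database, and table names are hypothetical placeholders that do not come from this thread, so adjust them to your cluster.

```python
def sentry_grant_statements(role, group, database, table):
    """Build Sentry SQL statements giving one group read-only access to one table.

    All four arguments are placeholders supplied by the caller; nothing here
    talks to the cluster, it only returns the statements as strings.
    """
    return [
        f"CREATE ROLE {role}",                           # role that carries the privilege
        f"GRANT ROLE {role} TO GROUP {group}",           # bind the role to the viewers' group
        f"USE {database}",                               # switch to the table's database
        f"GRANT SELECT ON TABLE {table} TO ROLE {role}", # read-only access to the table
    ]


# Hypothetical names, for illustration only.
for statement in sentry_grant_statements(
    role="sensitive_viewers_role",
    group="sensitive_viewers",
    database="secure_db",
    table="spark_output",
):
    print(statement + ";")
```

If Tableau connects through HiveServer2 or Impala, its queries should be subject to these same grants, so only members of the granted group would be able to select from the table.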

The alternative is to manually synchronize HDFS ACLs.
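
If you go the manual route, a minimal sketch could look like the following. It only builds the `hdfs dfs -setfacl` command line that adds a named-group entry to a path; the path and group name are hypothetical, and it assumes ACLs are enabled on the NameNode (`dfs.namenode.acls.enabled`).

```python
import re
import shlex


def setfacl_command(path, group, perms="r-x", recursive=True):
    """Return the argv for an `hdfs dfs -setfacl -m` call adding a group ACL entry.

    `path` and `group` are placeholders supplied by the caller. The command is
    only constructed here, not executed.
    """
    # Permissions must look like r-x, rwx, ---, and so on.
    if not re.fullmatch(r"[r-][w-][x-]", perms):
        raise ValueError(f"invalid permission string: {perms!r}")
    argv = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        argv.append("-R")  # apply to everything under the path
    argv += ["-m", f"group:{group}:{perms}", path]
    return argv


# Hypothetical path and group, for illustration only.
print(shlex.join(setfacl_command("/secure/zone/spark_output", "sensitive_viewers")))
```

Running `hdfs dfs -getfacl` on the same path afterwards is a simple way to check which users and groups actually ended up with access.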
New Contributor
Posts: 2
Registered: ‎07-10-2017

Re: Best way to secure an external table in a separate encryption zone.

Thank you. It looks to me like the HDFS ACL for this file is too expansive. I need to sync with administrators to understand the right ACL, but it's good to know that if I set the ACL right it won't matter that the hive group has access to the file.
