How to use Apache Sentry with Altus Data Engineering clusters?

I can see that an SDx namespace can be chosen while creating an Altus DE cluster, but I am not sure how to configure SDx and security policies through Sentry.

Please help. 

Cloudera Employee

Greetings SachB!


To help me better address your question, could you share a bit more about what you would like to accomplish with SDX and Sentry? Feel free to send details in a personal message if you prefer.


The engineering team is working on Sentry-related enhancements to SDX and we hope that they will answer your needs.



by SachB on ‎08-20-2018 05:10 AM
Hi Zolotov,

This is what I want to do -
Spawn Altus DE clusters for processing data on S3. I want to define tables (metadata) on data stored on S3 and also access policies on these tables (through Sentry).

I see that SDx takes care of metadata, governance, and security in the latest CDH versions. Is SDx the solution for providing all of this for Altus clusters?
How can I define metadata, security, and access policies for the common object storage and use them in my Altus DE clusters?

Thank you.
by Cloudera Employee skpabba on ‎08-20-2018 07:03 AM
Hi SachB,
Data Engineering (DE) clusters are single-user clusters for running Hive/Spark/MR jobs. These jobs typically add new tables or partitions and are not multi-tenant. For this reason, Altus doesn't configure Sentry on DE clusters.

Data Warehouse (DW) clusters are typically used for serving data and are multi-tenant. Altus configures Sentry on these clusters, and you can define fine-grained roles and permissions on tables.

In your use case, how are you planning to serve the data? Can you use an Altus DW cluster and use Sentry in that cluster to define roles and permissions?
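
As a rough illustration of what defining roles and permissions could look like on a cluster where Sentry is enabled, here is a minimal Python sketch that only builds the usual Sentry SQL statements (CREATE ROLE, GRANT ROLE ... TO GROUP, GRANT ... ON TABLE ... TO ROLE) as strings. The helper name, the role, the group, and the table are all hypothetical, and nothing here is specific to Altus; the statements would still need to be run through Hive or Impala by a user with Sentry admin rights.

```python
def sentry_grant_statements(role, group, table, privilege="SELECT"):
    """Build SQL strings that create a role, attach it to a group,
    and grant one table-level privilege to that role.

    Hypothetical helper for illustration; it does not connect to a cluster.
    """
    # Table-level privileges assumed here; check your Sentry version's docs.
    allowed = {"SELECT", "INSERT", "ALL"}
    privilege = privilege.upper()
    if privilege not in allowed:
        raise ValueError(f"unsupported privilege: {privilege}")
    return [
        f"CREATE ROLE {role}",
        f"GRANT ROLE {role} TO GROUP {group}",
        f"GRANT {privilege} ON TABLE {table} TO ROLE {role}",
    ]


# Example with made-up names: read-only access for an analysts group.
for stmt in sentry_grant_statements("analyst_role", "analysts", "sales.orders"):
    print(stmt + ";")
```

Sentry grants privileges to roles and roles to groups rather than to individual users, which is why the sketch takes a group name.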
by SachB on ‎08-20-2018 07:43 AM
Hi skpabba,

Can I use DE clusters to provide a Hive/Spark environment to users in my enterprise to run their respective Hive/Spark jobs?
I wanted to use the access policies to grant the users the required access to the tables from the DE cluster.

I plan to use a DW cluster as well, but some of my users also want a native Hive/Spark environment.

by Cloudera Employee skpabba on ‎08-20-2018 11:24 AM
Yes, you can run existing Hive/Spark jobs in Altus.
Altus is for cloud environments (AWS and Azure), so the assumption is that the data is already in the appropriate object store. You can run these Hive/Spark jobs to process the data from object stores.

Can you send me a private message so we can talk over the phone in more detail about your environment and how Altus can help?