Permissions to the data stored in Hadoop

Rising Star

Hi,

I have a fundamental question about how permissions work in Hadoop.

We are setting up a cluster with master nodes, data nodes, and edge nodes. The edge nodes are the ones exposed to the outside world, and all Hadoop clients are installed on them. External applications stage their data on the edge nodes first and then load it into Hadoop. We are securing the cluster and plan to define data ownership and permissions for app-usr through Ranger policies, for both HDFS and Hive data.

So if an application user, app-usr, is given login access only to the edge nodes (through Active Directory groups), can that user still own data in Hadoop? For example, can an HDFS directory or Hive table be owned by app-usr even though the user is available only on the edge nodes and not on the master nodes or data nodes? Would this still let me configure Ranger policies for that user, or must the user be able to log in to all the nodes in the cluster?

Looking for ideas on the best strategy for this. Thanks.

1 ACCEPTED SOLUTION

Master Guru
@bigdata.neophyte

I believe you need to integrate your Hadoop cluster with AD, including Ranger usersync, in order to define policies for app-usr.
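
As a rough illustration of what "defining a policy for app-usr" could look like, here is a minimal sketch that only builds a policy payload as a Python dict and prints it as JSON. The field names follow my understanding of Ranger's public REST policy model and should be checked against the Ranger version actually deployed. The service name, policy name, and HDFS path are hypothetical placeholders, and nothing is sent to a Ranger server.

```python
import json


def build_hdfs_policy(service, policy_name, path, user, access_types):
    """Build a Ranger-style HDFS policy payload as a plain dict (assumed layout)."""
    return {
        "service": service,
        "name": policy_name,
        "isEnabled": True,
        # HDFS policies are keyed on a "path" resource.
        "resources": {
            "path": {"values": [path], "isRecursive": True},
        },
        # One policy item granting the listed access types to a single user.
        "policyItems": [
            {
                "users": [user],
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
                "delegateAdmin": False,
            }
        ],
    }


policy = build_hdfs_policy(
    service="cluster_hadoop",       # hypothetical Ranger service name
    policy_name="app-usr-staging",  # hypothetical policy name
    path="/data/app",               # hypothetical HDFS directory
    user="app-usr",
    access_types=["read", "write", "execute"],
)
print(json.dumps(policy, indent=2))
```

The policy refers to the user purely by name, which is why the user has to be known to Ranger (through usersync) before a policy like this can be assigned to it.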

4 REPLIES

Rising Star

@Kuldeep Kulkarni

Thanks for your response. Yes, the cluster is integrated with AD and ranger-usersync is enabled. My question is whether app-usr needs to be able to log in to the master nodes and data nodes, or whether it is enough for the user to be visible from those nodes. For security reasons, we want to prevent application users from logging in to the master nodes and data nodes.

Master Guru

@bigdata.neophyte - I think login access to the edge nodes is enough. The other nodes will get information about this user from AD, so logically it should work.
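
One way to see the difference between "visible from a node" and "can log in to a node" is to check whether the host can resolve the user at all. The sketch below is an assumption-laden illustration, not a Hadoop or Ranger tool: it uses only Python's standard `pwd`, `grp`, and `os` modules, which go through the operating system's name service, so it should also see AD users if the node is wired to AD (for example through SSSD). The helper name `lookup_user` is hypothetical.

```python
import grp
import os
import pwd


def lookup_user(username):
    """Return (uid, group names) if this host can resolve the user, else None."""
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        return None  # the host has no information about this user
    names = []
    for gid in os.getgrouplist(username, entry.pw_gid):
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            names.append(str(gid))  # gid exists but has no name entry
    return entry.pw_uid, names


if __name__ == "__main__":
    # Run this on a master or data node; login rights are not involved.
    print(lookup_user("app-usr"))
```

If this returns a uid and group list on a master node while SSH access for app-usr is still denied there, the node can see the user without allowing a login, which is the situation described above.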
