User authentication in Hadoop with Local and AD/Openldap authentication


Explorer

I need help understanding how user authentication affects a Hadoop cluster and its services in the scenarios listed below.

1. Local Authentication:

a. The app user has access only on the edge node and is not present on the rest of the cluster nodes. Will the user be able to run jobs and HDFS commands? For example, can the app user run hadoop fs -ls /?

If not, why?

 

2. AD authentication:

a. The app user has access only on the edge node, with no ssh access to the rest of the Hadoop cluster nodes. All Hadoop cluster nodes are integrated with Windows AD. Will the app user be able to run jobs and HDFS commands?

If not, why, and what is the solution? I need app users to access only the edge node.

 

3. Kerberos Authentication:

a. The app user has access only on the edge node, with no ssh access to the rest of the Hadoop cluster nodes. All Hadoop cluster nodes are integrated with Windows AD, and the cluster uses Kerberos. Will the app user be able to run jobs and HDFS commands?

If not, why, and what is the solution? I need app users to access only the edge node.

- Vijay Mishra


Re: User authentication in Hadoop with Local and AD/Openldap authentication

Super Guru

@VijayM,

 

I'll do my best to provide some answers to your general questions:

 

(1)

Local

 

I am not sure what you mean by "local authentication" here so I can't answer the question about running commands against HDFS.

 

Users acting on Hadoop must be resolvable as OS users on the cluster nodes in order to run YARN jobs. If the users do not exist on the cluster nodes, they cannot run jobs as themselves.

 

(2)

AD

 

I am not sure what sort of AD authentication you are specifying here.

What is using AD to authenticate?

 

Same as above: users must exist at the OS level in order to run jobs directly.

 

(3)

Kerberos

 

If the Hadoop cluster has Kerberos enabled, then any user who can kinit will be able to run "hdfs dfs -ls ..." commands. Those users are subject to the permissions and ACLs configured on HDFS.
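
To illustrate the Kerberos case, here is a minimal Python sketch of what a user on the edge node effectively does: check for a valid ticket, then list an HDFS path. The helper names (has_valid_ticket, hdfs_ls_command, list_hdfs) are made up for this example, and it assumes the Kerberos klist tool and the hdfs client are on the PATH of the edge node. Treat it as a sketch, not as the way the cluster itself checks anything.

```python
import shutil
import subprocess


def has_valid_ticket():
    """Return True if `klist -s` finds a readable, unexpired ticket cache."""
    if shutil.which("klist") is None:
        return False  # no Kerberos client tools installed on this host
    # `klist -s` prints nothing and exits 0 only when a valid ticket exists.
    return subprocess.run(["klist", "-s"]).returncode == 0


def hdfs_ls_command(path):
    """Build the argv for listing an HDFS path."""
    return ["hdfs", "dfs", "-ls", path]


def list_hdfs(path="/"):
    """List `path` on HDFS as the current Kerberos principal; return an exit code."""
    if not has_valid_ticket():
        print("No valid Kerberos ticket found; run kinit first.")
        return 1
    if shutil.which("hdfs") is None:
        print("hdfs client not found on PATH.")
        return 1
    # HDFS permissions and ACLs decide whether the listing succeeds.
    return subprocess.run(hdfs_ls_command(path)).returncode
```

Calling list_hdfs("/") after a successful kinit should behave like running hdfs dfs -ls / by hand; whether the listing is allowed still depends on the HDFS permissions for that principal.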

 

The users cannot run jobs unless they have an OS user as well.

 

You might need to describe, in more detail, what sort of integration you are talking about in your 3 scenarios.  It is ambiguous what you mean by integration in general.

 

In general, a common way to do what you are looking to do is to use SSSD, IPA, Centrify, etc. to integrate OS user and group resolution with your Active Directory (where I assume your users have accounts). With Active Directory, domain users will also have a user principal name (Kerberos). Users can access the edge node, kinit, and then act on the cluster as themselves. The Hadoop nodes are integrated with AD so that OS user and group lookups are directed to AD; when a user runs a job, the user name can then be resolved on every node.
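
One way to check whether a given node can resolve a user at the OS level (whether the account is local or comes from AD through something like SSSD) is to ask the OS account database directly. Below is a small Python sketch; resolve_os_user is a hypothetical helper name, and the assumption is that you can run it on each cluster node you care about (for example as an administrator, since the app users themselves have no ssh access there).

```python
import grp
import os
import pwd


def resolve_os_user(username):
    """Return (uid, [group names]) if the OS can resolve `username`, else None."""
    try:
        # pwd goes through the system's account lookup, so users served
        # by an integration such as SSSD are found as well as local ones.
        entry = pwd.getpwnam(username)
    except KeyError:
        return None  # the OS on this node does not know the user
    group_names = []
    for gid in os.getgrouplist(username, entry.pw_gid):
        try:
            group_names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            group_names.append(str(gid))  # group id without a resolvable name
    return entry.pw_uid, group_names
```

If resolve_os_user("someuser") returns None on a worker node, that node cannot resolve the user, which matches the situation described above where the user cannot run jobs as themselves.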

 

I hope other community members might comment as well...

 

 

 

-Ben


Re: User authentication in Hadoop with Local and AD/Openldap authentication

Explorer

@bgooley

 

Please find more explanation below.

 

1. Local authentication means users are created locally and exist on the Hadoop nodes. Application users are created only on the gateway nodes; they are not present on Hadoop core nodes such as the NameNode, DataNodes, etc.

 

The questions are:

 

1. If application users exist only on the gateway node, will hdfs dfs -mkdir /test work when they run it?

2. Will application users be able to run any jobs?

3. How does authentication work when data is loaded onto the Hadoop nodes? Data can be written to any DataNode, but application users are present only on the gateway node. Will a write operation started by an application user begin and complete?

 

 

2. Windows AD authentication: user authentication on all Hadoop nodes happens through Windows AD.

 

The questions are the same as above, for the Windows AD authentication case.

 

3. Kerberos authentication: users are authenticated by Windows AD to log in on all Hadoop nodes, and the CDH cluster is integrated with Kerberos.

 

The questions are the same as above, for the Kerberos authentication case.

 

- Vijay Mishra

 

 


Re: User authentication in Hadoop with Local and AD/Openldap authentication

Super Guru

@VijayM,

 

Thanks... I'm still trying to understand everything you are saying.  Perhaps you could provide an example of what you are proposing to clarify.  I don't know what you mean by an "application user".

 

Also, what do you mean by "authentication is happening through Windows AD?"  Are you talking about Kerberos?  Are you talking about authentication at the OS shell or in the hadoop cluster?

 

The basic rules are:

 

- If a user acts on HDFS, the NameNode does not verify that the user exists at the OS level. Therefore user actions are governed by the privileges defined in the Hadoop file system. That means you can kinit as any user on your edge node and act on HDFS as that user. Your HDFS must be configured to allow the user to write if you want them to be able to write a file.

 

- If you run a YARN job, the user running the job must exist as an OS user.

 

The bottom line is that for HDFS and YARN, users authenticate via Kerberos.
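
The basic rules above can be written down as a tiny decision model. This Python sketch only encodes the rules as stated in this reply for a Kerberos-enabled cluster; the UserState fields and function names are invented for illustration and do not correspond to any Hadoop API.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    has_kerberos_ticket: bool       # the user could kinit on the edge node
    hdfs_permission_ok: bool        # HDFS permissions/ACLs allow the action
    os_user_on_cluster_nodes: bool  # the user resolves at OS level on the nodes


def can_use_hdfs(user):
    """HDFS rule: Kerberos identity plus HDFS permissions; no OS user needed."""
    return user.has_kerberos_ticket and user.hdfs_permission_ok


def can_run_yarn_job(user):
    """YARN rule: Kerberos identity plus an OS user on the cluster nodes."""
    return user.has_kerberos_ticket and user.os_user_on_cluster_nodes


# An app user that exists only on the edge node:
edge_only = UserState(
    has_kerberos_ticket=True,
    hdfs_permission_ok=True,
    os_user_on_cluster_nodes=False,
)
print(can_use_hdfs(edge_only), can_run_yarn_job(edge_only))
```

Under these rules, a user who exists only on the edge node can act on HDFS once they have a ticket and the right permissions, but cannot run YARN jobs until the cluster nodes can resolve the user.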

 

If you have a cluster, I would recommend you test some scenarios and then post here if you hit anything that is confusing or seems wrong.

 
