Kerberos, AD/LDAP and Ranger

avatar
Master Mentor

Bringing up a couple of FAQs:

1) Do we have to use Kerberos? We are OK with AD/LDAP authentication.

2) Will Ranger work without Kerberos? Do we need Kerberos to secure Ranger?

1 ACCEPTED SOLUTION

avatar
Rising Star

Since you brought up this blog, there are three things you need to know about: 1. authentication, 2. user/group mapping, and 3. authorization.

1. For authentication, there is no alternative to Kerberos. Once your cluster is Kerberized, you can make certain access paths easier by using AD/LDAP. For example, access to HS2 via AD/LDAP authentication, or access to various services through Knox.

2. Group mapping can be done in three ways. The first, as the blog says, is to look up AD/LDAP to get the groups for the users. The second is to materialize the AD/LDAP users on the Linux servers using SSSD, Centrify, etc. The third is to manually create the users and groups in the Linux environment. All these options apply regardless of whether you have Kerberos or not.

3. Authorization can be done via Ranger or using the natively supported ACLs. Kerberos is not mandatory for this, except for Storm and Kafka. But without reliable authentication, authorization and auditing are meaningless.
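
Whichever of the three group mapping options in point 2 you pick, it helps to check that the groups the OS resolves for a user match what the NameNode resolves. A minimal sketch, assuming a POSIX shell and, for the second function, an installed `hdfs` client; the function names `os_groups` and `hadoop_groups` are made up for this illustration:

```shell
#!/bin/sh
# Groups as the local OS sees them (through NSS, which is where SSSD,
# Centrify, or manually created accounts show up).
os_groups() {
  id -Gn "$1"
}

# Groups as the NameNode sees them for the same user. `hdfs groups` asks the
# NameNode, so it reflects whatever group mapping the cluster is configured
# with. Returns non-zero when no hdfs client is installed.
hadoop_groups() {
  command -v hdfs >/dev/null 2>&1 || { echo "hdfs client not found" >&2; return 1; }
  hdfs groups "$1"
}

# Default to the current login name when no user is given.
target_user="${1:-$(id -un)}"
echo "OS view of $target_user:"
os_groups "$target_user"
echo "Hadoop view of $target_user:"
hadoop_groups "$target_user" || true
```

If the two views differ, authorization rules written against groups (in Ranger or in the native ACLs) may not apply the way you expect.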

A common use case such as yours: User A logs into the system with his AD credentials, and HDFS or Hive ACLs kick in for authorization.

You have to qualify "system". Which system are you logging into? Only HS2 and Knox allow you to log in via AD/LDAP. If you are planning to do that, then you have to set up a very tight firewall around your Hadoop cluster. Absolutely no one should be able to connect to the NameNode, DataNode, or any other service port from outside the cluster, except for the JDBC port of HS2 or the Knox port. If you can set up this firewall, then all business users will be secure even if you don't Kerberize your cluster. However, any user who has shell login or port access to an edge node or the cluster, or who is able to submit a custom job to the cluster, will be able to impersonate anyone.

Setting up this firewall is not trivial. Even if you do, there will be users who need access to the cluster. There should be a limited number of such users, and they should be trusted. And you should not let any unapproved job run within the cluster.
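
To give a rough idea of what such a perimeter looks like, here is a sketch that only prints candidate iptables rules and applies nothing. Everything specific in it is an assumption: `10.0.0.0/24` is a hypothetical cluster subnet, and 10000 and 8443 are the default HS2 and Knox ports, which may differ in your cluster. The function name `print_perimeter_rules` is made up for the example.

```shell
#!/bin/sh
# ASSUMPTIONS: hypothetical cluster subnet; default HS2 (10000) and
# Knox (8443) ports. Override both through the environment as needed.
CLUSTER_SUBNET="${CLUSTER_SUBNET:-10.0.0.0/24}"
ALLOWED_PORTS="${ALLOWED_PORTS:-10000 8443}"

# Print (do not apply) rules for a cluster node: traffic from inside the
# cluster is accepted, outside clients reach only the listed ports, and
# everything else is dropped.
print_perimeter_rules() {
  echo "iptables -A INPUT -i lo -j ACCEPT"
  echo "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
  echo "iptables -A INPUT -s $CLUSTER_SUBNET -j ACCEPT"
  for port in $ALLOWED_PORTS; do
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
  done
  echo "iptables -A INPUT -j DROP"
}

print_perimeter_rules
```

Note that rules like these also cut off SSH from outside, so the few trusted admins would need their own explicit rule. And they do nothing about users who are already inside the perimeter, which is exactly the gap described above.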

If the customer is okay with all the "ifs" and comfortable with a limited number of super admin users, then yes, you can have security without Kerberos.

16 REPLIES

avatar
  1. No, you don't have to use Kerberos. Technically you can go with AD/LDAP authentication. You can also do LDAP authentication over SSL. However, why wouldn't you use Kerberos?
  2. Ranger will work without Kerberos.

However, your Storm plugin will not work. You need Kerberos for the Storm plugin.

avatar

+ @bganesan@hortonworks.com The recommendation now is to always enable Kerberos/Ranger. If someone is unwilling to do Kerberos, show them what happens when you set HADOOP_USER_NAME and I'm sure they will come running.

avatar
Master Mentor

@Ali Bajwa @rgarcia@hortonworks.com

This is a great discussion. Could you share an example of HADOOP_USER_NAME?

avatar

Wow, I did not know this. Thanks.

avatar

You need Kerberos if you're serious about security. AD/LDAP will cover only a fraction of the components; many other systems (examples: Storm, Kafka, Solr, Spark) require Kerberos for identity. One can still keep users in LDAP, but the first line in the infrastructure will be Kerberos.

avatar
Master Mentor

@Andrew Grande Could you elaborate on "fraction of components"? User A logs into the system with his AD credentials, and HDFS or Hive ACLs kick in for authorization.

I agree with you that Kerberos adds more security because of all the benefits/features it comes with.

avatar

There is no security without Kerberos. Before anyone goes down that road, just show them this first to make sure they are OK with it:

# su yarn
$ whoami
yarn
$ hadoop fs -ls /tmp/hive
ls: Permission denied: user=yarn, access=READ_EXECUTE, inode="/tmp/hive":ambari-qa:hdfs:drwx-wx-wx
$ export HADOOP_USER_NAME=hdfs
$ hadoop fs -ls /tmp/hive
Found 3 items
drwx------   - ambari-qa hdfs          0 2015-11-04 13:31 /tmp/hive/ambari-qa
drwx------   - anonymous hdfs          0 2015-11-04 13:31 /tmp/hive/anonymous
drwx------   - hive      hdfs          0 2015-11-02 11:15 /tmp/hive/hive
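
The reason this works: under simple (non-Kerberos) authentication, the Hadoop client takes its user name from the HADOOP_USER_NAME environment variable when it is set, and otherwise falls back to the OS login name. The sketch below only mimics that lookup order in plain shell; it does not call Hadoop, and the function name `effective_hadoop_user` is made up for the illustration.

```shell
#!/bin/sh
# Illustration only: mimic the order in which a Hadoop client picks its
# user name under simple authentication. HADOOP_USER_NAME wins if set;
# otherwise the OS login name is used. Nothing verifies the claimed name.
effective_hadoop_user() {
  if [ -n "${HADOOP_USER_NAME:-}" ]; then
    printf '%s\n' "$HADOOP_USER_NAME"
  else
    id -un
  fi
}

echo "Hadoop would treat this session as: $(effective_hadoop_user)"
```

To see which mode a cluster is in, `hdfs getconf -confKey hadoop.security.authentication` should report either `simple` or `kerberos`. With `kerberos`, the identity comes from the user's ticket rather than from an environment variable the user controls.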

avatar
Master Mentor
@Ali Bajwa

Thanks for sharing this.

Is this also valid for AD logins?

User A logs into the system with his AD credentials, and HDFS or Hive ACLs kick in for authorization. Is it possible for user A to export HADOOP_USER_NAME=hdfs and take over those permissions?

avatar

Yes. I had tried this on a cluster where NSLCD was set up so that the cluster recognizes LDAP users; it would be the same for AD/SSSD.

sh-4.1$ whoami
ali
sh-4.1$ hadoop fs -ls /tmp/hive/zeppelin
ls: Permission denied: user=ali, access=READ_EXECUTE, inode="/tmp/hive/zeppelin":zeppelin:hdfs:drwx------
sh-4.1$ export HADOOP_USER_NAME=hdfs
sh-4.1$ hadoop fs -ls /tmp/hive/zeppelin
Found 4 items
drwx------   - zeppelin hdfs          0 2015-09-26 17:51 /tmp/hive/zeppelin/037f5062-56ba-4efc-b438-6f349cab51e4