Created on 03-22-2017 02:53 PM - edited 09-16-2022 04:18 AM
Hi community,
A security-related question: I have two clusters in my environment. Both are kerberized and connected to the same Active Directory as KDC.
Now consider the technical (service) users. For the hive user, for instance, I created a keytab on each of the two clusters:
hive/somehostinclusterA@REALM.COM
hive/somehostinclusterB@REALM.COM
Now imagine that the first cluster (A) is run by development with moderate rules, while cluster B is run in production with strict rules.
If somebody steals a keytab from cluster A, there is a security threat: not only can they access cluster A, they can also access cluster B (production), which is bad.
Why? Because the auth_to_local rules simply convert both hive/somehostinclusterA@REALM.COM and hive/somehostinclusterB@REALM.COM to hive, which is the superuser for the Hive service in both clusters.
Is this a known security problem, and are there guidelines on how to fix it? I thought about writing complex auth_to_local rules, but that seems hard to operate.
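To make the collision concrete, here is a minimal Python sketch of the mapping described above. It is a simplified model that keeps only the first component of a principal in the local realm, not Hadoop's actual rule engine, and the function name `to_short_name` is hypothetical.

```python
from typing import Optional

LOCAL_REALM = "REALM.COM"  # realm taken from the principals above


def to_short_name(principal: str) -> Optional[str]:
    """Simplified model: keep the service part, drop host and realm."""
    name, _, realm = principal.partition("@")
    if realm != LOCAL_REALM:
        return None  # foreign realm: no local user in this model
    return name.split("/", 1)[0]


for p in ("hive/somehostinclusterA@REALM.COM",
          "hive/somehostinclusterB@REALM.COM"):
    print(p, "->", to_short_name(p))
```

Because the host part is discarded, both principals end up as the same local user, so nothing downstream can tell which cluster the keytab came from.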
Created 03-22-2017 09:42 PM
My thoughts from my experience
First, a compromised hive.keytab is itself a major security risk. That should not be a typical occurrence, and restrictions should be in place to prevent it. Second, we can choose to use separate realms for the clusters, in which case the rules will be specific to each cluster.
Third, we can remove the "DEFAULT" rule from auth_to_local and then manually write rules for the needed principals. More details: https://hortonworks.com/blog/fine-tune-your-apache-hadoop-security-settings/
I am sure we can be more creative with auth_to_local for cluster-specific rules and principals, but the simplest option that is still secure would be to have separate realms.
Given the scenario, that seems appropriate, since both clusters are part of the same Kerberos realm with the default configuration.
The rules are necessary to resolve HDFS directories, HDFS folder/file permissions, Hive table owners, and many other things that use the short name. I don't think Ranger can help with that.
Let me know your thoughts
Cheers
Created 03-22-2017 03:13 PM
I believe that you are confusing authentication with authorization. Kerberos is only an authentication mechanism. It tells who the user is, not what the user can do. In some cases, the absence of a resolved identity helps with authorization, since there is then no user to authorize. This is what you are trying to do by not translating certain principal names to local user names.
In the scenario you pose, there is a security issue, but I am not sure that I would blame Kerberos or Ambari's configuration of the Kerberos infrastructure for it. I believe that by installing an authorization service, like Ranger, you should be able to protect against unauthorized access to Hive and other services and thus rule out any cross-cluster access issues.
If you are looking to proceed with limiting access based on auth-to-local rules, be sure to see Auth-to-local Rules Syntax for information on the syntax of the rules.
Created 03-22-2017 09:32 PM
So after a conversation with @lmccay, it appears that my assumption/statement about Ranger is incorrect and therefore a compromised service principal compromises all (relevant) services on all clusters that use the same Kerberos realm. But once again, Kerberos is not an authorization mechanism... it is merely an authentication mechanism.
I think the only real solution here is to isolate clusters using different Kerberos realms. This can be done by using a local KDC and realm for the cluster-specific principals and creating a one-way trust with an Active Directory (or centralized KDC) for the user accounts. This has a few benefits over the centralized-only solution, including cluster isolation as well as network traffic isolation and distribution of load on the KDC.
Created 03-22-2017 03:20 PM
Thanks for your comment.
For any users other than technical users, I agree with you. Authorization is my friend for keeping users away from the wrong cluster.
In this particular case, however, I am really "blaming" authentication. My problem is that the technical users of the two clusters have the same name and therefore cannot be distinguished from one another by an authorization engine such as Ranger.
Example: The hive principal in cluster B SHOULD have access to all tables (as it is the superuser), but the hive principal from cluster A SHOULD NOT. Kerberos authentication, however, does not distinguish between the two users, as they both have the same name.
Created 03-22-2017 09:02 PM
@Roland Simonis - I believe that @Robert Levas is generally correct, but you are talking about a keytab being compromised. In that case, I believe it is generally game over. Keytab management is extremely important. Keytabs should only be readable by root and not even backed up. If you want to protect clusters from keytabs that are compromised on other clusters, then the clusters should use different realms - IMO.
Created 03-23-2017 06:27 AM
Thank you @Robert Levas and @lmccay for your insights. It appears that protecting the keytabs really is one of the most important Hadoop security tasks.
As separating realms is not feasible in my case, I will stick with keeping keytabs secure and maybe tuning the auth-to-local rules.
Also thanks to @spotluri for summing it up.
Created 03-29-2017 12:59 PM
Just to sum it up: I have now chosen to place some regex in the auth-to-local rules to match exactly those hosts that are used in a given cluster.
While this adds operations overhead, it will make the cluster more secure.
Cloudera has a good summary of this in its documentation: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/sg_auth_to_local_isolate.html
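The host-matching idea can be sketched in Python as follows. This only models the matching logic and is not the auth_to_local rule syntax itself. The host list and the function name are hypothetical, and the sketch assumes there is no fallback rule, so anything that does not match gets no local user.

```python
import re
from typing import Optional

# Hypothetical host list for cluster B; in practice it would name every
# host in that cluster that carries a hive principal.
CLUSTER_B_HOSTS = ["somehostinclusterB"]

_hosts = "|".join(re.escape(h) for h in CLUSTER_B_HOSTS)
HIVE_ON_CLUSTER_B = re.compile(rf"hive/(?:{_hosts})@REALM\.COM")


def short_name_on_cluster_b(principal: str) -> Optional[str]:
    """Map to 'hive' only when the host part belongs to cluster B."""
    if HIVE_ON_CLUSTER_B.fullmatch(principal):
        return "hive"
    return None  # no fallback in this sketch: unknown hosts are rejected


print(short_name_on_cluster_b("hive/somehostinclusterB@REALM.COM"))
print(short_name_on_cluster_b("hive/somehostinclusterA@REALM.COM"))
```

The operations overhead shows up in the host list: it has to be updated whenever a host is added to or removed from the cluster.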
Created 03-24-2017 04:03 AM
Just had a new idea that can probably solve the problem.
We can have different account names per cluster, such as hive-clusterA/hostname@realm.com
hive-clusterB/hostname@realm.com
And then have an auth_to_local rule in clusterA which converts hive-clusterA to hive, and a corresponding rule in clusterB which converts hive-clusterB to hive.
Very similar to how the "dn", "nn", and "nm" principals get resolved.
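A rough Python sketch of that idea, assuming each cluster only accepts principals carrying its own suffix. The helper `make_mapper` is hypothetical, the realm is written as REALM.COM to match the principals earlier in the thread, and this models the logic only, not the actual rule syntax.

```python
import re
from typing import Callable, Optional


def make_mapper(cluster: str, realm: str = "REALM.COM") -> Callable[[str], Optional[str]]:
    """Build a mapper that strips '-<cluster>' from the service name."""
    pattern = re.compile(
        rf"(?P<service>[a-z]+)-{re.escape(cluster)}/[^/@]+@{re.escape(realm)}"
    )

    def to_short_name(principal: str) -> Optional[str]:
        match = pattern.fullmatch(principal)
        # Principals from another cluster do not match and get no local user.
        return match.group("service") if match else None

    return to_short_name


map_on_a = make_mapper("clusterA")
map_on_b = make_mapper("clusterB")

print(map_on_a("hive-clusterA/hostname@REALM.COM"))
print(map_on_a("hive-clusterB/hostname@REALM.COM"))
```

Compared with matching on host names, the cluster identity lives in the principal name, so the mapping does not need to change when hosts come and go.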
Cheers
Created 03-24-2017 07:37 AM
Hi @spotluri
This is also a great idea, if splitting REALMs is not feasible.