I am upgrading our cluster to HDP 3.0.1. I have installed Ranger (1.1.0) and enabled the HBase plugin. (Other plugins are working as expected: HDFS, Knox, etc.)
All requests to HBase with _any_ user are being run as the root user. This is true through either Knox or a direct call to HBase using curl. I do not have Kerberos enabled as yet.
Has anyone seen this before?
Without enabling Kerberos authentication for HBase, any authorization checks you make are pointless.
When you don't have Kerberos authentication enabled, there is no guarantee that the end user is who they say they are. This makes authorization pointless.
I would focus on getting strong authentication set up before looking more into authorization.
That is specious on many counts:
- users may have been pre-authenticated. Worse, they could be external partner users who have been pre-authenticated in the 'modern' world with OIDC / OAuth. There is no justification (even pleading insanity) for having such users Krb authenticated. In our case, there are 70K+ such users.
- the impersonation documentation here that cites the famous user 'joe' states:
"The superuser has kerberos credentials but user joe doesn't have any"
I thought that is the entire point of proxying / impersonation.
In our case, the end user is known and pre-authenticated. If they were not, they could not invoke our HBase query services, which are protected behind an API gateway that challenges for a token (JWT).
While broadly agreeing with the principle of what you are saying, I would amend your earlier comment:
Kerberos authentication for HBase, any authorization checks you make are pointless."
Knox, in fact, offers a HeaderPreAuth Provider for pre-authenticated use cases. It is another matter that it is half-baked as far as user groups are concerned.
Without belabouring the point any further, impersonation is for users who are not authenticated and so need to piggyback on an authenticated superuser. The permissions and ACLs still have to be granted to such pre-authenticated users for the resources that they need. It is not the permissions and ACLs of the superuser that are (or should be) used for authorization checks for the real user.
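To make the distinction concrete, here is a rough sketch in Python of the behaviour I would expect. It is only a toy model under my own assumptions, not Ranger's or HBase's actual API: the names `ProxyPolicy`, `Acl` and `authorize`, and the users and table in the usage note, are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyPolicy:
    """Toy model of which superusers may impersonate which end users."""
    # superuser name -> set of end users it may act on behalf of (hypothetical)
    allowed: dict = field(default_factory=dict)

    def may_impersonate(self, superuser, end_user):
        return end_user in self.allowed.get(superuser, set())


@dataclass
class Acl:
    """Toy model of per-user grants on a resource."""
    # (user, resource) -> set of permitted actions (hypothetical)
    grants: dict = field(default_factory=dict)

    def is_allowed(self, user, resource, action):
        return action in self.grants.get((user, resource), set())


def authorize(policy, acl, authenticated_user, do_as, resource, action):
    """Return True if the request should be permitted.

    authenticated_user is the identity that actually authenticated
    (the superuser); do_as is the end user it claims to act for, or None.
    """
    effective_user = authenticated_user
    if do_as is not None and do_as != authenticated_user:
        # Step 1: the authenticated caller must be allowed to proxy this user.
        if not policy.may_impersonate(authenticated_user, do_as):
            return False
        effective_user = do_as
    # Step 2: the ACL check runs against the effective (real) user,
    # never against the superuser's own grants.
    return acl.is_allowed(effective_user, resource, action)
```

With a policy that lets `super` impersonate `joe`, and an ACL where `joe` has only `read` on `table1` while `super` has `read` and `write`, a `write` request made as `joe` is denied even though `super` itself could write. That is the opposite of what I am seeing, where everything runs as root.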
If the perimeter security provides strong authentication, there should not be a need to further authenticate the same user, and that too via Krb. Many a resource's time is wasted by documentation and commentary recommending this route.