I am setting up Ranger authorization on my Hadoop cluster (not Kerberos secured). I have Knox running already.
One of the prerequisites for Ranger is configuring LDAP/AD group-level authorization. Please help me understand why this is important and needed.
Ranger is the component that enforces access policies on Hadoop resources in the cluster. Ranger must be configured to sync users and groups either from the local UNIX accounts or from an AD/LDAP directory, so that it has a list of users and groups for which you can create policies. Without that configuration, Ranger has no way of knowing which users and groups exist.
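To make the idea of syncing concrete, here is a minimal Python sketch of the kind of user-to-group mapping a sync from local UNIX groups would produce. It is only an illustration and not Ranger's actual sync code: the function name `parse_group_file` and the sample group data are made up for this example, and it only reads the member list of `/etc/group`-style lines (primary groups from `/etc/passwd` are ignored).

```python
from collections import defaultdict


def parse_group_file(text):
    """Map each user to the set of groups that list it as a member.

    Expects /etc/group-style lines: name:password:gid:member1,member2
    (hypothetical helper, not part of Ranger).
    """
    groups_by_user = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) != 4:
            continue  # skip malformed lines
        group_name, members = fields[0], fields[3]
        for user in members.split(","):
            if user:
                groups_by_user[user].add(group_name)
    return dict(groups_by_user)


# Made-up sample data in /etc/group format.
SAMPLE = """\
analysts:x:1001:alice,bob
etl:x:1002:bob
empty:x:1003:
"""

user_groups = parse_group_file(SAMPLE)
print(user_groups)
```

A policy written for the group `analysts` only means something once a mapping like this exists, which is why the sync has to be set up before you create group-level policies.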
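When a user attempts to access a resource such as a Hive table, Ranger takes the identity of the user running the command (say, hive from the command line), together with that user's groups, and checks whether the policies allow or deny the request. On a cluster without Kerberos that identity is not verified, so a user can simply claim to be someone else. A properly secured cluster would use Kerberos to authenticate users, so that the identity Ranger evaluates can be trusted.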
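The check described above can be sketched as a toy model in Python. This is not Ranger's policy engine: the `Policy` class, the `is_allowed` function, the resource name `hive:sales_db`, and the users and groups are all hypothetical, and the deny-by-default behaviour is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """A simplified allow policy (hypothetical, for illustration only)."""
    resource: str
    access: str
    allow_users: set = field(default_factory=set)
    allow_groups: set = field(default_factory=set)


def is_allowed(user, user_groups, resource, access, policies):
    """Return True if any policy grants `access` on `resource` to the user
    directly or through one of the user's groups."""
    groups = user_groups.get(user, set())
    for policy in policies:
        if policy.resource != resource or policy.access != access:
            continue
        if user in policy.allow_users or groups & policy.allow_groups:
            return True
    return False  # assumption of this sketch: deny when nothing matches


# Made-up synced users/groups and one group-level policy.
USER_GROUPS = {"alice": {"analysts"}, "bob": {"etl"}}
POLICIES = [Policy("hive:sales_db", "select", allow_groups={"analysts"})]

alice_ok = is_allowed("alice", USER_GROUPS, "hive:sales_db", "select", POLICIES)
bob_ok = is_allowed("bob", USER_GROUPS, "hive:sales_db", "select", POLICIES)
print(alice_ok, bob_ok)
```

Note that the function trusts whatever user name it is given. Without Kerberos, nothing stops bob from presenting himself as alice, and the policy check would then pass for him as well.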