I have a question about authorization for the Hive Metastore (not the HiveServer2). Cluster is HDP 2.5 and Kerberos is set up.
The Apache community recommends to use a StorageBasedAuthorizationProvider. I understand, how it gets the ACLs from the underlying filesystem.
In my situation, I have Ranger set up and want to handle most of authorization there - effectively making Hadoop native permissions unused (for instance by setting the to 000 on the Hive directories).
The question now is:
- When using the StorageBasedAuthorizationProvider: Will the Hive Metastore consider Ranger policies on HDFS warehouse directories in his decision, if a certain user can read/write to directory? Or do I have to use POSIX permissions or HDFS ACLs?
- Is the a better way to realize Hive Metastore authorization (Maybe a custom authorization provider for HiveMetastore, that connects to Ranger and uses Ranger Policies for HiveServer2)?
There is nothing specific to Hive Metastore in evaluating access to HDFS resources. If HDFS Ranger plugin is enabled, then Ranger policies in conjunction with HDFS ACLs will apply. If HDFS Ranger plugin is not enabled, only HDFS ACLs will apply.