Context

As of CDP 7.1.2, Sentry is deprecated for Kudu, and Ranger becomes the solution for fine-grained authorization in Kudu. This article addresses one question: what does enabling Ranger for Kudu mean for the Impala/Kudu stack, especially if Impala is already using Ranger? The answer is not obvious up front because the information is spread over three different sets of documents.

Fundamentally, the answer is nothing. The reason is straightforward: enabling the Kudu module in Ranger should automatically configure the --trusted_user_acl flag in Kudu to include Impala (if installed) and Hive (if installed). This flag exempts the listed users from being checked against Kudu's authorization model, which in CDP installations means Ranger policies. So, the correct place to enforce access to Impala and Hive tables that are stored in Kudu is the Hadoop-SQL policies set for Impala and Hive.
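
The decision described above can be sketched as a small toy model. This is only an illustration of the logic, not Kudu's actual implementation: the class name, the policy dictionary, the user "etl_spark", and the table "db.events" are all hypothetical, and the real Ranger policy evaluation is considerably richer.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class KuduAuthzModel:
    """Toy model of the authorization decision; NOT Kudu's real code."""

    # Stands in for the users listed in --trusted_user_acl.
    trusted_users: Set[str]
    # Hypothetical stand-in for cm_kudu Ranger policies:
    # (user, table) -> set of allowed actions.
    ranger_policies: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)

    def is_allowed(self, user: str, table: str, action: str) -> bool:
        # Trusted users bypass the Kudu-level check entirely; their
        # authorization is expected to be enforced upstream (Hadoop-SQL).
        if user in self.trusted_users:
            return True
        # Everyone else is checked against the Kudu-level policies.
        return action in self.ranger_policies.get((user, table), set())


model = KuduAuthzModel(
    trusted_users={"impala", "hive"},
    ranger_policies={("etl_spark", "db.events"): {"select", "insert"}},
)

print(model.is_allowed("impala", "db.events", "drop"))      # trusted user
print(model.is_allowed("alice", "db.events", "select"))     # no policy
print(model.is_allowed("etl_spark", "db.events", "insert")) # policy match
```

In this sketch the impala user is never evaluated against the Kudu-level policies, which is why the Hadoop-SQL policies remain the place to control access for Impala and Hive.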

By default, enabling Ranger for Kudu should have no impact on your Impala or Hive operations; any further changes you want to make to authorization should be done in the Hadoop-SQL policies.

What is the motivation for using Ranger with Kudu, if that is the case? Without it, these tables are all still accessible to users directly at the Kudu level, and changes made at the Kudu level can cause inconsistencies or data loss. Enabling Ranger at the Kudu level prevents this.

There are two other use cases where policies are set at the Kudu level in cm_kudu: Spark and NiFi. These services access Kudu on a per-user basis instead of having a service user communicate with Kudu, so the normal logic of writing Ranger policies applies. Please see the documentation of the respective service for more information.
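
As a rough illustration of what such a per-user policy might look like, the sketch below builds a policy document as a Python dictionary and prints it as JSON. The overall shape (service, name, resources, policyItems) is modeled on the policy JSON used by Ranger's public REST API; the resource names (database, table, column), the access types, the policy name, and the user "spark_etl_user" are assumptions for illustration and should be checked against the cm_kudu service definition in your own Ranger instance. Nothing is sent to Ranger here.

```python
import json
from typing import Iterable


def build_kudu_policy(
    name: str,
    database: str,
    table: str,
    users: Iterable[str],
    accesses: Iterable[str],
    columns: Iterable[str] = ("*",),
) -> dict:
    """Build a Ranger-style policy dict for the cm_kudu service (sketch)."""
    return {
        "service": "cm_kudu",
        "name": name,
        "isEnabled": True,
        # Assumed resource hierarchy: database -> table -> column.
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": list(columns)},
        },
        # One policy item granting the listed accesses to the listed users.
        "policyItems": [
            {
                "users": list(users),
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }


# Hypothetical example: let one Spark end user read and write one table.
policy = build_kudu_policy(
    name="spark-etl-events",
    database="db",
    table="events",
    users=["spark_etl_user"],
    accesses=["select", "insert"],
)

print(json.dumps(policy, indent=2))
```

Because Spark and NiFi connect as the end user, it is that user's name (not a service account) that would appear in the users list of a policy like this.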
