04-23-2018 03:02 PM
Kudu 1.5.0 has been installed on our cluster, which is currently running CDH 5.13.1. We have read that Kudu authorization is coarse-grained and gives any user with access to Kudu full access to the data.
After testing, we've found that this is through the Kudu command-line interface where these coarse-grained ACLs get applied. We observed a non-admin user had access to drop tables in a database that this account had not been granted access to. This was via the following command:
kudu table delete <master_address> <table_name>
This is a security risk and concern for us when implementing Kudu in live environments where data security is critical.
Is there a way to limit access to the Kudu command-line interface to admin groups, and is this a suggested method?
04-23-2018 03:17 PM
After testing, we've found that this is through the Kudu command-line interface where these coarse-grained ACLs get applied.
The ACLs are enforced by the server, not the kudu CLI tool itself.
We observed a non-admin user had access to drop tables in a database that this account had not been granted access to.
Right, as discussed in the docs you linked, user principals have 'the ability to create, drop, and alter tables, as well as read, insert, update, and delete data'. If you would like to restrict a user from being able to drop a table, you will need to remove them from the user ACL whitelist. Note that this will also disallow them from all other interactions with Kudu data. A common configuration is to only include the 'impala' user in the user ACL whitelist, in which case only users authenticated through Impala will be able to use Kudu. Impala will apply its own Sentry-based authorizations which can distinguish privileges at a finer level.
04-24-2018 11:26 AM
So to limit access exclusively to Impala, we would add the impala user to the Kudu user access control list, without adding the individual IDs that would interact with Kudu via Impala? This would prevent those same user IDs from deleting or viewing data from the Kudu CLI, since only 'impala' is allowed access? Would this also prevent users from accessing Kudu through its APIs, including through Spark?
04-24-2018 11:32 AM
The ACL is a whitelist, so you can add whichever users you like. The default value is '*', which allows any authenticated user to be authorized, but if you want to limit it to joe, bob, and impala you can set it to 'joe,bob,impala'. In this case joe and bob can take whatever actions they like directly on Kudu (including through Spark) and all other users going through Impala will be subject to Impala's more fine-grained authorizations.
The Kudu CLI is subject to the exact same authorization checks as any other application interacting with Kudu, including direct client access, Spark and Impala. To Kudu, it's 'just another' client.
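To make the whitelist semantics concrete, here is a minimal Python sketch of the check as described above. It is only an illustrative model, not Kudu's implementation: the function name `is_authorized` and the string parsing are assumptions, and the real check runs inside the Kudu servers (configured, as far as I know, through the `--user_acl` flag).

```python
def is_authorized(user: str, user_acl: str) -> bool:
    """Return True if `user` passes a comma-separated whitelist.

    Hypothetical model of the coarse-grained check: '*' admits any
    authenticated user; otherwise the name must appear in the list.
    """
    entries = {entry.strip() for entry in user_acl.split(",") if entry.strip()}
    return "*" in entries or user in entries


# The outcome depends only on the authenticated user, not on which
# client (CLI, Spark, Impala) made the request.
acl = "joe,bob,impala"
for user in ("joe", "impala", "alice"):
    print(user, is_authorized(user, acl))
```

With the ACL set to 'joe,bob,impala', a user such as alice is rejected no matter which client she uses, while joe can do anything Kudu allows.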
04-24-2018 12:36 PM
Okay, this makes sense, I think, but what I meant is that users who access Kudu through Impala will be subject to Impala's more fine-grained permissions from Sentry. Users who access it another way, such as the Kudu CLI, will not be, since the CLI talks to Kudu directly, and Kudu is not integrated with Sentry, so it does not pick up the Hive or HDFS ACLs applied there.
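The difference between the two access paths can be sketched as a toy Python model. Everything here is hypothetical and for illustration only: the `SENTRY_PRIVILEGES` table, the user alice, the table name, and the function names are assumptions, and neither Kudu nor Impala exposes such functions.

```python
# Only the Impala service user is on the Kudu whitelist in this scenario.
KUDU_USER_ACL = {"impala"}

# Hypothetical stand-in for Sentry privileges: (user, table) -> allowed actions.
SENTRY_PRIVILEGES = {
    ("alice", "sales"): {"SELECT"},
}


def kudu_allows(principal: str) -> bool:
    """Coarse-grained check: a whitelisted principal may do anything."""
    return principal in KUDU_USER_ACL


def direct_access(user: str, action: str, table: str) -> bool:
    """Kudu CLI, Spark, or client API: only the Kudu whitelist applies."""
    return kudu_allows(user)


def impala_access(user: str, action: str, table: str) -> bool:
    """Impala checks fine-grained privileges, then talks to Kudu as 'impala'."""
    permitted = action in SENTRY_PRIVILEGES.get((user, table), set())
    return permitted and kudu_allows("impala")
```

In this model alice can SELECT from sales through Impala but cannot drop it, and she cannot reach Kudu directly at all. If the whitelist were left at its default of '*', `direct_access` would succeed for any authenticated user and the Sentry-style check would never be consulted.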
04-25-2018 01:05 PM
Yep, that's correct. That's the limitation implied by 'coarse-grained authorization'. Applying Sentry's fine-grained authorization policies in the Kudu server is a long-term roadmap item.