I am exploring how Kudu and Impala interact, and I can't find a good way to secure a Kudu table against access through Impala.
Let's say I have a Kudu table "test" created from the CLI. I log in to impala-shell as user "u1", who has access to database "db1", and create a table in that database like this:
CREATE EXTERNAL TABLE db1.test_kudu1 STORED AS kudu TBLPROPERTIES ('kudu.table_name' = 'test');
I log out of "u1" and log in as user "u2", who does not have access to database "db1" but does have access to database "db2". This user creates a table in "db2" the same way "u1" created db1.test_kudu1:
CREATE EXTERNAL TABLE db2.test_kudu2 STORED AS kudu TBLPROPERTIES ('kudu.table_name' = 'test');
Now both users have access to the Kudu table "test": both can modify it, deleting and inserting data. Is there any way to avoid this kind of insecurity with Kudu tables?
It seems that if we grant the Impala service user access to Kudu tables, any Impala user who is allowed to create tables in any database can create a table on top of a Kudu table and then modify it. Is that really true?
https://docs.cloudera.com/runtime/7.2.2/administering-kudu/topics/kudu-security-trusted-users.html discusses the issue. In summary, Hive/Impala tables (i.e. those with entries in the Hive Metastore) are all authorized in the same way, regardless of whether the backing storage is HDFS, S3, Kudu, HBase, etc. The SQL interface performs the authorization, confirming that the end user has access to the table, columns, and so on; the service then accesses the storage as a privileged user (Impala in this case).
In this model, if you create an external Kudu table in Impala and grant a user permission to access that table via Impala, then the user can access the data in the underlying Kudu table.
What closes the loophole is that creating an external Kudu table requires very high privileges: ALL on SERVER. A regular user can't create an external Kudu table that points at an arbitrary Kudu cluster or table.
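As a rough sketch of how this could look in practice, assuming Impala authorization is enabled with role-based grants: an administrator who already holds ALL on SERVER creates the external table once and then hands out narrower table-level privileges. The role name db1_readers and the group name analysts are hypothetical and used only for illustration.

```sql
-- Run by an administrator whose role holds ALL on SERVER.
-- "db1_readers" and "analysts" are hypothetical names.
CREATE ROLE db1_readers;
GRANT ROLE db1_readers TO GROUP analysts;

-- The administrator maps the existing Kudu table into Impala once.
CREATE EXTERNAL TABLE db1.test_kudu1 STORED AS KUDU
TBLPROPERTIES ('kudu.table_name' = 'test');

-- Members of the group can read the data only through this Impala table.
GRANT SELECT ON TABLE db1.test_kudu1 TO ROLE db1_readers;
```

With a setup like this, a user such as "u2" who only has privileges on "db2" would be expected to have the CREATE EXTERNAL TABLE statement from the question rejected by Impala's authorization check, since that user lacks ALL on SERVER.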