Created 01-13-2020 09:05 PM
I am using HDP. The inode in the following output is a managed Hive table.
# id zeppelin
uid=1017(zeppelin) gid=1003(hadoop) groups=1003(hadoop),1005(zeppelin)
# sudo -u zeppelin hadoop fs -ls /warehouse/tablespace/managed/hive/test1
ls: Permission denied: user=zeppelin, access=READ_EXECUTE, inode="/warehouse/tablespace/managed/hive/test1":hive:hadoop:drwxrwx---
The user zeppelin is in the hadoop group, which has full permission on the HDFS folder. So why do I get the permission error?
Created 01-14-2020 07:14 PM
Created 01-14-2020 11:43 PM
As the zeppelin user gets no access through the owning group or the [other] entry, you will need to grant it at the user level. Remember fine-grained security: ONLY give what is necessary!
$ hdfs dfs -getfacl /warehouse/tablespace/managed/hive
# file: /warehouse/tablespace/managed/hive
# owner: hive
# group: hadoop
user::rwx
group::---
other::---
default:user::rwx
default:user:hive:rwx
default:group::---
default:mask::rwx
default:other::---
The command below adds an ACL entry with the [r-x] bits for the zeppelin user; you can change it to rwx if you wish.
hdfs dfs -setfacl -R -m user:zeppelin:r-x /warehouse/tablespace/managed/hive
Thereafter the zeppelin user can list the directory:
[zeppelin~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive
Found 3 items
drwxrwx---+ - hive hadoop 0 2018-12-12 23:42 /warehouse/tablespace/managed/hive/information_schema.db
drwxrwx---+ - hive hadoop 0 2018-12-12 23:41 /warehouse/tablespace/managed/hive/sys.db
drwxrwx---+ - hive hadoop 0 2020-01-15 00:20 /warehouse/tablespace/managed/hive/zepp.db
The earlier error, shown below, is gone:
ls: Permission denied: user=zeppelin, access=READ_EXECUTE, inode="/warehouse/tablespace/managed/hive":hive:hadoop:drwx------
Happy hadooping
Created 01-13-2020 10:17 PM
Created 01-14-2020 08:05 AM
# hdfs groups zeppelin
zeppelin : hadoop zeppelin
On the name node,
# id zeppelin
uid=1018(zeppelin) gid=1003(hadoop) groups=1003(hadoop),1005(zeppelin)
Created 01-14-2020 09:49 AM
I might have found the reason.
I ran the following command as hdfs, which is the superuser of hdfs.
$ hadoop fs -getfacl /warehouse/tablespace/managed/hive/test1
# file: /warehouse/tablespace/managed/hive/test1
# owner: hive
# group: hadoop
user::rwx
user:hive:rwx
group::---
mask::rwx
other::---
default:user::rwx
default:user:hive:rwx
default:group::---
default:mask::rwx
default:other::---
The output, as I understand it, shows that the owning group has no permission on the folder. My guess is that HDP Hive uses ACLs to limit direct access to the files behind managed tables, so that managed tables can only be accessed through Hive.
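To make the reasoning above concrete, here is a minimal sketch of how an ACL like this one is evaluated for a given user. It assumes the usual order (owner, then named user, then owning group and named groups filtered by the mask, then other) and it ignores the superuser and the default: entries, which only apply to newly created children. The function name acl_check is hypothetical and is not part of Hadoop; it only reads getfacl-style text from stdin. It is also worth noting that when an ACL is present, the group bits printed in the drwxrwx--- string reflect the mask, not the group:: entry, which is why the listing looked permissive.

```shell
#!/bin/sh
# acl_check USER "GROUP1 GROUP2" ACCESS
# Reads getfacl-style text on stdin and prints "allow" or "deny".
# ACCESS is a three-character string such as r-x.
# Sketch only: no superuser handling, default: entries are skipped.
acl_check() {
  awk -v user="$1" -v groups="$2" -v want="$3" '
    # 1 when every non-dash bit of need is also set in perm.
    function implies(perm, need,   i, c) {
      for (i = 1; i <= 3; i++) {
        c = substr(need, i, 1)
        if (c != "-" && substr(perm, i, 1) != c) return 0
      }
      return 1
    }
    # Position-wise AND of two rwx strings (applies the mask).
    function masked(perm, m,   i, c, out) {
      out = ""
      for (i = 1; i <= 3; i++) {
        c = substr(perm, i, 1)
        out = out ((c == substr(m, i, 1)) ? c : "-")
      }
      return out
    }
    BEGIN {
      n = split(groups, g, " ")
      for (i = 1; i <= n; i++) member[g[i]] = 1
      mask = "rwx"   # assumed when the ACL has no mask entry
    }
    /^# owner: / { owner = $3; next }
    /^# group: / { ogroup = $3; next }
    /^#/ || /^default:/ || NF == 0 { next }
    {
      split($1, f, ":")   # f[1]=type, f[2]=name (may be empty), f[3]=perm
      if (f[1] == "mask") mask = f[3]
      else if (f[1] == "other") other = f[3]
      else if (f[1] == "user" && f[2] == "") uowner = f[3]
      else if (f[1] == "user") nuser[f[2]] = f[3]
      else if (f[1] == "group" && f[2] == "") ogperm = f[3]
      else if (f[1] == "group") gperm[f[2]] = f[3]
    }
    END {
      if (user == owner) { print (implies(uowner, want) ? "allow" : "deny"); exit }
      if (user in nuser) {
        print (implies(masked(nuser[user], mask), want) ? "allow" : "deny"); exit
      }
      matched = 0
      if (ogroup in member) {
        matched = 1
        if (implies(masked(ogperm, mask), want)) { print "allow"; exit }
      }
      for (name in gperm) if (name in member) {
        matched = 1
        if (implies(masked(gperm[name], mask), want)) { print "allow"; exit }
      }
      # A matching group that does not grant the access ends the search.
      if (matched) { print "deny"; exit }
      print (implies(other, want) ? "allow" : "deny")
    }
  '
}

# The ACL from the output above, checked for zeppelin (groups hadoop, zeppelin).
acl_check zeppelin "hadoop zeppelin" r-x <<'EOF'
# file: /warehouse/tablespace/managed/hive/test1
# owner: hive
# group: hadoop
user::rwx
user:hive:rwx
group::---
mask::rwx
other::---
EOF
# prints: deny
```

On a live cluster the same function could be fed from hadoop fs -getfacl instead of the here-document. The zeppelin user matches the owning group hadoop, whose entry is ---, so the check stops there and never reaches other.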
Created 01-14-2020 02:06 PM
Created 01-14-2020 05:09 PM
I tried the following command
# sudo -u hdfs hadoop fs -setfacl -m g::rx /warehouse/tablespace/managed/hive/test1
But I got the error
-setfacl: Invalid type of acl in <aclSpec> :g::rx
The ACL spec is meant to change the owning group's permission to r-x.
Any suggestion?
Created 01-15-2020 10:02 AM
@Shelton @EricL Thank you both.
The correct ACL spec is group::r-x
Now the following command works.
sudo -u zeppelin hadoop fs -ls /warehouse/tablespace/managed/hive/test1
From what I just ran into, I think that Hive, by design, makes an extra effort to prevent users from accessing managed table files directly. I will follow that design and access Hive managed tables only through Hive.
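For anyone hitting the same "Invalid type of acl" error, a small local pre-check can catch a malformed spec before it reaches setfacl. This is only a sketch: valid_acl_spec is a hypothetical helper, not a Hadoop command, and its pattern is an assumption based on what this thread showed (g::rx rejected, group::r-x accepted). It covers single entries for -setfacl -m only, and it does not check that mask and other entries have an empty name.

```shell
#!/bin/sh
# valid_acl_spec SPEC
# Exit status 0 when SPEC looks like [default:]type:name:perm, with the type
# spelled out in full and a three-character permission such as r-x.
valid_acl_spec() {
  printf '%s\n' "$1" |
    grep -Eq '^(default:)?(user|group|mask|other):[^:,]*:[r-][w-][x-]$'
}

spec='group::r-x'
if valid_acl_spec "$spec"; then
  echo "spec ok: $spec"
  # On the cluster, the spec would then be applied as in the thread:
  # sudo -u hdfs hadoop fs -setfacl -m "$spec" /warehouse/tablespace/managed/hive/test1
else
  echo "spec rejected: $spec" >&2
fi
```

With spec set to g::rx the helper returns a non-zero status, which matches the error reported earlier.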