Created 02-07-2017 05:28 PM
This is likely a bug.
On our HDP 2.5 clusters, we have:
hive.server2.enable.doAs=false
The HDFS folder /user/test1/csvfolder has read, write, and execute permissions granted to the user hive via Ranger.
While loading data into an internal Hive table using "load data inpath '/user/test1/csvfolder' into table tablename", we get an error:
INFO : Loading data to table tablename from hdfs://sandbox.hortonworks.com:8020/user/test1/csvfolder
ERROR : Failed with exception org.apache.hadoop.security.AccessControlException: Permission denied. user=hive is not the owner of inode=csvfile.csv
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:250)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:227)
at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermission(RangerHdfsAuthorizer.java:307)
This error occurs both in Beeline and in Hue's Hive client.
Here's why I think the error occurs. When doing a 'load data inpath', Hive appears to copy the data files within HDFS into /apps/hive/warehouse/database/table/ with the original owner and group of the files intact. Once they're copied, Hive then reads the copied files, but does so in a way that bypasses Ranger. So if the copied files aren't owned by hive, the error occurs.
-rw-r--r-- 3 test1 users 21226781 2017-02-07 16:27 /apps/hive/warehouse/sample_database.db/tablename/csvfile.csv
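If that theory is right, one way to test it is to make the source files owned by hive before the load. This is just a sketch to confirm the behavior; it assumes you can run the chown (e.g. as the hdfs superuser), and reuses the path and table name from the example above. As a user allowed to change HDFS ownership:

hdfs dfs -chown -R hive:hdfs /user/test1/csvfolder

then retry the load from Beeline:

load data inpath '/user/test1/csvfolder' into table tablename;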
Created 02-07-2017 05:48 PM
Hi @Leon,
This is a known issue, and Hive engineering is aware of it and working on a fix. An Apache Hive JIRA has not been filed for it yet, but I will update this thread once one has been.
Thanks,
@Darwin
Created 02-07-2017 09:53 PM
As a workaround, you can create an external table on /user/test1/csvfolder and then insert records from it into your internal table using "INSERT INTO TABLE tbl SELECT * FROM tbl_ext;". Note that creating an external table requires write permission on your csvfolder, although the files there will be left intact.
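A minimal sketch of that workaround, assuming the CSVs are comma-delimited; the column names are hypothetical (match them to your internal table's schema), and tbl_ext is just a placeholder name:

CREATE EXTERNAL TABLE tbl_ext (
  col1 STRING,  -- hypothetical columns; use your internal table's schema
  col2 INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/test1/csvfolder';

INSERT INTO TABLE tablename SELECT * FROM tbl_ext;

-- dropping the external table afterwards removes only its metadata;
-- the files in /user/test1/csvfolder stay in place
DROP TABLE tbl_ext;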
Created 10-21-2017 09:55 AM
Same problem here. Is it fixed in HDP 2.6?
Created 12-06-2018 07:07 PM
Same here with HDP 2.6. The data folder is owned by hdfs:hdfs, and Ranger granted full access to the user hive. And yet, when creating an external table, the user hive is denied access.