Hive user cannot read from HDFS on "load data inpath" if hive.server2.enable.doAs=false and HDFS access granted by Ranger

New Contributor

This is likely a bug.

On our HDP 2.5 clusters, we have:

hive.server2.enable.doAs=false

The HDFS folder /user/test1/csvfolder has read, write, and execute permissions granted to the user hive by Ranger.

While loading data into an internal Hive table using "load data inpath '/user/test1/csvfolder' into table tablename", we get an error.

INFO : Loading data to table tablename from hdfs://sandbox.hortonworks.com:8020/user/test1/csvfolder
ERROR : Failed with exception org.apache.hadoop.security.AccessControlException: Permission denied. user=hive is not the owner of inode=csvfile.csv
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:250)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:227)
    at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermission(RangerHdfsAuthorizer.java:307)

This error occurs in both Beeline and Hue's Hive client.

Here's why I think the error occurs. When doing a 'load data inpath', it appears that Hive copies the data files from HDFS into /apps/hive/warehouse/database/table/ with the original owner and group of the files intact. Once they are copied, Hive then reads the copied files, but does so in a way that bypasses Ranger. So, if the original files are not owned by hive, the error occurs.

-rw-r--r-- 3 test1 users 21226781 2017-02-07 16:27 /apps/hive/warehouse/sample_database.db/tablename/csvfile.csv
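
To spot files that would hit this owner check, here is a minimal Python sketch that parses listing lines like the one above and reports entries not owned by hive. The helper names (parse_ls_line, files_not_owned_by) are hypothetical, and the parser assumes the eight-column `hdfs dfs -ls` layout shown in this post; it does not talk to HDFS or Ranger.

```python
from typing import List, NamedTuple


class LsEntry(NamedTuple):
    permissions: str
    owner: str
    group: str
    path: str


def parse_ls_line(line: str) -> LsEntry:
    """Parse one listing line (assumed 8 columns, as in the post above)."""
    # Columns: permissions, replication, owner, group, size, date, time, path
    parts = line.split(None, 7)
    if len(parts) != 8:
        raise ValueError(f"unexpected listing line: {line!r}")
    return LsEntry(parts[0], parts[2], parts[3], parts[7])


def files_not_owned_by(lines: List[str], user: str = "hive") -> List[LsEntry]:
    """Return entries whose owner differs from `user`."""
    entries = [
        parse_ls_line(line)
        for line in lines
        # Skip blank lines and the "Found N items" summary line.
        if line.strip() and not line.startswith("Found ")
    ]
    return [entry for entry in entries if entry.owner != user]


# The listing line quoted in this post.
SAMPLE = (
    "-rw-r--r-- 3 test1 users 21226781 2017-02-07 16:27 "
    "/apps/hive/warehouse/sample_database.db/tablename/csvfile.csv"
)

if __name__ == "__main__":
    for entry in files_not_owned_by([SAMPLE]):
        print(f"{entry.path} is owned by {entry.owner}:{entry.group}")
```

Run against the line above, this flags csvfile.csv because its owner is test1 rather than hive, which matches the "is not the owner of inode" message in the stack trace.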

1 ACCEPTED SOLUTION

Rising Star

Hi @Leon,

This is a known issue, and Hive engineering is aware and working on it. There is not an Apache Hive JIRA created for this issue yet, but I will update this thread once one has been filed.

Thanks,

@Darwin

Master Guru

As a workaround, you can create an external table on /user/test1/csvfolder and then insert records from this table into your internal table using "INSERT INTO TABLE tbl SELECT * FROM tbl_ext;". However, creating an external table requires write permission on your csvfolder, although the files there will be left intact.
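
The two statements for this workaround could look roughly like what the following Python sketch renders. The helper workaround_statements and the column list are hypothetical, since the thread does not give the table schema, and the comma-delimited text format is an assumption based on the file being a CSV.

```python
from typing import List, Tuple


def workaround_statements(
    ext_table: str,
    target_table: str,
    location: str,
    columns: List[Tuple[str, str]],
) -> List[str]:
    """Render the CREATE EXTERNAL TABLE and INSERT statements as strings."""
    # Column names and types are placeholders supplied by the caller.
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    create = (
        f"CREATE EXTERNAL TABLE {ext_table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "  # assumed CSV layout
        "STORED AS TEXTFILE "
        f"LOCATION '{location}'"
    )
    # Copy the rows into the internal (managed) table; the source files stay put.
    insert = f"INSERT INTO TABLE {target_table} SELECT * FROM {ext_table}"
    return [create, insert]


if __name__ == "__main__":
    statements = workaround_statements(
        "tbl_ext",
        "tbl",
        "/user/test1/csvfolder",
        [("id", "INT"), ("name", "STRING")],  # hypothetical schema
    )
    for statement in statements:
        print(statement + ";")
```

The printed statements can then be run from Beeline. Because the external table only points at /user/test1/csvfolder, the original files are not moved into the warehouse directory.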

Explorer

Same problem here. Is it fixed in HDP 2.6?

Explorer

Same here with HDP 2.6. The data folder is owned by HDFS:HDFS, and Ranger grants full access to the user hive. Yet when creating an external table, the user hive is denied access.