I have a clean CDH5.3 virtual machine on which I am trying to get Sentry working. I've followed the authentication and authorization instructions and gotten to the point where I want to enable Sentry's automatic syncing of HDFS ACLs. Following the documentation, all the prerequisites appear to be taken care of, and it seems all I need to do is:
Under the Service-Wide category go to Security.
Check the Enable Sentry Synchronization checkbox.
When I do this, the HDFS NameNode won't start, and the log indicates that there is an authorization provider class that needs to be configured:
Failed to start namenode.
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.sentry.hdfs.SentryAuthorizationProvider not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2079)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1089)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:621)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:607)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:754)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:738)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1427)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1493)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.sentry.hdfs.SentryAuthorizationProvider not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2047)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2071)
    ... 7 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.sentry.hdfs.SentryAuthorizationProvider not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1953)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2045)
    ... 8 more
I'm unsure if this is really the problem and, if it is, how to resolve it. I did notice that in the instructions for setting up syncing from the CLI there is this section added to the hdfs-site.xml:
<property>
  <name>dfs.namenode.authorization.provider.class</name>
  <value>org.apache.sentry.hdfs.SentryAuthorizationProvider</value>
</property>
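To confirm what the NameNode is actually being told to load, it can help to read the property straight out of the generated hdfs-site.xml. The following is only a minimal sketch: the helper name get_property and the inline SAMPLE document are made up for illustration, and on a real system you would read the text of whichever hdfs-site.xml the NameNode process uses (its location is not given in this thread).

```python
import xml.etree.ElementTree as ET

# Property name taken from the CLI instructions quoted above.
PROVIDER_KEY = "dfs.namenode.authorization.provider.class"


def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical sample; replace with the contents of the real hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.authorization.provider.class</name>
    <value>org.apache.sentry.hdfs.SentryAuthorizationProvider</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(get_property(SAMPLE, PROVIDER_KEY))
```

If the property is present but the class still cannot be loaded, the configuration itself is probably fine and the missing piece is the jar that provides the class.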
Any assistance would really make the world of difference to my day.
I'm not sure I'm heading down the right path here, but after a bunch of Googling it seems that the HDFS ACL sync functionality is implemented in sentry-hdfs-namenode-plugin-1.4.0-cdh5.3.1.jar.
When I search my CDH5.3 virtual machine the jar isn't found. Could there be a problem with the virtual machine configuration?
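Since the exact jar name may differ between releases, searching by class rather than by file name is a more reliable check. Below is a rough sketch that walks a directory tree and reports every jar containing the class named in the stack trace. The function names are invented for this example, and the default search root /opt/cloudera/parcels is an assumption about where the VM keeps its jars; adjust it for a package-based install.

```python
import os
import zipfile


def class_to_entry(class_name):
    """Convert a dotted Java class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def find_jars_with_class(root_dir, class_name):
    """Return a sorted list of jars under root_dir that contain class_name."""
    entry = class_to_entry(class_name)
    hits = []
    for dirpath, _dirs, files in os.walk(root_dir):
        for fname in files:
            if not fname.endswith(".jar"):
                continue
            path = os.path.join(dirpath, fname)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        hits.append(path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives instead of aborting the scan.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Assumed search root; not confirmed anywhere in this thread.
    root = "/opt/cloudera/parcels"
    target = "org.apache.sentry.hdfs.SentryAuthorizationProvider"
    for jar_path in find_jars_with_class(root, target):
        print(jar_path)
```

An empty result means no jar under that root provides the class, which would be consistent with the ClassNotFoundException. A non-empty result points at a classpath problem instead.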
CDH5.3 ships all necessary jars out of the box.
Have you checked the list of Prerequisites for integrating Sentry with HDFS?
So these are the pre-requisites:
1. CDH 5.3.0 (or later) managed by Cloudera Manager 5.3.0 (or later)
2. (Strongly Recommended) Implement Kerberos authentication on your cluster.
3. You must use the Sentry service, not policy file-based authorization.
4. Enabling HDFS Extended Access Control Lists (ACLs) is required.
5. There must be exactly one Sentry service dependent on HDFS.
6. The Sentry service must have exactly one Sentry Server role.
7. The Sentry service must have exactly one dependent Hive service.
8. The Hive service must have exactly one Hive Metastore role (that is, High Availability should not be enabled).
Do I have them?
3. yes: as per http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/sg_sentry_service_confi... (Cloudera Manager instructions)
4. yes: as per http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_sg_hdfs_ext_acls.ht... (although there is a Cloudera Manager configuration parameter for doing this)
5. yes: I think this is the case for the VM. I looked at the roles and instances and there seems to be only one Sentry service
6. yes: again, I think the VM is set up this way. I looked around and it seemed to be the case
7. yes: as above
8. yes: as above
As far as following the instructions goes, it seemed to me that by the time I got to this part of the security configuration the system didn't need much changing. Unfortunately, it doesn't seem to work for me.
Can you tell me what jar I should be looking for on my system so that I can be certain the installation is ok?
I'm doing an upgrade at the moment. The VM is 5.3.0 and I'm hoping that 5.3.1 will give me a better experience.
OK, no luck with the upgrade path. I can't get the upgrade wizard past the second screen... I think it's time I give up on this software...
This is clearly a classpath issue.
A couple of questions:
1) Is this being installed via CM, or are you doing a fresh CDH5.3 installation?
2) Can you verify that 'sentry-hdfs-1.4.0-cdh5.3.0.jar' is found on the system?
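Because this looks like a classpath issue, it is worth checking not only that the jar exists on disk but also that it is reachable from the classpath Hadoop uses. Here is a small sketch, assuming you pass in the output of the hadoop classpath command. The function names are hypothetical, and the handling of trailing * entries follows the usual Java convention that dir/* means every jar in that directory.

```python
import fnmatch
import glob
import os
import sys


def expand_classpath(classpath):
    """Expand a Java-style classpath string into individual paths.

    An entry ending in '*' is treated as "every .jar in that directory".
    """
    paths = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if entry.endswith("*"):
            paths.extend(sorted(glob.glob(entry + ".jar")))
        else:
            paths.append(entry)
    return paths


def jars_matching(classpath, pattern):
    """Return classpath entries whose file name matches a shell-style pattern."""
    return [p for p in expand_classpath(classpath)
            if fnmatch.fnmatch(os.path.basename(p), pattern)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation: python check_cp.py "$(hadoop classpath)"
    for jar_path in jars_matching(sys.argv[1], "sentry-hdfs-*.jar"):
        print(jar_path)
```

The pattern sentry-hdfs-*.jar is deliberately loose, so it matches both the sentry-hdfs jar asked about here and the sentry-hdfs-namenode-plugin jar mentioned earlier in the thread. Note that this only inspects the classpath string you give it; the NameNode process started by Cloudera Manager may run with a different one.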