07-27-2018 06:05 AM
Hi,

We have Kerberos configured in our Hadoop cluster. We did a wizard installation (https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_sg_intro_kerb.html) and it works well.

To get a higher level of availability, we configured a secondary KDC server (we followed the Kerberos documentation) and set up replication of the credentials from the first Kerberos server to the second (as in this topic: https://community.hortonworks.com/articles/92333/configure-two-kerberos-kdcs-as-a-masterslave.html). We then updated the Kerberos configuration in Cloudera Manager to add the secondary KDC server. The configuration generated by Cloudera in /etc/krb5.conf contains:

[realms]
XXXXXX.COM = {
kdc = master1.com
admin_server = master1.com
kdc = worker1.com
}
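As far as we understand, clients walk through the kdc entries in the order listed, and the JVM Kerberos client that Hadoop uses decides how quickly to give up on a dead KDC based on kdc_timeout and max_retries in [libdefaults] (MIT kinit may ignore these). This is a minimal sketch of what we are experimenting with on top of the generated file; the values are our own guesses, not Cloudera recommendations:

[libdefaults]
default_realm = XXXXXX.COM
#wait at most 3 seconds (value in milliseconds) for a reply from a KDC
kdc_timeout = 3000
#try each KDC at most twice before moving on to the next entry
max_retries = 2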
We have the following configuration:
- master1: Kerberos server + HDFS NameNode (active)
- worker1: Kerberos server + HDFS NameNode (standby)
- worker2: Kerberos client + HDFS DataNode

We are testing the replication of Kerberos.

Case 1: stop the Kerberos server (kdc + kadmin) on master1 and init a user ticket on worker2 with kinit. This works well.

Case 2: stop the Kerberos server (kdc + kadmin) and the HDFS NameNode on master1, to simulate a crash of the server master1. Normally, the NameNode on worker1 should become active, but instead there is an error on worker1: "This role's process exited. This role is supposed to be started." The log shows:

PriviledgedActionException as:hdfs/worker1.com@XXXXXX.COM (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Connection refused (Connection refused))

Conclusion/question: my conclusion is that the NameNode on worker1 does not use the secondary KDC (nothing appears in kadmin.log on worker1). A manual kinit works, so it is not a Kerberos replication problem. In other words, if the server hosting the main KDC crashes, the Hadoop services crash too. This is a big problem. Do you have a solution, or any suggestion?

Thank you,
Martin.
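P.S. For what it's worth, this is how we confirmed that kinit itself fails over in Case 1 (assuming MIT krb5 1.9 or later, which honours the KRB5_TRACE environment variable; the principal name is just an example):

#with the KDC on master1 stopped, the trace should show the attempt
#against master1.com failing and a retry against worker1.com
worker2$ KRB5_TRACE=/dev/stderr kinit someuser@XXXXXX.COM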
07-23-2018 12:48 AM
Hi,

We have tried to configure Sentry in our Hadoop cluster. On the HUE interface with the "hdfs" user, queries on Hive and Impala work well.

We have the following configuration:
- master1: namenode + Kerberos server + Sentry + Impala Catalog Server + Hue Server + Hive Metastore/Server2
- master2: namenode (standby) + Kerberos secondary server + Impala StateStore
- worker1: datanode + Impala Daemon
- worker2: datanode + Impala Daemon

After this, we wanted to try with a specific user, so we did the following steps.

On the master server:

#We create the user on the OS (Ubuntu 14.04)
master1$ adduser dev1
master1$ addgroup sentry_dev
master1$ usermod -a -G sentry_dev dev1
#We create the user on Kerberos
master1$ sudo kadmin.local
kadmin.local: addprinc dev1
kadmin.local: exit
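To make sure the new principal works before touching HUE, a quick sanity check (the password is the one chosen at addprinc time):

master1$ kinit dev1
master1$ klist
#klist should now show a ticket for dev1@XXXXXX.COM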
On HUE:
- Create user dev1
- Create group sentry_dev
- Put user dev1 in group sentry_dev

On the HUE query editor:

CREATE ROLE dev_rol;
GRANT ALL ON DATABASE default TO ROLE dev_rol;
GRANT ROLE dev_rol TO GROUP sentry_dev;
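To verify the wiring from the query editor, Sentry's SHOW statements can be used (a quick check run as a user with admin rights, not a required step):

SHOW ROLE GRANT GROUP sentry_dev; -- should list dev_rol
SHOW GRANT ROLE dev_rol;          -- should list ALL on the default database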
After this step, dev1 has access to the default database in HUE, but only with Hive. With Impala, there is the message "User 'dev1' does not have privileges to access: default.*".

After some research, we found that the user needs to exist on each node, so we repeated the OS step on each remaining node (master2, worker1, worker2):

$ adduser dev1
$ addgroup sentry_dev
$ usermod -a -G sentry_dev dev1

Now we have access to the Impala tables, but it means we have to create every new user on every node manually. The more users we have, the more complicated this becomes to manage (for example, 20 users on 10 nodes). Do you have a better solution? Is there something wrong in our configuration?

Thank you,
Martin.
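P.S. In the meantime we script the per-node account creation; a rough sketch (it assumes passwordless SSH with sudo rights from master1 to the other nodes, which may not match your setup):

#create the OS group and user on every other node from master1
for host in master2 worker1 worker2; do
  ssh "$host" "sudo addgroup sentry_dev; \
    sudo adduser --disabled-password --gecos '' dev1; \
    sudo usermod -a -G sentry_dev dev1"
done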