Member since: 12-03-2016
Posts: 91
Kudos Received: 27
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 12182 | 08-27-2019 10:45 AM
 | 3452 | 12-24-2018 01:08 PM
 | 12134 | 09-16-2018 06:45 PM
 | 2677 | 12-12-2016 01:44 AM
08-31-2018
02:33 AM
This does not answer the question; it addresses a related topic. As far as I understand, @tuxnet is asking whether he can use AD only as an LDAP provider (he states clearly that he doesn't have the option of using the Kerberos service from AD) and use ONLY an independent MIT KDC as the Kerberos authentication provider. In that case there is no one-way or cross-domain trust between the MIT KDC and AD involved. He will have to use an SSSD configuration for LDAP + Kerberos, with AD acting only as the identity/authorization provider and the KDC as the authentication/chpass provider, and he will have to kerberize the cluster using only the MIT KDC server. Also, the user credentials (passwords) for the cluster will be maintained only in the KDC (where the principals reside), so he will have to keep passwords in sync between the two sources by some external means.
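For reference, a minimal sssd.conf domain sketch for this split setup could look like the following (hostnames, search base and realm are placeholders, not taken from the thread; bind credentials and TLS options are omitted):

# /etc/sssd/sssd.conf -- hypothetical example: AD as identity source, MIT KDC for auth
[sssd]
domains = example.com
services = nss, pam

[domain/example.com]
# Identity/authorization comes from AD, treated as a plain LDAP server
id_provider = ldap
ldap_uri = ldaps://ad.example.com
ldap_search_base = dc=example,dc=com
ldap_schema = ad
# Authentication and password changes go to the independent MIT KDC
auth_provider = krb5
chpass_provider = krb5
krb5_server = kdc.example.com
krb5_realm = EXAMPLE.COM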
08-13-2018
01:45 AM
@Geoffrey Shelton Okot Thank you for your response. I agree with you that by default Zeppelin is designed to run as a single user (usually named zeppelin), but the official documentation for version 0.7.0 (the first link provided above) states that it includes support for USER IMPERSONATION (as opposed to the user proxy settings used in the connection-oriented interpreters like %jdbc(hive) or %livy) for some execution-environment interpreters like %sh and %spark/%spark2. This approach is more or less well documented and it works OK with local or centralized users "without Kerberos", by using the ZEPPELIN_IMPERSONATE_USER variable (defined at login time) and the ZEPPELIN_IMPERSONATE_CMD hook in the zeppelin-env.sh file (and also setting ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false so that Spark does not use a proxy user in this case).

The problem is that this approach doesn't fit well with LDAP/Kerberos (I really don't know whether it may work with direct Active Directory login into Zeppelin), because the LDAP login doesn't use Kerberos (or at least I haven't found how to make it do so), and for this reason the impersonated user doesn't have a valid ticket when the zeppelin user switches to him (by using sudo) to launch the shell interpreters. From what I have checked in the Apache Shiro documentation for Zeppelin 0.7.0, there is no support for direct authentication against Kerberos yet (only for LDAP, AD or PAM), so it seems this problem cannot be solved at the moment, except maybe by using AD instead of LDAP + Kerberos.

I will test Spark support by using %livy with a proxy user configuration, and this may be a partial workaround; but unfortunately it is not a fully satisfactory solution for my deployment, because my users need to be able to run shell commands like "hdfs ...", "yarn ...", etc. from the shell interpreter as their authenticated users (not the zeppelin one), and currently this doesn't seem to be possible with this Zeppelin version on a kerberized cluster.
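As a concrete reference, the relevant zeppelin-env.sh and sudoers entries look roughly like this (a sketch assembled from the two linked documents, not my exact files):

# zeppelin-env.sh -- the impersonation hook described in the 0.7.0 docs
export ZEPPELIN_IMPERSONATE_CMD='sudo su - ${ZEPPELIN_IMPERSONATE_USER} bash -c '
# Run %spark as the impersonated OS user instead of a proxy user
export ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false

# /etc/sudoers.d/zeppelin -- the privilege that makes the sudo hook work
zeppelin ALL=(ALL) NOPASSWD: ALL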
08-12-2018
08:31 PM
1 Kudo
I'm trying to make the Zeppelin Notebook run as the logged-in user for the %sh and %spark interpreters when using centralized users provided by combining LDAP + Kerberos with SSSD. I was able to make this work in a NON-kerberized cluster by following the steps suggested in these links: https://zeppelin.apache.org/docs/0.7.0/manual/userimpersonation.html https://community.hortonworks.com/articles/99730/how-to-enable-user-impersonation-for-sh-interprete-1.html

However, this doesn't work in a kerberized cluster where user identity/authentication is handled by SSSD from LDAP + Kerberos. The problem is that the "hack" used in Zeppelin to run as the requested user is to put the "zeppelin" user in the sudoers file and define the variable ZEPPELIN_IMPERSONATE_CMD so that a sudo su - ${ZEPPELIN_IMPERSONATE_USER} bash -c is prepended to the execution of the interpreter. The trouble with this is that the initial login is done against LDAP, so no Kerberos ticket is issued; later, by using "sudo" from a privileged user, you switch to the requested user, but since you are not providing any password you never hit the "authentication" stage of SSSD, and so the "kinit" needed to contact the Kerberos KDC and obtain the user's TGT (ticket-granting ticket) never happens.

For this reason local commands on the Linux host running Zeppelin will work, but if you try to execute any command against the kerberized cluster from the %sh interpreter, for example "hdfs dfs -ls" or "yarn application -list", IT WILL FAIL, telling you that you don't have the required TGT: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] The same happens when using the %spark2 interpreter, for the very same reason.

The same problem occurs if you log into the edge server as root and do a "su - <user>" to a regular user, but in that case I can execute "kinit" manually and provide the credentials to get the Kerberos tickets; after that everything works as expected. The only fix I was able to find is to ask the user to log in to the Zeppelin server with ssh first, in order to provide his password and get the TGT. After this the Zeppelin impersonation will work (once the ticket has been obtained from Kerberos, any session will share it). I guess this might work if the login to Zeppelin (using Apache Shiro) were done against Kerberos instead of LDAP, because that would trigger the required kinit, but I was not able to find any documentation on how to do that.

Does anybody know how to make this work when using Kerberos with LDAP and an MIT KDC (not AD)? Best regards.
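For completeness, the Shiro login that triggers this problem is a plain LDAP realm in conf/shiro.ini, along the lines of the example in the Zeppelin docs (placeholder host and DNs):

[main]
ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
# LDAP server and search base (placeholders):
ldapRealm.contextFactory.url = ldap://ldap.example.com:389
ldapRealm.contextFactory.environment[ldap.searchBase] = dc=example,dc=com
ldapRealm.userDnTemplate = uid={0},ou=people,dc=example,dc=com
ldapRealm.contextFactory.authenticationMechanism = simple
# Note: a simple LDAP bind like this never contacts the KDC, so no TGT is created.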
Labels:
- Apache Zeppelin
08-12-2018
08:21 PM
This will not work in a kerberized cluster (at least with SSSD using LDAP + MIT Kerberos), because user impersonation in Zeppelin is implemented by doing a "sudo su - <user>" to the requested user. In this case the whoami above will work fine, but if you try to execute any command involving the Hadoop cluster, for example "hdfs dfs", it will fail because your user doesn't have the required Kerberos ticket obtained with "kinit". The "kinit" is done automatically when you log in on the console, or via ssh providing the user's password (the Linux host will contact the Kerberos server and get the ticket). But when using "su" from root or "sudo su" from a user in the sudoers file, you are not providing the user's password, and for this reason you have to execute kinit and enter the user's password manually to get the Kerberos ticket. This is not possible with Zeppelin because it doesn't provide an interactive console.
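A quick way to see the difference from a terminal (a hypothetical session; user, realm and cache paths are placeholders):

# After a console/ssh login with a password, SSSD has already run kinit:
klist     # shows a ticket cache with krbtgt/EXAMPLE.COM@EXAMPLE.COM

# After "sudo su - alice" from root, there is no ticket cache:
klist     # klist: No credentials cache found
hdfs dfs -ls /    # fails with the GSSException quoted in this thread
kinit             # entering the password manually obtains the TGT
hdfs dfs -ls /    # now works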
08-06-2018
02:50 AM
1 Kudo
After some tests and reading the official Hive documentation, I'm answering this myself. Both sources are incomplete and confusing, and I guess it's because they mix the configuration required for Hive 0.13.x with the one for Hive 0.14 and later (which is what HDP 2.5.x and above use). After changing the authorization to SQLStdAuth and setting "Run as end user instead of Hive user" (hive.server2.enable.doAs) to false, you have to:

In Custom hive-site: add the user you want to use as Hive administrator, for example admin, to the default list of users with the admin role:

hive.users.in.admin.role=hive,hue,admin

In hive-site.xml, corresponding to the General and Advanced hive-site sections in Ambari, check that you have the following settings:

# General section:
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
# Need to add the second class to the comma-separated list:
hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly
# Advanced hive-site section:
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

In hiveserver2-site.xml, corresponding to Advanced hiveserver2-site in Ambari:

hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory

Note that the classes used as "authorization.manager" in hive-site and in hiveserver2-site have similar names but are different: the first one is SQLStdConfOnlyAuthorizerFactory and the second is SQLStdHiveAuthorizerFactory. Ambari will guide you through some of these settings once you select SQLStdAuth authorization, but this is the complete picture of what is needed. For further reference check: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
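Once this is in place, a quick way to verify it is to connect with beeline as one of the users in hive.users.in.admin.role and exercise the SQL Standard statements (the connection string and names here are placeholders):

beeline -u "jdbc:hive2://hs2.example.com:10000/default" -n admin
-- The admin role is not active by default; enable it per session:
SET ROLE ADMIN;
SHOW ROLES;
-- Granting a privilege to a regular user confirms SQLStdAuth is in effect:
GRANT SELECT ON TABLE sometable TO USER alice;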
08-05-2018
11:19 PM
I also want to add that in the "Data Access" manual they refer to hive-site, but in the HW support article they talk about settings in hiveserver2-site. From what I was able to find out, the first file (hive-site in Ambari) corresponds to /etc/hive/conf/hive-site.xml and the second (hiveserver2-site in Ambari) corresponds to /etc/hive/conf/conf.server/hiveserver2-site.xml, and many of the authorization/authentication parameters are repeated, but with different values. There is also another file named hive-site.xml inside the conf.server folder, but it seems to have almost the same content as the one in /etc/hive/conf, except for a couple of credential store parameters. What a mess this Hive configuration is!
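To see the duplication on a node running HiveServer2 (paths from my HDP 2.6.5 install; output will vary):

# Client-side configuration, used by CLI tools:
ls -l /etc/hive/conf/hive-site.xml
# Server-side configuration, used by HiveServer2 and the Metastore:
ls -l /etc/hive/conf/conf.server/hive-site.xml /etc/hive/conf/conf.server/hiveserver2-site.xml
# Compare the repeated authorization properties between the two copies:
diff <(grep -A1 authorization /etc/hive/conf/hive-site.xml) \
     <(grep -A1 authorization /etc/hive/conf/conf.server/hive-site.xml)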
08-05-2018
09:54 PM
I'm trying to find out how to configure SQL-based authorization in Hive (HDP 2.6.5), but I have found two official sources with contradictory information. On one side, the HDP 2.6.5 Data Access Manual, in its Securing Apache Hive chapter, instructs you to set these class properties (among other configuration changes):

hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sql
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator

On the other side, I have found the official Hortonworks support article "How to Setup SQL Based authorization in Hive With an example" in this community forum, which states that you have to apply a different set of configurations, including these properties with different authorization and authentication classes:

hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

Which of the two documentation sources should I trust as more appropriate or correct for setting up SQL-based authorization in Hive?
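For anyone comparing the two, the values HiveServer2 is actually running with can be printed from a beeline session using the standard Hive SET syntax (the connection string and user are placeholders):

beeline -u "jdbc:hive2://hs2.example.com:10000/default" -n someuser
SET hive.security.authorization.manager;
SET hive.security.authenticator.manager;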
Labels:
- Apache Hive
07-19-2018
12:43 AM
This is a useful article, but it would be better if it explained what the main configuration options actually do, instead of listing the author's interpretation of the best use case for each combination. By knowing what each of these few options does and how it affects the matching of users and groups from LDAP, I'm pretty sure most of us IT professionals would be able to figure out which combination is most appropriate for our own use case. Indeed, this is a recurring problem with the Ranger documentation in HDP, and with many other aspects of the security components: you will usually find "subjective" interpretations of which combination of settings is best for this or that scenario, but an objective description of how each option behaves is much harder to find, and sometimes the only way to find out is to go to the source code.
05-04-2018
05:45 PM
Same problem here when adding an extra node (after the initial install) with HDP 2.6 and WITHOUT enabling two-way SSL in Ambari. And also the same solution: adding the following property to /etc/ambari-agent/conf/ambari-agent.ini after the install and registration failure, and restarting the agent process: force_https_protocol=PROTOCOL_TLSv1_2
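For reference, in my agent's ini the property goes in the [security] section (check yours before editing):

# /etc/ambari-agent/conf/ambari-agent.ini
[security]
force_https_protocol=PROTOCOL_TLSv1_2

# Then restart the agent so it re-registers with the Ambari server:
ambari-agent restart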
01-18-2018
03:02 AM
Just in case this may save time for other people: the configuration included with HDP 2.5.x and Ambari 2.5 is not ready for using Ranger Tagsync with SSL, so there is no "Advanced ranger-tagsync-policymgr-ssl" section or anything like that in the Ranger (0.6.0) configuration from Ambari. The first response above refers to the parameters included in the file ranger-tagsync-policymgr-ssl.xml shipped with Ambari 2.6 (and, I believe, HDP 2.6.x). This is included in the patch discussed in the following URL: https://issues.apache.org/jira/browse/AMBARI-18874

There is an Ambari patch for HDP 2.5, but I was not able to make it work with Ambari 2.5 and Ranger 0.6.0 (included with HDP 2.5.6), so the way I made it work was to include the file /etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml from the patch above, edited by hand, and then, in the "Advanced ranger-tagsync-site" section, modify the parameter ranger.tagsync.dest.ranger.ssl.config.filename, which incredibly (and shamefully) points to a keystore!!! in the default HDP 2.5 configuration, so that it points to this file like this:

ranger.tagsync.dest.ranger.ssl.config.filename=/etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml

After this you will also need to change the credential store file rangertagsync.jceks to include the keys ssltruststore and sslkeystore with the correct values. There are other articles on how to do this. Hopefully in HDP 2.6 things are going to be easier 😞
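For the credential store step, a sketch using Ranger's credential tool (the buildks class is part of Ranger's credentialapi; the classpath, jceks path and passwords below are placeholders from my install, and the alias names follow this post):

# Add the keystore/truststore passwords to the tagsync credential store
java -cp "/usr/hdp/current/ranger-tagsync/lib/*" \
  org.apache.ranger.credentialapi.buildks create sslkeystore \
  -value 'KeystorePassword' \
  -provider jceks://file/etc/ranger/tagsync/conf/rangertagsync.jceks
java -cp "/usr/hdp/current/ranger-tagsync/lib/*" \
  org.apache.ranger.credentialapi.buildks create ssltruststore \
  -value 'TruststorePassword' \
  -provider jceks://file/etc/ranger/tagsync/conf/rangertagsync.jceks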