Member since: 12-03-2016
Posts: 88
Kudos Received: 19
Solutions: 3

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 4526 | 08-27-2019 10:45 AM
 | 900 | 12-24-2018 01:08 PM
 | 398 | 12-12-2016 01:44 AM
12-02-2018
04:13 PM
I found problems trying to validate the information in this article, and after doing my own research I have to say that it is inaccurate and in some respects simply wrong. First of all, it must be clear that the "ranger.ldap.*" set of parameters, configured in Ambari under "Ranger >> Configs >> Advanced >> LDAP Settings" and "Advanced ranger-admin-site", relates only to Ranger Admin UI authentication and has nothing to do with Ranger Usersync (different properties, different code, different daemon), which must be configured entirely in the "Ranger >> Configs >> Ranger User Info" section. This article mixes the two sets of LDAP-related configurations for two different Ranger components, which is confusing and not correct. Everything I state here can be verified by looking at the source code of the following classes from the "ranger" and "spring-ldap" projects on GitHub:

org.apache.ranger.security.handler.RangerAuthenticationProvider
org.springframework.security.ldap.authentication.BindAuthenticator
org.springframework.security.ldap.authentication.LdapAuthenticationProvider / AbstractLdapAuthenticationProvider

For Ranger Admin LDAP authentication, the only two parameters you need are the following:

ranger.ldap.url = ldap://ldap-host:389
ranger.ldap.user.dnpattern = uid={0},ou=users,dc=example,dc=com

This is because the RangerAuthenticationProvider class first calls the method getLdapAuthentication(), which in turn uses Spring's BindAuthenticator class with default parameters except for the two properties above. It tries to do a BIND as the DN obtained from "ldapDNPattern", replacing "{0}" with the username, and if this succeeds the user is authenticated and nothing else is used! The only case where the remaining "ranger.ldap.*" parameters are used is when getLdapAuthentication() fails, for example because of a wrong value in "ranger.ldap.user.dnpattern" as in the example above. When getLdapAuthentication() fails, Ranger next tries the more specialized method getLdapBindAuthentication(), and it is this method that uses all the other "ranger.ldap.{bind|user|group}.*" properties. In that case BindAuthenticator is configured to bind with the provided "ranger.ldap.bind.dn/password" (the LDAP manager's DN) and then searches for the user entry and its groups using the remaining properties.

But even in this case there is another IMPORTANT error in the article above: the pattern in "ranger.ldap.group.searchfilter" is wrong, because the filter is handled by the class DefaultLdapAuthoritiesPopulator, which replaces '{0}' with the Distinguished Name (DN) of the user (NOT the username); it is '{1}' that is replaced with the username. So, if you want to use the configuration above, you should replace {0} with {1}, or even better just use "member={0}" as your group search filter.

Regarding the "authorization" phase: in both of the previous methods, if authentication succeeds, the group/role authorities for the user are searched in LDAP (using Spring's DefaultLdapAuthoritiesPopulator class), but ONLY if both rangerLdapGroupSearchBase AND rangerLdapGroupSearchFilter are defined and not empty. Even then (I have not tested this, but the code seems clear) I am almost sure that the list of "grantedAuths" obtained from LDAP is never used by Ranger, because at the end of both getLdap*Authentication() methods the grantedAuths list is overwritten through the following chain of calls:

authentication = getAuthenticationWithGrantedAuthority(authentication)
>> List<GrantedAuthority> grantedAuths = getAuthorities(authentication.getName()); // pass username only
>>>> roleList = (org.apache.ranger.biz.UserMgr) userMgr.getRolesByLoginId(username); // overwritten from the Ranger DB

I don't know if this is the desired behavior or a bug in the current RangerAuthenticationProvider that will be changed in the future (otherwise it is not clear why the LdapAuthoritiesPopulator is used upstream), but it is the way things seem to be done right now.

In conclusion, for Ranger Admin authentication, if you just provide the right values for the "ranger.ldap.url" and "ranger.ldap.user.dnpattern" properties, none of the remaining "ranger.ldap.group.*" parameters will be used, and the user roles will be managed by Ranger from the "Admin UI -> Users" interface.
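For reference, a minimal sketch that pulls together the properties discussed above (all hosts, DNs and passwords here are example values, not taken from the article):

# Minimal case: bind directly as the user; nothing else is consulted
ranger.ldap.url = ldap://ldap-host.example.com:389
ranger.ldap.user.dnpattern = uid={0},ou=users,dc=example,dc=com
# Fallback case (getLdapBindAuthentication): service-account bind plus searches
ranger.ldap.bind.dn = cn=ranger-bind,ou=services,dc=example,dc=com
ranger.ldap.bind.password = ********
ranger.ldap.group.searchbase = ou=groups,dc=example,dc=com
ranger.ldap.group.searchfilter = member={0}   # {0} = user DN, {1} = plain username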
10-29-2018
06:48 PM
This information (like much else) is wrong in the official HDP Security course from Hortonworks. The HDFS Encryption presentations of the course state that, to create an HDFS admin user able to manage encryption zones (EZ), it is enough to set the following (copied/pasted here):

dfs.cluster.administrators=hdfs,encrypter
hadoop.kms.blacklist.DECRYPT_EEK=hdfs,encrypter
10-29-2018
06:32 PM
Also check this: https://www.hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/
10-27-2018
09:58 PM
Nice work Greg, a wonderful, clear and very detailed article! Based on this and other articles of yours in this forum, it seems clear that the people at Hortonworks should seriously consider giving you a greater role in the elaboration and supervision of the (currently outdated, fuzzy, inexact and overall low-quality) material presented in their expensive official courses. The difference in quality between your work and the material published there is so great that it is sometimes embarrassing. All my respect, Sir!
09-16-2018
06:45 PM
2 Kudos
The previous answer explains how to reset the password in Ranger to the fixed user/password combination "admin:admin". If instead you want to reset or check the password value in Ranger for any user/password combination, you have to use the following. Passwords in Ranger are saved in the password column of the x_portal_user table as MD5 hashes of the string password{login}. For instance, the hash for the amb_ranger_admin account with secret123 as password is computed as follows:

$ echo -n 'secret123{amb_ranger_admin}' | md5sum
4d9f6af4210833cb982d27c9042d9ac1  -

Of course you may also combine this with pgsql/mysql to change the password for any user from the command line, as requested by another user.
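For example, a sketch of such a command-line reset (this assumes a PostgreSQL Ranger database named "ranger" with DB user "rangeradmin"; the database name, DB user and new password are all example values):

$ NEWHASH=$(echo -n 'NewSecret123{amb_ranger_admin}' | md5sum | awk '{print $1}')
$ psql -U rangeradmin -d ranger -c "UPDATE x_portal_user SET password='${NEWHASH}' WHERE login_id='amb_ranger_admin';"

The same UPDATE works from the mysql client if your Ranger database runs on MySQL; only the connection options change.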
09-16-2018
06:35 PM
1 Kudo
Regarding the problem of keeping the amb_ranger_admin password in sync between Ranger and Ambari, you can change these values via the Web UIs as follows:

Ambari Server: in the Ambari Server UI, under Ranger -> Configs -> Advanced -> Admin Settings, as stated in the manuals and as you said you had already done.
Ranger: in the Ranger Admin UI, under Settings -> Users/Groups, searching for "User Name: amb_ranger_admin".

However, if you have already tested and succeeded in logging in to the Ranger Admin UI with this user and the password you set up in Ambari, this may not be the problem. To go to the source and check the real passwords in the Ambari and Ranger configurations, you have to look at the corresponding databases.

In Ambari the password is saved as the "ranger_admin_password" entry inside the JSON text stored in the "config_data" field of the "clusterconfig" table. This table keeps all versions of each configuration type, so you have to look for the latest entry of type "type_name='ranger-env'" and select the field "config_data". A SELECT like the following is useful:

SELECT version, create_timestamp, type_name, config_data FROM ambari.clusterconfig WHERE type_name LIKE 'ranger-env' ORDER BY version DESC LIMIT 1;

In Ranger you have to check the x_portal_user table and look at the password field for the user with login_id = 'amb_ranger_admin', with a SELECT like this:

SELECT login_id, password, status FROM x_portal_user;

In this case the password field is a hash combining the password and the user name, so you have to use the following to compare it with the previous value:

$ echo -n 'yourpassword{amb_ranger_admin}' | md5sum

If the two passwords are different and you have followed the steps from the beginning in each UI, then you have to check why Ambari (most likely) is not updating your configuration. On the other hand, if the passwords match, then you probably have a different problem with the Ambari configuration.
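As a convenience, a sketch of reading that value from the command line (assumes a PostgreSQL Ambari database named "ambari" with DB user "ambari", that jq is installed, and that config_data is the flat JSON map returned by the SELECT above):

$ psql -U ambari -d ambari -At -c "SELECT config_data FROM clusterconfig WHERE type_name='ranger-env' ORDER BY version DESC LIMIT 1" | jq -r '.ranger_admin_password'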
09-16-2018
02:52 PM
I keep asking myself why so many people feel the need to respond with something even when they don't know how to answer the original question raised by the user with the problem. When someone posts a question here, it is assumed they have checked the (sometimes very) basic instructions in the manual, and even more so when there is a clear description of that, and of the problem persisting after doing the basic steps, as in this case. So, is it a matter of personal pride, or of winning points with this forum or even with Hortonworks, based on the number of useless responses you submit? Not a single response here addresses the real problem the user is asking about, and most of them treat the person asking as a complete dummy. Please filter yourselves, read the submitted question carefully, and consider whether you are really contributing something valuable, or asking for feedback that matters for the "real" problem, before responding as a reflex. Otherwise we lower the quality of this forum and make people lose time reading the very same quotes from the (often incomplete) manuals again and again.
09-16-2018
12:41 AM
This article is really very useful, but it has a silly yet confusing error (especially for HDP newbies): all occurrences of "Ranger user id" and "Ranger Admin Server" should be replaced by "Atlas User ID" and "Atlas Admin Server" respectively.
09-13-2018
12:53 PM
At least for version 2.6.3 and above, the section "Running import script on kerberized cluster" is wrong. You don't need to provide any of the indicated options (properties), except perhaps the debug one if you want debug output, because they are automatically detected and included by the script. Also, at least in 2.6.5, a direct execution of the script on a Kerberized cluster fails because of the CLASSPATH generated inside the script. I had to edit it, replacing many individual JAR files with a glob on their parent folder, for the command to run without error. If you hit this problem, see my answer to the "Atlas can't see Hive tables" question.
09-13-2018
12:36 PM
I'm running HDP 2.6.5 and I have experienced (and SOLVED) a problem that may be related to this, and that in any case may help someone else with similar problems importing entities from Hive. My cluster is Kerberized, and when I run the import-hive.sh script as described in https://community.hortonworks.com/articles/61274/import-hive-metadata-into-atlas.html I get the following error:

$ /usr/hdp/2.6.5.0-292/atlas/hook-bin/import-hive.sh -Dsun.security.jgss.debug=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/atlas/conf/atlas_jaas.conf
Using Hive configuration directory [/etc/hive/conf]
Log file for import is /usr/hdp/2.6.5.0-292/atlas/logs/import-hive.log
Usage 1: import-hive.sh [-d <database> OR --database <database>] Imports specified database and its tables ...
...
Failed to import Hive Meta Data!!
To see what was happening I edited the import-hive.sh script and added an echo before the executed command:

echo "${JAVA_BIN}" ${JAVA_PROPERTIES} -cp "${CP}" org.apache.atlas.hive.bridge.HiveMetaStoreBridge $allargs

The executed Java command turns out to be something like this:

$ /usr/java/jdk1.8.0_152/bin/java -Datlas.log.dir=/usr/hdp/2.6.5.0-292/atlas/logs -Datlas.log.file=import-hive.log -Dlog4j.configuration=atlas-hive-import-log4j.xml \
-Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/atlas/conf/atlas_jaas.conf \
-cp ":<VERY-LONG-LIST-OF-JARS>:/usr/hdp/2.6.5.0-292/tez/lib/*:/usr/hdp/2.6.5.0-292/tez/conf" \
org.apache.atlas.hive.bridge.HiveMetaStoreBridge \
-Dsun.security.jgss.debug=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/atlas/conf/atlas_jaas.conf

From this I found out two things:

The property definition options indicated in the article for a Kerberized cluster ARE NOT NECESSARY, because they are automatically detected and included by the script from the HDP environment. You can see they end up duplicated at the end of the command.

MOST IMPORTANT: the classpath is malformed, because it repeats many JARs and lists every single JAR file in many Hadoop lib folders (including all the jars in hive-client/lib) instead of using a glob ("<folder>/lib/*"). This seems to hit some argument-length limit in Bash (or Java) and seems to be the reason the command fails.

By editing the CLASSPATH and replacing nearly one hundred individually listed JAR files with their parent folders plus a glob, I was able to drastically reduce the length of the executed command and run it without errors, as shown below:

$ /usr/java/jdk1.8.0_152/bin/java -Datlas.log.dir=/usr/hdp/current/atlas-server/logs -Datlas.log.file=import-hive.log -Dlog4j.configuration=atlas-hive-import-log4j.xml \
-Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/atlas/conf/atlas_jaas.conf \
-cp ':/usr/hdp/current/atlas-server/hook/hive/atlas-hive-plugin-impl/*:/usr/hdp/current/hive-client/conf:/usr/hdp/current/hive-client/lib/*:mysql-connector-java.jar:postgresql-jdbc3.jar:postgresql-jdbc.jar:/usr/hdp/2.6.5.0-292/hadoop/conf:/usr/hdp/2.6.5.0-292/hadoop/lib/*:/usr/hdp/2.6.5.0-292/hadoop/*:/usr/hdp/2.6.5.0-292/hadoop-hdfs/:/usr/hdp/2.6.5.0-292/hadoop-hdfs/lib/*:/usr/hdp/2.6.5.0-292/hadoop-hdfs/*:/usr/hdp/2.6.5.0-292/hadoop-yarn/lib/*:/usr/hdp/2.6.5.0-292/hadoop-yarn/*:/usr/hdp/2.6.5.0-292/hadoop-mapreduce/lib/*:/usr/hdp/2.6.5.0-292/hadoop-mapreduce/*:/usr/hdp/2.6.5.0-292/tez/*:/usr/hdp/2.6.5.0-292/tez/lib/*:/usr/hdp/2.6.5.0-292/tez/conf' \
org.apache.atlas.hive.bridge.HiveMetaStoreBridge
Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
Search Subject for SPNEGO INIT cred (<<DEF>>, sun.security.jgss.spnego.SpNegoCredElement)
Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
$

After this, all the entities from Hive are imported into Atlas. I hope this helps someone else, and that somebody at Hortonworks fixes the script and the related documentation.
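For illustration, this is the kind of edit I made inside import-hive.sh (a hypothetical sketch only: CP is the variable name used by the script as quoted above, and the directory list mirrors the shortened classpath shown in the working command):

CP="/usr/hdp/current/atlas-server/hook/hive/atlas-hive-plugin-impl/*:/usr/hdp/current/hive-client/conf"
for d in /usr/hdp/current/hive-client/lib \
         /usr/hdp/2.6.5.0-292/hadoop/lib /usr/hdp/2.6.5.0-292/hadoop \
         /usr/hdp/2.6.5.0-292/hadoop-hdfs/lib /usr/hdp/2.6.5.0-292/hadoop-hdfs \
         /usr/hdp/2.6.5.0-292/hadoop-yarn/lib /usr/hdp/2.6.5.0-292/hadoop-yarn \
         /usr/hdp/2.6.5.0-292/hadoop-mapreduce/lib /usr/hdp/2.6.5.0-292/hadoop-mapreduce \
         /usr/hdp/2.6.5.0-292/tez /usr/hdp/2.6.5.0-292/tez/lib; do
  CP="${CP}:${d}/*"   # one glob per lib folder instead of listing every single jar
done
CP="${CP}:/usr/hdp/2.6.5.0-292/hadoop/conf:/usr/hdp/2.6.5.0-292/tez/conf"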
09-13-2018
01:29 AM
I have experienced this problem after changing Ambari Server to run as a non-privileged user, "ambari-server". In my case I can see the following in the logs (/var/log/ambari-server/ambari-server.log):

12 Sep 2018 22:06:57,515 ERROR [ambari-client-thread-6396] BaseManagementHandler:61 - Caught a system exception while attempting to create a resource: Error occured during stack advisor command invocation: Cannot create /var/run/ambari-server/stack-recommendations

This error happens because in CentOS/RedHat /var/run is really a symlink to /run, which is a tmpfs filesystem mounted at boot time from RAM. So if I manually create the folder with the required privileges it won't survive a reboot, and because the unprivileged user running Ambari Server cannot create the required directory, the error occurs. I was able to (partially) fix this using the systemd-tmpfiles feature, by creating a file /etc/tmpfiles.d/ambari-server.conf with the following content:

d /run/ambari-server 0775 ambari-server hadoop -
d /run/ambari-server/stack-recommendations 0775 ambari-server hadoop -

With this file in place, running "systemd-tmpfiles --create" creates the folders with the required privileges. According to the following Red Hat documentation this should be run automagically at boot time to set everything up: https://developers.redhat.com/blog/2016/09/20/managing-temporary-files-with-systemd-tmpfiles-on-rhel7/ However, sometimes this doesn't happen (I don't know why) and I have to run the command manually to fix the error.
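To apply and check the entry without rebooting (paths as in the file above):

$ systemd-tmpfiles --create /etc/tmpfiles.d/ambari-server.conf
$ ls -ld /run/ambari-server /run/ambari-server/stack-recommendations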
09-12-2018
11:37 PM
In a cluster managed by Ambari, the Atlas admin password for the File authentication mode must be changed from inside Ambari, or it will be overwritten after a service restart. This value can be found under Configs -> Advanced -> Advanced atlas-env -> Admin password, as shown in the image below.
09-10-2018
07:04 PM
1 Kudo
The response from @Vadim Vaks doesn't address the original question and is misleading.

On one side, @Wendell Bu is not asking about the supported authentication methods in Atlas; he says clearly that he wants to use LDAP authentication and asks about Apache Atlas support for SSL/TLS when using LDAP authentication.

On the other side -- if the question were about the available user authentication methods in Apache Atlas -- the answer is still wrong, because the available user authentication methods are "File", "LDAP" and "Kerberos". These methods are configured with the three corresponding properties atlas.authentication.method.file = true/false, atlas.authentication.method.ldap = true/false and atlas.authentication.method.kerberos = true/false, plus their corresponding subtrees of properties. The mentioned values "simple|kerberos" are for the "Service Authentication Method", which concerns the identity used by the Atlas service itself to run and to interact with other cluster services. The confusion is understandable because the properties for that configuration have very similar names, atlas.authentication.[method/keytab/principal], but this has little to do with the question. For more details see the following links:

http://atlas.apache.org/0.8.1/Authentication-Authorization.html
http://atlas.apache.org/Security.html

Regarding the real question, this article seems to be the answer: https://community.hortonworks.com/articles/148958/steps-to-setup-atlas-with-ldaps-ssl.html It is not mentioned in the article, but you will also have to change your LDAP URL:

atlas.authentication.method.ldap.url = ldaps://authserv.ict4v.org:636

because it seems Apache Atlas (like most of the Hadoop components) doesn't fully support TLS yet, only the older LDAP over SSL (LDAPS).
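For completeness, a minimal sketch of the related entries in atlas-application.properties (the ldaps URL is the one from this thread; in an Ambari-managed cluster these would be set through the corresponding Atlas config section rather than by editing the file directly):

atlas.authentication.method.ldap = true
atlas.authentication.method.ldap.type = ldap
atlas.authentication.method.ldap.url = ldaps://authserv.ict4v.org:636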
08-31-2018
02:49 AM
Yes, you will have to use basically the same configuration as when using a combination of OpenLDAP and MIT KDC for authentication. The only difference is that you will be using AD as your LDAP server instead of OpenLDAP, and of course you will have to take into account the different schemas for users/groups (sAMAccountName vs uid, etc.).
08-31-2018
02:47 AM
If AD is not providing the Kerberos service to the hosts in the cluster (as stated in the question), then there is no chance of the user requesting any TGT from the AD KDC. In that case AD can only be used as an LDAP identity provider for users.
08-31-2018
02:41 AM
This article is very illustrative but, like most references on this topic, it does not address the problem of authenticating Linux users against AD on hosts with the Kerberos client configuration described earlier. For SSSD to work with AD you have to set up your hosts' default krb5.keytab to point to the AD principals (obtained when joining the host to the AD domain), and this conflicts with the configuration described in the article, where the host is supposed to be associated with the MIT KDC realm and default domain. How do we resolve this conflict?
08-31-2018
02:33 AM
This does not answer the question, only some topic related to it. As far as I understand, @tuxnet is asking whether he can use AD only as an LDAP provider (he states clearly that he does not have the option of using the Kerberos service from AD) and use ONLY an independent MIT KDC as the Kerberos authentication provider. In that case there is no one-way or cross-domain trust between the MIT KDC and AD involved. He will have to use an SSSD configuration for LDAP + Kerberos, using AD only as the identity/authorization provider and the KDC as the authentication/chpass provider, and he will have to do the cluster Kerberization using only the MIT KDC server. Also, the user credentials (passwords) for the cluster will be maintained only in the KDC (where the principals reside), so he will have to keep the passwords in sync between the two sources by some external means.
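A minimal sssd.conf sketch of the combination described (the domain name, server URIs and realm are example values; the exact options depend on your SSSD version):

[domain/example.com]
id_provider = ldap
auth_provider = krb5
chpass_provider = krb5
ldap_uri = ldap://ad.example.com
ldap_search_base = dc=example,dc=com
krb5_server = kdc.example.com
krb5_realm = HDP.COM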
08-31-2018
12:10 AM
1 Kudo
I'm trying to configure HDP security by integrating Active Directory for user authentication and an MIT Kerberos KDC for the HDP services. I have already read and followed these reference documents:

https://hortonworks.com/blog/enabling-kerberos-hdp-active-directory-integration/
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/kerb-oneway-trust.html

I've also checked the official HDP course on Operations Security and it is not of much help: the information on this topic is very partial and confusing (it mixes the KDC+AD case with the pure AD one) and only gives a real example for the pure AD case.

In order to configure user authentication on the HDP Linux nodes, I followed the SSSD AD with realmd docs (mostly from Red Hat) and, based on this, I used the AD domain (let's say AD.COM) as the domain for the nodes in the cluster. This works perfectly, but now, when I follow the referenced documents to Kerberize the cluster with the MIT KDC, it has to be configured with a different domain/realm, let's say HDP.COM. My krb5.conf with the combined realms is as follows (excluding irrelevant options), as suggested in the previous guides:

[libdefaults]
dns_lookup_realm = false
dns_lookup_kdc = false
rdns = false
default_realm = HDP.COM
udp_preference_limit = 1
...

[realms]
HDP.COM = {
kdc = kdc01.hdp.com
admin_server = kdc01.hdp.com
}
AD.COM = {
kdc = windc.ad.com
admin_server = windc.ad.com
}

[domain_realm]
.hdp.com = HDP.COM
hdp.com = HDP.COM
.ad.com = AD.COM
ad.com = AD.COM

I have also created/configured the cross-realm trust account krbtgt/HDP.COM@AD.COM both in the KDC and in AD, as described in the documents.

My problem with this approach is that my HDP nodes are configured with hostnames in the AD.COM domain/realm: ambari.ad.com, hdpmaster01.ad.com, hdpnode01.ad.com. I think that with the previous configuration, using HDP.COM as the default realm and mapping the domain of the HDP hosts to the realm AD.COM instead of HDP.COM, the Ambari Kerberization will not work, because the host principals will be mapped to Active Directory and not to the local Hadoop KDC. So my doubts and questions before proceeding with the Kerberization against the HDP.COM Unix MIT KDC server are:

Do I have to change the domains in the FQDNs of all the HDP nodes to the local KDC realm's domain instead of the Active Directory one?

Will SSSD with AD authentication keep working even when the hosts' domains don't belong to AD's domain? I know that in that case I would have to remove and re-register each host to AD using the new hostname.

Is there some way of making this work while keeping the Active Directory (ad.com) domain in the cluster's hosts, as I currently have?

I hope someone has faced these problems before and can give me some hints or suggestions. Thanks a lot in advance. Luis
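PS: for what it's worth, the quick sanity check I plan to use for the cross-realm trust account from a cluster node is the following (assuming a test user exists in AD; principal names follow the example realms above):

$ kinit testuser@AD.COM
$ kvno krbtgt/HDP.COM@AD.COM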
08-13-2018
01:45 AM
@Geoffrey Shelton Okot Thank you for your response. I agree with you that by default Zeppelin is designed to run as a single user (usually named zeppelin), but the official documentation for version 0.7.0 (the first link provided above) states that it includes support for USER IMPERSONATION (as opposed to the user-proxy settings used in connection-oriented interpreters like %jdbc(hive) or %livy) for some execution-environment interpreters like %sh and %spark/%spark2. This approach is more or less well documented, and it works OK with local or centralized users "without Kerberos", by using the ZEPPELIN_IMPERSONATE_USER variable (defined at login time) and the ZEPPELIN_IMPERSONATE_CMD hook in the zeppelin-env.sh file (and also setting ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false so that Spark does not use a proxy user in this case).

The problem is that this approach doesn't fit well with LDAP/Kerberos (I really don't know if it works with direct Active Directory login into Zeppelin), because the LDAP login doesn't use Kerberos (or at least I haven't found how to make it do so), and for this reason the impersonated user doesn't have a valid ticket when the zeppelin user switches to it (using sudo) to launch the shell interpreters. From what I have checked in the Apache Shiro documentation for Zeppelin 0.7.0, there is no support for direct authentication against Kerberos yet (only LDAP, AD or PAM), so it seems this problem cannot be solved at the moment, except maybe by using AD rather than LDAP+Kerberos.

I will test Spark support using %livy with a proxy-user configuration, and this may be a partial workaround; but unfortunately it is not a fully satisfactory solution for my deployment, because my users need to be able to run shell commands like "hdfs ...", "yarn ...", etc. from the shell interpreter as their authenticated users (not the zeppelin one), and currently this does not seem to be possible with this Zeppelin version on a Kerberized cluster.
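For reference, a sketch of the zeppelin-env.sh entries referred to above (the sudo form shown is just one common variant; check the linked documentation for the exact value used by your version):

export ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c'
export ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false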
08-12-2018
08:31 PM
1 Kudo
I'm trying to make the Zeppelin notebook run as the logged-in user for the %sh and %spark interpreters, using centralized users provided by combining LDAP + Kerberos with SSSD. I was able to make this work in a NON-Kerberized cluster by following the steps suggested in these links:

https://zeppelin.apache.org/docs/0.7.0/manual/userimpersonation.html
https://community.hortonworks.com/articles/99730/how-to-enable-user-impersonation-for-sh-interprete-1.html

However, this doesn't work in a Kerberized cluster with user identity/authentication handled by SSSD from LDAP + Kerberos. The "hack" used in Zeppelin to run as the requested user is to put the "zeppelin" user in the sudoers and define the variable ZEPPELIN_IMPERSONATE_CMD so that it prepends sudo su - ${ZEPPELIN_IMPERSONATE_USER} bash -c to the execution of the interpreter. The problem is that the initial login is done against LDAP, so no Kerberos ticket is issued; later, by using "sudo" from a privileged user you switch to the requested user, but since you are not providing any password you never hit the "authentication" stage of SSSD, and so the "kinit" needed to contact the Kerberos KDC and obtain the user's TGT (ticket-granting ticket) never happens. For this reason local commands on the Linux host running Zeppelin will work, but if you try to execute any command against the Kerberized cluster from the %sh interpreter, for example "hdfs dfs -ls" or "yarn application -list", IT WILL FAIL, telling you that you don't have the required TGT:

javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

The same happens with the %spark2 interpreter, for the very same reason. The same problem occurs if you log into the edge server as root and do a "su - <user>" to a regular user, but in that case I can execute "kinit" manually and provide the credentials to get the Kerberos tickets; after this everything works as expected.

The only fix I was able to find is to ask the user to log in via ssh to the Zeppelin server, in order to provide their password and obtain the TGT. After this, Zeppelin impersonation works (once the ticket has been obtained, any session shares it). I guess this might work if the login to Zeppelin (using Apache Shiro) were done against Kerberos instead of LDAP, because that would trigger the required kinit, but I was not able to find any documentation on how to do that. Does anybody know how to make this work when using Kerberos with LDAP and an MIT KDC (not AD)? Best regards.
08-12-2018
08:21 PM
This will not work in a Kerberized cluster (at least with SSSD using LDAP + MIT Kerberos), because user impersonation in Zeppelin is implemented by doing a "sudo su - <user>" to the requested user. In this case the whoami above will work fine, but if you try to execute any command involving the Hadoop cluster, for example "hdfs dfs", it will fail because your user doesn't have the required Kerberos ticket obtained with "kinit". The kinit is done automatically when you log in on the console, or via ssh, providing the user's password (the Linux host contacts the Kerberos server and gets the ticket). But when using "su" from root, or "sudo su" from a user in the sudoers, you are not providing the user's password, and for this reason you have to execute kinit and enter the user's password manually to get the Kerberos ticket. This is not possible with Zeppelin because it doesn't provide an interactive console.
08-08-2018
01:01 AM
We don't know anything about the type of access you are going to have to your cluster and the related security concerns; but as a general recommendation I think it would be better to have the management and edge nodes (green) in an external DMZ LAN, separated from the internal HDP/Hadoop network (masters + workers). You should also co-locate any KDC/AD and/or database server used by the HDP cluster in this internal network.
08-06-2018
02:50 AM
After some tests and reading the official Hive documentation I'm answering this myself. Both sources are incomplete and confusing, and I guess it is because they mix the configuration required for Hive 0.13.x with the one for Hive 0.14 and later (which is what HDP 2.5.x and above use). After changing the authorization to SQLStdAuth and setting "Run as end user instead of Hive user" (hive.server2.enable.doAs) to false, you have to do the following.

In Custom hive-site: add the user you want to use as Hive administrator, for example admin, to the default list of users with the admin role:

hive.users.in.admin.role = hive,hue,admin

In hive-site.xml, corresponding to the General and Advanced hive-site sections in Ambari, check that you have the following settings:

# General section:
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
# Need to add the second class to the comma-separated list
hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly
# Advanced hive-site section:
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

In hiveserver2-site.xml, corresponding to Advanced hiveserver2-site in Ambari:

hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory

Note that the classes used as "authorization.manager" in hive-site and in hiveserver2-site have similar names but are different: the first one is SQLStdConfOnlyAuthorizerFactory and the second SQLStdHiveAuthorizerFactory. Ambari will guide you through some of these settings once you select SQLStdAuth authorization, but this is the complete picture of what is needed. For further reference check: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
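A quick way to verify the result once HiveServer2 has been restarted (the JDBC URL and the admin user name are example values; SET ROLE ADMIN only succeeds for users listed in hive.users.in.admin.role):

$ beeline -u "jdbc:hive2://hiveserver-host:10000/default" -n admin -e "SET ROLE ADMIN; SHOW CURRENT ROLES;"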
08-05-2018
11:19 PM
I also want to add that the "Data Access" manual refers to hive-site, while the HW support article talks about settings in hiveserver2-site. From what I was able to find out, the first file (hive-site in Ambari) corresponds to /etc/hive/conf/hive-site.xml and the second (hiveserver2-site in Ambari) corresponds to /etc/hive/conf/conf.server/hiveserver2-site.xml, and many of the authorization/authentication parameters are repeated but with different values. There is also another file named hive-site.xml inside the conf.server folder, but it seems to have almost the same content as the one in /etc/hive/conf, except for a couple of credential-store parameters. What a mess this Hive configuration is!
08-05-2018
09:54 PM
I'm trying to find out how to configure SQL-based authorization in Hive (HDP 2.6.5), but I have found two official sources with contradictory information. On one side, the HDP 2.6.5 Data Access Manual, in the "Securing Apache Hive" chapter, instructs you to set these class properties (among other configuration changes):

hive.security.authorization.manager = org.apache.hadoop.hive.ql.security.authorization.plugin.sql
hive.security.authenticator.manager = org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator

On the other side, I have found the official Hortonworks support article "How to Setup SQL Based authorization in Hive With an example" in this community forum, which states that you have to apply a different set of configurations, including these properties with different authorization and authentication classes:

hive.security.authorization.manager = org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
hive.security.authenticator.manager = org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

Which of these two documentation sources should I trust, and which is more appropriate or correct for enabling SQL-based authorization in Hive?
07-19-2018
12:43 AM
This is a useful article, but it would be better if it explained what the main configuration options actually do, instead of listing the creator's interpretation of the best use case for each combination. By knowing what each of these few options does, or how they affect the matching of users and groups from LDAP, I'm pretty sure most of us IT professionals would be able to work out which combination is most appropriate for our own use case. Indeed, that is a recurring problem with the Ranger documentation in HDP, and with many other aspects of the security components: you usually find a "subjective" interpretation of which combination of settings is best for this or that scenario, but an objective description of how each option behaves is much harder to find, and sometimes the only way to find out is to go to the source code.
05-04-2018
05:45 PM
Same problem here when adding an extra node (after the initial install) with HDP 2.6 and WITHOUT two-way SSL enabled in Ambari. And also the same solution: adding the following property to /etc/ambari-agent/conf/ambari-agent.ini after the install and registration failure, and restarting the agent process:

force_https_protocol=PROTOCOL_TLSv1_2
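For clarity, a sketch of the resulting file and the restart (as far as I can tell, the property goes in the [security] section of the agent configuration):

# /etc/ambari-agent/conf/ambari-agent.ini
[security]
force_https_protocol=PROTOCOL_TLSv1_2

$ ambari-agent restart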
01-18-2018
03:15 PM
The configuration for Kafka with SSL is as follows (the ports are the usual ones and "localhost" is not needed):

listeners = SSL://:9093,SASL_SSL://:9094

SASL_xxxx means using Kerberos SASL with plaintext or SSL (depending on xxxx). So if you are using the configuration above, your Kafka broker is not using SSL and your clients don't need (or cannot) use SSL. If you want to enable SSL you will also need to add the following parameters to your Custom kafka-broker section:

ssl.keystore.location = /etc/security/serverKeys/keystore.jks
ssl.keystore.password = **keysecret**
ssl.key.password = **keysecret**
ssl.truststore.location = /etc/security/serverKeys/truststore.jks
ssl.truststore.password = changeit
ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1
sasl.kerberos.service.name = kafka

and use the corresponding configuration (except for the keystore, which is not needed on clients) for your producer and consumer clients. If you need data protection, it is also recommended to use SSL for inter-broker communication, so you should set this too:

security.inter.broker.protocol = SASL_SSL (or SSL)
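As a usage sketch, a minimal SSL client configuration for the console tools (the paths, password and broker host are example values):

# /tmp/client-ssl.properties
security.protocol=SSL
ssl.truststore.location=/etc/security/clientKeys/truststore.jks
ssl.truststore.password=changeit

$ /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list broker1.example.com:9093 --topic test --producer.config /tmp/client-ssl.properties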
01-18-2018
03:02 AM
Just in case this saves time for other people: the configuration included with HDP 2.5.x and Ambari 2.5 does not support using Ranger Tagsync with SSL, so there is no "Advanced ranger-tagsync-policymgr-ssl" section or anything like that in the Ranger (0.6.0) configuration in Ambari. The first response above refers to the parameters included in the file ranger-tagsync-policymgr-ssl.xml shipped with Ambari 2.6 (and, I believe, HDP 2.6.x). This is included in the patch discussed at the following URL: https://issues.apache.org/jira/browse/AMBARI-18874

There is an Ambari patch for HDP 2.5, but I was not able to make it work with Ambari 2.5 and Ranger 0.6.0 (included with HDP 2.5.6), so the way to make it work was to add the file /etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml from the patch above, edited by hand, and then, in the "Advanced ranger-tagsync-site" section, modify the parameter ranger.tagsync.dest.ranger.ssl.config.filename, which incredibly (and shamefully) points to a keystore in the default HDP 2.5 configuration, so that it points to this file instead:

ranger.tagsync.dest.ranger.ssl.config.filename=/etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml

After this you will also need to change the credential store file rangertagsync.jceks to include the keys ssltruststore and sslkeystore with the correct values; there are other articles on how to do this. Hopefully in HDP 2.6 things will be easier 😞
12-09-2017
08:59 PM
1 Kudo
Finally, after writing out the problem to post the question, I was able to find the cause myself, and I will describe it in case it happens to someone else. The problem was a couple of extra properties not documented here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_security/content/configure_ambari_ranger_ssl_public_ca_certs_admin.html These properties define the truststore used by the Ranger Admin service when connecting to other services. They are located at the end of the "Advanced ranger-admin-site" section, as shown here, and should be changed to point to your system truststore (including the CA certificate used to sign the Hadoop services' certificates): Ranger -> Advanced ranger-admin-site.

So, in order to make HTTPS and SSL work for Ranger Admin and the Ranger plugins in both directions, you have to set all of the following fields correctly, pointing to the proper keystore (containing the private key) or truststore (containing the signing CA or the certificate of the service you are going to connect to):

In Ranger -> Advanced ranger-admin-site:
ranger.https.attrib.keystore.file = /etc/security/serverKeys/keystore.jks
ranger.service.https.attrib.keystore.pass = ******
... other ranger.service.https.* related properties
// Not documented in the Security manual
ranger.truststore.file = /etc/security/serverKeys/truststore.jks
ranger.truststore.password = *******

In Ranger -> Advanced ranger-admin-site (this seems to be the same property as above, so I suspect the two come from different software versions and only one is necessary, but both are mentioned in the documentation, so who knows?):
ranger.service.https.attrib.keystore.file = /etc/security/serverKeys/keystore.jks

In Service (HDFS/YARN) -> Advanced ranger-hdfs-policymgr-ssl (also set the properties in Advanced ranger-hdfs-plugin-properties to match the certificate common name):
// Keystore with the client certificate cn=hadoopclient,...
xasecure.policymgr.clientssl.keystore = /etc/security/clientKeys/hadoopclient.jks
...
xasecure.policymgr.clientssl.truststore = /etc/security/clientKeys/truststore.jks
...
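A quick way to double-check that the right material is inside each store before restarting the services (keytool prompts for the store passwords):

$ keytool -list -keystore /etc/security/serverKeys/truststore.jks
$ keytool -list -keystore /etc/security/clientKeys/hadoopclient.jks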