Member since
08-29-2016
40
Posts
5
Kudos Received
2
Solutions
10-19-2020
05:00 AM
Try the steps below. Sometimes the Ambari cluster-env variable security_enabled still holds the value true, in which case all services expect keytabs. To validate the current value of the variable:
/var/lib/ambari-server/resources/scripts/configs.py -a get -l <ambari-server host> -t 8080 -n <cluster-name> -u <admin-user> -p <admin-password> -c cluster-env | grep security
"security_enabled": "true",
"smokeuser_keytab": "/etc/security/keytabs/smokeuser.headless.keytab"
If it still shows true, try setting that variable back to false:
/var/lib/ambari-server/resources/scripts/configs.py -a set -k security_enabled -v false -l <ambari-server host> -t 8080 -n <cluster-name> -u <admin-user> -p <admin-password> -c cluster-env
11-20-2019
04:25 AM
Rightly said. The SSSD configuration change above will help here. Along with that SSSD change and restart, don't forget to restart the Hadoop daemons involved (NodeManager, etc.). This is needed to rebuild the in-memory cache, which holds the UID -> username mapping for up to 4 hours without invalidation [1].
[1]: Ref: org.apache.hadoop.fs.CommonConfigurationKeys
----
public static final String HADOOP_SECURITY_UID_NAME_CACHE_TIMEOUT_KEY =
"hadoop.security.uid.cache.secs";
public static final long HADOOP_SECURITY_UID_NAME_CACHE_TIMEOUT_DEFAULT =
4*60*60; // 4 hours
----
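The same constant shows the timeout is configurable. If the 4-hour window is too coarse for your environment, a minimal sketch of lowering it in core-site.xml (the 1-hour value below is purely illustrative, not a recommendation):
----
<property>
  <!-- Seconds that Hadoop daemons cache UID/GID -> name lookups; default is 14400 (4 hours) -->
  <name>hadoop.security.uid.cache.secs</name>
  <value>3600</value>
</property>
----
A restart of the affected daemons is still required for the new value to take effect.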
07-22-2018
02:37 PM
Recently I came across this command-line LDAP connection check tool, which is very useful while setting up Ranger Usersync. The tool collects minimal input from the admin about the LDAP/AD server and discovers various properties for users and groups in order to successfully pull only the targeted users and groups from the LDAP/AD server.
Details
The LDAP connection check tool is a command-line tool and can be run on any machine where Java is installed and the LDAP/AD server is reachable. The tool can be used to discover not only usersync-related properties but also authentication properties if needed. It also generates Ambari configuration properties as well as install properties for manual installation. The user is also given the option to discover the user and group properties together or separately. A template properties file is provided as part of the tool so the user can update the values specific to their setup.
Tool usage
The tool provides a “help” option (-h) that describes its usage:
usage: run.sh
 -a         ignore authentication properties
 -d <arg>   {all|users|groups}
 -h         show help
 -i <arg>   input file name
 -o <arg>   output directory
 -r <arg>   {all|users|groups}
All of the above parameters are optional.
If “-i” (input file) is not specified, the tool falls back to the CLI to collect values for the mandatory properties.
If “-o” (output directory) is not specified, the tool writes all output files to the <install dir>/ranger-0.5.0-usersync/ldaptool/output directory.
If “-a” (ignore authentication) is not specified, the tool discovers and verifies authentication-related properties.
If “-d” (discover usersync properties) is not specified, the tool defaults to discovering all the usersync-related properties.
If “-r” (retrieve users and/or groups) is not specified, the tool falls back to the “-d” option.
Example Input properties
In order to discover the usersync and authentication related properties, the tool collects some mandatory information as part of the input properties. These mandatory properties include:
1. ranger.usersync.ldap.url (<ldap or ldaps>://<server ip/fqdn>:<port>)
2. ranger.usersync.ldap.binddn (LDAP bind user, such as an AD user or LDAP admin user)
3. ranger.usersync.ldap.bindpassword (bind user password or LDAP admin password)
4. ranger.usersync.ldap.user.searchbase (mandatory only for non-AD environments)
5. ranger.usersync.ldap.user.searchfilter (mandatory only for non-AD environments)
6. ranger.admin.auth.sampleuser (mandatory only for discovering authentication properties)
7. ranger.admin.auth.samplepassword (mandatory only for discovering authentication properties)
The tool provides two options for collecting values for these mandatory properties:
1. Modify the input.properties file provided as part of the tool installation and pass that file (with its complete path) as the command-line argument while running the tool.
2. Use the CLI to enter the values for these mandatory properties. The CLI option is presented when an input file is not provided via the command-line option (-i <arg>). Once the values are collected from the CLI, they are stored in the input.properties file (in the conf dir of the installation folder) for later use.
The following is the CLI presented by the tool when no input file is specified:
Ldap url [ldap://ldap.example.com:389]:
Bind DN [cn=admin,ou=users,dc=example,dc=com]:
Bind Password:
User Search Base [ou=users,dc=example,dc=com]:
User Search Filter [cn=user1]:
Sample Authentication User [user1]:
Sample Authentication Password:
Note: In order to use secure LDAP, the Java default truststore must be updated with the server’s self-signed certificate or the CA certificate used for validating the server connection. The truststore should be updated before running the tool.
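For reference, a minimal input.properties sketch, assuming the tool's standard key=value properties format and reusing the sample values the CLI shows as defaults; replace every placeholder with values from your own environment:
----
# LDAP/AD server and bind credentials
ranger.usersync.ldap.url=ldap://ldap.example.com:389
ranger.usersync.ldap.binddn=cn=admin,ou=users,dc=example,dc=com
ranger.usersync.ldap.bindpassword=<bind-password>
# Mandatory only for non-AD environments
ranger.usersync.ldap.user.searchbase=ou=users,dc=example,dc=com
ranger.usersync.ldap.user.searchfilter=cn=user1
# Mandatory only when discovering authentication properties
ranger.admin.auth.sampleuser=user1
ranger.admin.auth.samplepassword=<sample-password>
----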
07-17-2018
11:31 AM
5 Kudos
Tokens are wire-serializable objects issued by Hadoop services that grant access to those services. Some services issue tokens to callers, which the callers then use to interact directly with other services without involving the KDC at all.
Block Tokens
A BlockToken is the token issued for access to a block; it includes: (userId, (BlockPoolId, BlockId), keyId, expiryDate, access-modes)
Block Keys
The key used for generating and verifying block tokens. Block Keys are managed in the BlockTokenSecretManager; there is one in the NN and another in every DN, tracking the block keys to which it has access.
How this works:
1. The client asks the NN for access to a path, identifying itself via Kerberos or a delegation token.
2. The client talks to the DNs holding the block, using the Block Token.
3. The DN authenticates the Block Token using its shared secret with the NameNode.
4. If authenticated, the DN compares the permissions in the Block Token with the operation requested, then grants or rejects the request.
The client does not have its identity checked by the DNs; that is done by the NN. This means that the client can, in theory, pass a Block Token on to another process for delegated access to a single block. These HDFS Block Tokens do not contain any specific knowledge of the principal running the DataNodes; instead they declare that the caller has the stated access rights to the specific block, up until the token expires.
public class BlockTokenIdentifier extends TokenIdentifier {
static final Text KIND_NAME = new Text("HDFS_BLOCK_TOKEN");
private long expiryDate;
private int keyId;
private String userId;
private String blockPoolId;
private long blockId;
private final EnumSet<AccessMode> modes;
private byte [] cache;
...
To enable block access tokens on the NameNode, configure the following settings in the hdfs-site.xml file:
dfs.block.access.token.enable=true
dfs.block.access.key.update.interval=600 (minutes; this is the default)
dfs.block.access.token.lifetime=600 (minutes; this is the default)
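In hdfs-site.xml property form, the same settings look like the following (the values shown are the defaults listed above):
----
<property>
  <!-- Require block access tokens between clients and DataNodes -->
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>
  <!-- How often the NameNode rotates block keys, in minutes -->
  <name>dfs.block.access.key.update.interval</name>
  <value>600</value>
</property>
<property>
  <!-- Lifetime of an issued block token, in minutes -->
  <name>dfs.block.access.token.lifetime</name>
  <value>600</value>
</property>
----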
General Error Seen:
2015-09-22 12:55:48,271 WARN [regionserver60020-smallCompactions-1432895622947] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x1102b41c): could not load 1074240550_BP-607492251-xxx.xxx.xxx.xxx-1427711497172 due to InvalidToken exception.
org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to /apps/hbase/data/data/default/blah/b83abaf5631c4ce18c9da7eaf569bb3b/t/bbb2436ed50e471e8645f8bd402902e3Block token with block_token_identifier (expiryDate=1442911790388, keyId=286785309, userId=hbase, blockPoolId=BP-607492251-xx.xx.xx.xx-1427711497172, blockId=1074240550, access modes=[READ]) is expired.
Root Cause: The block access token has expired and become invalid.
Another example:
2018-07-15 17:49:25,649 WARN datanode.DataNode (DataXceiver.java:checkAccess(1311)) -
Block token verification failed: op=WRITE_BLOCK, remoteAddress=/10.10.10.100:0000,
message=Can't re-compute password for block_token_identifier (expiryDate=1501487365624,
keyId=127533694, userId=RISHI, blockPoolId=BP-2019616911-10.211.159.22-1464205355083,
blockId=1305095824, access modes=[WRITE]), since the required block key (keyID=127533694)
doesn't exist.
Root Cause: This is seen when a client connection fails because the client has presented a block access token that references a block key that does not exist on the DataNode. To resolve this, restart the DataNode.
06-30-2018
11:46 AM
Hadoop relies heavily on DNS and performs many DNS lookups during normal operation. To reduce the load on your DNS infrastructure, it is highly recommended to use the Name Service Caching Daemon (NSCD) on cluster nodes running Linux, or another DNS caching mechanism (e.g. dnsmasq). The daemon caches host, user, and group lookups, providing better resolution performance and reduced load on the DNS infrastructure. The default cache size and TTL values are good enough to reduce the load significantly; however, you might need to tweak them for your environment.
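As a rough illustration, the host-cache section of /etc/nscd.conf exposes the knobs mentioned above; the values here are only placeholders to show what can be tuned, not recommendations:
----
# /etc/nscd.conf (hosts section only)
enable-cache            hosts   yes
positive-time-to-live   hosts   3600    # seconds a successful lookup is cached
negative-time-to-live   hosts   20      # seconds a failed lookup is cached
suggested-size          hosts   211     # internal hash table size
persistent              hosts   yes     # keep the cache across nscd restarts
----
Restart nscd (e.g. service nscd restart) after changing the file.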
05-30-2018
07:33 PM
1 Kudo
The Key Distribution Center (KDC) is available as part of the domain controller and provides two key services: the Authentication Service (AS) and the Ticket-Granting Service (TGS).
By default the KDC requires all accounts to use pre-authentication. This is a security feature which offers protection against password-guessing attacks. The AS request identifies the client to the KDC in plain text. If pre-authentication is enabled, a timestamp is encrypted using the user's password hash as the encryption key. If the KDC reads a valid time when using the user's password hash (which is available in Active Directory) to decrypt the timestamp, the KDC knows the request isn't a replay of a previous request.
When you do not enforce pre-authentication, a malicious attacker can directly send a dummy request for authentication. The KDC will return an encrypted TGT and the attacker can brute-force it offline. Upon checking the KDC logs, nothing will be seen except a single request for a TGT.
When Kerberos timestamp pre-authentication is enforced, the attacker cannot directly ask the KDC for encrypted material to brute-force offline. The attacker has to encrypt a timestamp with a password and offer it to the KDC, and can repeat this over and over; however, the KDC log will record an entry every time the pre-authentication fails. Hence you should never disable pre-authentication in Kerberos.
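As an illustration of enforcing this on an MIT KDC (on Active Directory the equivalent is simply leaving the account's "Do not require Kerberos preauthentication" option unchecked); the principal name below is a placeholder:
----
# Verify whether pre-authentication is required for a principal
kadmin.local -q "getprinc alice@EXAMPLE.COM"
# Look for REQUIRES_PRE_AUTH in the "Attributes:" line of the output

# Enforce pre-authentication on that principal
kadmin.local -q "modify_principal +requires_preauth alice@EXAMPLE.COM"
----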
05-13-2018
08:46 AM
Cannot run Livy jobs from Zeppelin. The following can be seen in the Livy log file:
INFO: Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://rangerkms.example.com:9292/kms/v1/?op=GETDELEGATIONTOKEN&doAs=w20524%4542test.tt&renewer=rm%2Fsktudv01hdp02.test.tt%40CCTA.DK&user.name=livy, status: 403, message: Forbidden
Root Cause: Missing proxy user configuration for livy in KMS.
The solution is to add the following in Ambari under Custom kms-site:
hadoop.kms.proxyuser.livy.users=*
hadoop.kms.proxyuser.livy.hosts=*
hadoop.kms.proxyuser.livy.groups=*
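If you manage kms-site.xml directly rather than through Ambari, the equivalent XML entries would look like the sketch below:
----
<property>
  <!-- Allow the livy user to impersonate these users via KMS -->
  <name>hadoop.kms.proxyuser.livy.users</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.kms.proxyuser.livy.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.kms.proxyuser.livy.groups</name>
  <value>*</value>
</property>
----
Restart Ranger KMS after the change.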
07-10-2018
03:19 PM
Hi Rishi, I have been trying to configure SSL using the above steps. I am getting an error on the 3rd step:
# keytool -import -file zeppelin.crt -keystore zeppelin-keystore.jks
Enter keystore password:
keytool error: java.io.FileNotFoundException: zeppelin.crt (No such file or directory)
I have noticed that in the 2nd step the "zeppelin.csr" certificate request is created, while in the 3rd step we are importing "zeppelin.crt". Do we need to perform any other steps before the 3rd step to convert the certificate from .csr to .crt?
Also, I tried creating the certificate with the ".crt" name in the 2nd step and importing it in the 3rd step as below, but I get a different error:
# keytool -import -file zeppelin.crt -keystore zeppelin-keystore.jks
Enter keystore password:
keytool error: java.lang.Exception: Input not an X.509 certificate
Could you please help? Thanks.
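(Side note on the .csr vs .crt gap above: a CSR is only a signing request, not a certificate, so it must either be signed by a CA or self-signed before the import. A minimal self-signing sketch, assuming the CSR was generated with OpenSSL and the private key file is zeppelin.key (a hypothetical name, not from the original steps):)
----
# Self-sign the CSR to produce an X.509 certificate (file names are assumptions)
openssl x509 -req -days 365 -in zeppelin.csr -signkey zeppelin.key -out zeppelin.crt
# Re-run the import once zeppelin.crt exists
keytool -import -alias zeppelin -file zeppelin.crt -keystore zeppelin-keystore.jks
----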
01-18-2018
07:26 PM
1 Kudo
Ranger plugins send their audit events (whether access was granted or denied, based on the policy) directly to the configured audit sink, which can be HDFS, Solr, or both.
Ranger Audit is a highly customizable event queue system that can be tailored to suit the needs of production environments.
When the plugin is enabled and no specific policy is in place for access to some object, the plugin falls back to enforcing the standard component-level Access Control Lists (ACLs). For HDFS, that would be the user: rwx / group: rwx / other: rwx ACLs on folders and files.
Once this defaulting to component ACLs happens, the audit events show a ‘-’ in the ‘Policy ID’ column instead of a policy number. If a Ranger policy was in control of allowing/denying, the policy number is shown.
Key Things to Remember
Access decisions taken by Ranger (to allow/ deny user) are based on a combination of three things:
resource - that is being accessed
user/group - who is trying to access
operation - that is being performed
The audit decision taken by Ranger (whether to audit or not) is based on a matching resource. That is, if there is a policy that enables auditing for a certain resource, the audit will be performed irrespective of whether that policy governs the access decision or not.
Now, based on #1 and #2 above, depending on the policy configuration, it is very much possible that the access decision is taken by policy X, but the audit decision is taken by policy Y.
Note: It may seem confusing that audit events show policy X in the Policy ID column even though auditing is disabled for X. Remember that the Policy ID column reflects the access decision, while the audit decision may come from another policy.
How to Troubleshoot Ranger Audit issue?
Enable Ranger plugin debug logging and restart the affected service to get to the root cause of the error.
To see more granular behavior, enable debug logging for the policy engine and policy evaluator as follows:
Example:
The following log4j lines will change based on the host service and log4j module used in that service:
log4j.logger.org.apache.ranger.authorization.hbase=DEBUG
log4j.logger.org.apache.ranger.plugin.policyengine=DEBUG
log4j.logger.org.apache.ranger.plugin.policyevaluator=DEBUG
07-03-2017
12:30 PM
1 Kudo
@Rishi Currently, if your cluster is not Kerberized, any user can simply export the HADOOP_USER_NAME variable and perform any activity; there is no way to restrict that.
For example :
[kunal@s261 ~]$ hdfs dfs -ls /mapred
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2017-04-24 11:33 /mapred/system
[kunal@s261 ~]$ hdfs dfs -ls /mapred/system
[kunal@s261 ~]$
[kunal@s261 ~]$
[kunal@s261 ~]$
[kunal@s261 ~]$ hdfs dfs -rmr /mapred/system
rmr: DEPRECATED: Please use 'rm -r' instead.
17/04/26 14:30:56 WARN fs.TrashPolicyDefault: Can't create trash directory: hdfs://s261.openstacklocal:8020/user/kunal/.Trash/Current/mapred
org.apache.hadoop.security.AccessControlException: Permission denied: user=kunal, access=WRITE, inode="/user/kunal/.Trash/Current/mapred":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
Then if you export the above variable, you can delete the directory:
[kunal@s261 ~]$ export HADOOP_USER_NAME=hdfs
[kunal@s261 ~]$
[kunal@s261 ~]$
[kunal@s261 ~]$ hdfs dfs -rmr /mapred/system
rmr: DEPRECATED: Please use 'rm -r' instead.
17/04/26 14:31:15 INFO fs.TrashPolicyDefault: Moved: 'hdfs://s261.openstacklocal:8020/mapred/system' to trash at: hdfs://s261.openstacklocal:8020/user/hdfs/.Trash/Current/mapred/system
The only way to fix this is to set up Kerberos; once Kerberos is enabled, even if you export the variable, the user is derived from the Kerberos principal:
[root@krajguru-e1 ~]# kinit kunal
Password for kunal@LAB.HORTONWORKS.NET:
[root@krajguru-e1 ~]#
[root@krajguru-e1 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: kunal@LAB.HORTONWORKS.NET
Valid starting Expires Service principal
07/03/2017 12:24:39 07/03/2017 22:24:39 krbtgt/LAB.HORTONWORKS.NET@LAB.HORTONWORKS.NET
renew until 07/10/2017 12:24:34
[root@krajguru-e1 ~]#
[root@krajguru-e1 ~]# hdfs dfs -ls /mapred/
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2017-04-21 11:47 /mapred/system
[root@krajguru-e1 ~]#
[root@krajguru-e1 ~]# export HADOOP_USER_NAME=hdfs
[root@krajguru-e1 ~]#
[root@krajguru-e1 ~]# hdfs dfs -rmr /mapred/system
rmr: DEPRECATED: Please use 'rm -r' instead.
17/07/03 12:25:11 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
rmr: Failed to move to trash: hdfs://e1.openstacklocal:8020/mapred/system: Permission denied: user=kunal, access=WRITE, inode="/mapred/system":mapred:hdfs:drwxr-xr-x