Member since: 09-11-2015
Posts: 41
Kudos Received: 45
Solutions: 14

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 632 | 02-03-2017 09:39 AM |
 | 591 | 01-31-2017 12:41 PM |
 | 880 | 01-20-2017 12:38 PM |
 | 1626 | 01-18-2017 01:26 PM |
 | 2013 | 01-11-2017 02:35 PM |
07-02-2018
09:52 PM
Hi @Ivan Diaz Unfortunately, as of Ambari 2.6.2.2 / HDP 2.6.5.0, whilst both Ambari and Hive support PAM authentication, the Ambari Hive View does not support this authentication scheme. It is a feature that is being looked at for a future release, but there is no timescale at present.
09-27-2017
08:47 AM
7 Kudos
Since Ranger 0.5 it has been possible to summarize audit events that differ only by timestamp, to reduce the number of events logged on a busy system. When enabled, if a Ranger plugin logs consecutive audit events that differ only by timestamp it will coalesce all such events into a single event, setting 'event_count' to the number of events logged and 'event_dur_ms' to the time difference in milliseconds between the first and last event. To enable this feature you must set the following properties in the Ranger plugin's configuration:

Configuration name | Notes |
---|---|
xasecure.audit.provider.summary.enabled | Set this to true to enable summarization; audit messages are then summarized before they are sent to the various sinks. By default it is false, i.e. audit summarization is disabled. |
xasecure.audit.provider.queue.size | If unspecified this defaults to 1048576, i.e. the queue is sized to store 1M (1024 * 1024) messages. Note that, despite the difference in property name, this controls the size of the summary queue. |
xasecure.audit.provider.summary.interval.ms | The maximum time interval over which messages are summarized. If unspecified it defaults to 5000, i.e. 5 seconds. |
Summarization batch size | Regardless of the time interval above, at most 100k messages at a time are considered for aggregation while summarizing. If more than 100k messages are logged during the interval, similar messages can therefore show up as multiple summarized audit messages even though they were logged within the configured time interval. This value of 100k is not currently user-configurable; it is mentioned here for a better understanding of the summarization logic. |

More details can be found here: Ranger 0.5 Audit log summarization
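For example, to turn summarization on for a plugin while keeping the default queue size and interval, the properties (set in that plugin's Ranger audit configuration, e.g. via Ambari) would look something like the following - a minimal sketch, with the values taken from the defaults above:
xasecure.audit.provider.summary.enabled=true
xasecure.audit.provider.queue.size=1048576
xasecure.audit.provider.summary.interval.ms=5000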
Tags: administration, audit, FAQ, Ranger, ranger-audit, Security
09-19-2017
10:29 PM
Technically, step 3 and step 4 are mutually exclusive. If you're using the Java cacerts then you don't need to set up a separate truststore for Ranger, and vice versa. If doing step 3, make sure you update the correct Java cacerts: the Ranger JVM is started with just the command 'java' (not the full path to java), so if you have both OpenJDK and Oracle JDK installed and your Hadoop JAVA_HOME is set to the Oracle JDK, Ranger will actually be started with OpenJDK if /etc/alternatives has not been updated. Also, 'rangertruststore' should probably be called 'rangertruststore.jks' for consistency.
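A quick way to check which JDK a bare 'java' command resolves to, and to import the Ranger certificate into that JDK's cacerts, is something like the following sketch (the alias, certificate file name and default 'changeit' store password are illustrative assumptions):
# show which JDK the bare 'java' command actually resolves to
readlink -f $(which java)
# import the Ranger Admin certificate into that JDK's cacerts (paths and alias are examples)
keytool -importcert -alias rangeradmin -file /tmp/ranger-admin.crt -keystore <path-to-that-jdk>/jre/lib/security/cacerts -storepass changeit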
04-13-2017
12:13 PM
3 Kudos
When trying to add a policy that has many resource paths to Ranger using the API, it can fail with the error:
Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value for column 'sort_order' at row 1
Error Code: 1264
Call: INSERT INTO x_policy_resource_map (ADDED_BY_ID, CREATE_TIME, sort_order, resource_id, UPDATE_TIME, UPD_BY_ID, value) VALUES (?, ?, ?, ?, ?, ?, ?)
bind => [7 parameters bound]
Query: InsertObjectQuery(XXPolicyResourceMap [XXDBBase={createTime={Thu Apr 13 11:42:38 UTC 2017} updateTime={Thu Apr 13 11:42:39 UTC 2017} addedByUserId={1} updatedByUserId={1} } id=null, resourceId=43, value=/tmp/129, order=128])
This is caused by a limit in Ranger: a single policy can contain a maximum of 128 resource paths. The work-around is to split the policy into two or more policies, each containing at most 128 resource paths.
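As a rough sketch of the work-around (the file name paths.txt and the chunk prefix are placeholders), a long list of resource paths, one per line, can be chunked into groups of at most 128 before building each policy via the Ranger REST API:
# split the path list into files of at most 128 lines each
split -l 128 paths.txt policy_chunk_
# each resulting policy_chunk_* file then becomes the resource path list for its own policy
ls policy_chunk_*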
02-22-2017
11:41 AM
@Nigel Jones Ranger is not normally available to install using the cluster install wizard. You have to install the cluster first, then add Ranger once the cluster is up and running. I'm not sure of the exact reason, but I suspect it is because Ranger would not be set up properly (users and groups not synced, etc.) and the set-up of services would then fail because of permission issues.
02-10-2017
01:25 PM
3 Kudos
@Maher Hattabi The Sandbox is intended for learning HDP and the tools it provides; it isn't intended for production use. It is a pre-built single-node version of HDP with all the same features and tools. You would use it to explore HDP and perhaps test your ideas out, then use HDP to install a full multi-node cluster based on that experience.
02-03-2017
09:39 AM
3 Kudos
@rbailey No, technically they don't need a group associated with them. They also don't need to be able to log in to any systems. As long as there is a principal in Kerberos for them and they can authenticate against the KDC, you should be okay. As per the answer in the other article you linked to, I usually just create a single 'rangerlookup' user and principal to be used by all the services.
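For example, on an MIT KDC the shared lookup principal and a keytab for it can be created with something like the following (the realm and keytab path are placeholders):
kadmin.local -q "addprinc -randkey rangerlookup@EXAMPLE.COM"
kadmin.local -q "ktadd -k /etc/security/keytabs/rangerlookup.keytab rangerlookup@EXAMPLE.COM"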
01-31-2017
12:41 PM
1 Kudo
@Dinesh Das Try running the chmod command as user 'hdfs':
su - hdfs -c 'hdfs dfs -chmod -R 700 /user/hive'
In HDFS, 'root' doesn't have any special access, but the user 'hdfs' is considered a superuser and so can read/write any file.
01-23-2017
10:39 AM
2 Kudos
@shashi kumar The URL looks okay - try doing a curl directly to the ResourceManager (i.e. without Knox) to verify that it is working as expected. This will eliminate YARN as the issue. The error 'SSL23_GET_SERVER_HELLO:unknown protocol' looks like there is an issue establishing an SSL connection to Knox so I think this is the source of your issue. Check that the Knox server is set up correctly and all the certificates are working properly.
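For example, a direct check against the ResourceManager REST API (assuming the default RM web port of 8088 and HTTP rather than HTTPS on the RM UI) would be something like:
curl -s "http://<resourcemanager-host>:8088/ws/v1/cluster/info"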
01-23-2017
09:26 AM
@sreehari takkelapati Further to apappu's answer, if you're using an HDP version prior to 2.5.0 then the table you want in Ranger's database is xa_access_audit, but as this is now deprecated and no longer used I wouldn't build any processes around it. Instead you will find that, provided your system is configured correctly, Ranger audit logs will be written to HDFS (under /ranger/audit/<component name>) and/or Solr (in Ambari Infra). The Solr copy is easy to query to get the results you want, provided you know how to write Solr queries, but it only indexes the last 30 days of audit records. The HDFS copy stores all audit events unless you explicitly delete them.

The audit events are stored in JSON format and the fields are fairly self-explanatory. This is an example from Hiveserver2:

{"repoType":3,"repo":"hdp250_hive","reqUser":"usera","evtTime":"2016-11-24 04:08:10.179","access":"UPDATE","resource":"z_ssbi_hive_tdzm/imei_table","resType":"@table","action":"update","result":1,"policy":19,"enforcer":"ranger-acl","sess":"b87d8c0e-920f-4a62-8c44-82d7521a1b96","cliType":"HIVESERVER2","cliIP":"10.0.2.36","reqData":"INSERT INTO z_ssbi_hive_tdzm.imei_table PARTITION (partkey\u003d\u00271\u0027)\nSELECT COUNT(*) FROM default.imei_staging_table \nUNION ALL \nSELECT COUNT(*) FROM default.imei_staging_table","agentHost":"hdp250.local","logType":"RangerAudit","id":"d27e1496-08cc-4dad-a6ba-f87736b44a13-26","seq_num":53,"event_count":1,"event_dur_ms":0,"tags":[],"additional_info":"{\"remote-ip-address\":10.0.2.36, \"forwarded-ip-addresses\":[]"}

You will need to read these in, parse the JSON and total up the accesses using a script. It should be fairly easy to write this in something like Perl or Python.
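As a very rough sketch of such a script - using jq against the HDFS copy here instead of Perl or Python, and assuming the audit directory layout described above and a modest data volume - a per-user access count could be pulled with:
hdfs dfs -cat '/ranger/audit/hiveserver2/*/*' | jq -r '.reqUser' | sort | uniq -c | sort -rn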
01-20-2017
07:24 PM
1 Kudo
@Dinesh Das The HDP Sandbox 2.5 comes with Knox, which includes a demo LDAP server that should be sufficient for testing purposes. You can start and stop this from Ambari under Knox > Service Actions. In the Knox configuration there is a section called 'Advanced users-ldif' which contains the LDIF data loaded by the demo LDAP server. You can add users and groups to this LDIF, save the configuration and then restart the demo LDAP server. If you're not familiar with LDIF, the template to add a user is something like:
dn: uid=<username>,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: <common name, e.g. Joe Bloggs>
sn: <surname, e.g. Bloggs>
uid: <username>
userPassword: <password>
Replace <username> with the username you want to add, <common name, e.g. Joe Bloggs> with the full name of the user, <surname, e.g. Bloggs> with the surname of the user, and <password> with the password you want. Similarly for groups, use the template:
dn: cn=<groupname>,ou=groups,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: groupofnames
cn: <groupname>
member: uid=<username>,ou=people,dc=hadoop,dc=apache,dc=org
Replace <groupname> with the group name you want, and add as many member: lines as you need to add users to the group, e.g.
member: uid=user_a,ou=people,dc=hadoop,dc=apache,dc=org
member: uid=user_b,ou=people,dc=hadoop,dc=apache,dc=org
member: uid=user_c,ou=people,dc=hadoop,dc=apache,dc=org
Configuring your OS to read these users and groups from the demo LDAP server is quite complex - you'll need a lot more information in the LDIF file to support this, plus PAM/NSS configured to talk to the LDAP server - so for your purposes I'd stick to using 'adduser' and 'addgroup' to add all the users and groups you want to the OS manually. Once you've added the users and groups you want and started the demo LDAP, you can use the instructions here to connect Ambari up with the demo LDAP server: https://community.hortonworks.com/questions/2838/has-anyone-integrated-for-demo-purposes-only-the-k.html

For Ranger, you should also leave it syncing users from the OS (the default configuration): as you will have used 'adduser' and 'addgroup' to add all the users to the OS, Ranger will automatically sync these for you. If you really want to sync the users from the demo LDAP server then you'll need to set the following properties for Ranger Admin and Ranger Usersync. Note that I haven't tried this so it may not work and you may need to experiment with some of the settings; there is also a quick ldapsearch check after the property lists below.

Ranger:
ranger.ldap.base.dn=dc=hadoop,dc=apache,dc=org
ranger.ldap.bind.dn=uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
ranger.ldap.bind.password=admin-password
ranger.ldap.group.roleattribute=cn
ranger.ldap.group.searchbase=ou=groups,dc=hadoop,dc=apache,dc=org
ranger.ldap.group.searchfilter=(member=uid={0},ou=people,dc=hadoop,dc=apache,dc=org)
ranger.ldap.referral=follow
ranger.ldap.url=ldap://localhost:33389
ranger.ldap.user.dnpattern=uid={0},ou=people,dc=hadoop,dc=apache,dc=org
ranger.ldap.user.searchfilter=(uid={0})

UserSync:
ranger.usersync.group.memberattributename=member
ranger.usersync.group.nameattribute=cn
ranger.usersync.group.objectclass=groupofnames
ranger.usersync.group.search.first.enabled=false
ranger.usersync.group.searchbase=ou=groups,dc=hadoop,dc=apache,dc=org
ranger.usersync.group.searchenabled=true
ranger.usersync.group.searchfilter=
ranger.usersync.group.searchscope=sub
ranger.usersync.group.usermapsyncenabled=true
ranger.usersync.ldap.binddn=uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
ranger.usersync.ldap.groupname.caseconversion=none
ranger.usersync.ldap.ldapbindpassword=admin-password
ranger.usersync.ldap.referral=follow
ranger.usersync.ldap.searchBase=dc=hadoop,dc=apache,dc=org
ranger.usersync.ldap.url=ldap://localhost:33389
ranger.usersync.ldap.user.groupnameattribute=memberof,ismemberof
ranger.usersync.ldap.user.nameattribute=uid
ranger.usersync.ldap.user.objectclass=person
ranger.usersync.ldap.user.searchbase=ou=people,dc=hadoop,dc=apache,dc=org
ranger.usersync.ldap.user.searchfilter=
ranger.usersync.ldap.user.searchscope=sub
ranger.usersync.ldap.username.caseconversion=none
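To check that the demo LDAP server is up and that the bind DN and password used above work, a quick test with ldapsearch (assuming the OpenLDAP client tools are installed) would be something like:
ldapsearch -x -H ldap://localhost:33389 -D "uid=admin,ou=people,dc=hadoop,dc=apache,dc=org" -w admin-password -b "ou=people,dc=hadoop,dc=apache,dc=org" "(uid=*)"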
01-20-2017
12:38 PM
1 Kudo
@Dinesh Das In Ambari you can add users and groups manually - click on the 'admin' button at the top right, select 'Manage Ambari', then click on either Users or Groups and then the 'Create Local User/Group' button. These users only exist in Ambari, not in the OS or in Ranger. Alternatively you can configure Ambari to pull users and groups from LDAP/Active Directory - see Configuring Ambari for LDAP or Active Directory Authentication. If you want to be able to 'su' to the user in the OS then you'll need to configure your OS to also read the users from LDAP/Active Directory, or manually add them to your OS using 'adduser' and 'addgroup'. Ranger can synchronize your users and groups either from the OS or from LDAP/Active Directory - see Advanced Usersync Settings. The best choice is to sync all three - OS, Ambari and Ranger - from LDAP/Active Directory. That way you ensure that all users and groups exist in all three components.
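For reference (not part of the original answer), the Ambari LDAP integration mentioned above is configured on the Ambari server host with the interactive setup command, followed by a restart:
ambari-server setup-ldap
ambari-server restart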
01-19-2017
01:43 PM
@Sankar T Also check the Ranger audit logs, if you have Ranger installed and the HDFS plugin enabled. In general, if you're worried about who does what on your system then you should consider using at least Ranger, and possibly Atlas as well.
01-18-2017
01:26 PM
@Baruch AMOUSSOU DJANGBAN Currently this is not possible. HADOOP-10019 is the community JIRA to add this functionality to HDFS.
01-11-2017
02:35 PM
3 Kudos
@sagar pavan This will happen when there are not enough resources (memory) to run the ApplicationMaster container needed to control the Tez job. In YARN's capacity-scheduler.xml there is a property, yarn.scheduler.capacity.maximum-am-resource-percent, which controls the percentage of total cluster memory that can be used by AM containers. If you have several jobs running then each AM will consume the memory required for one container. If this exceeds the given percentage of total cluster memory, the next AM to run will wait until there are free resources for it. You'll need to increase yarn.scheduler.capacity.maximum-am-resource-percent to get the AM to run.
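For example, raising the limit from a typical default of 0.2 (20% of cluster memory) to 0.4 in capacity-scheduler.xml (via Ambari's YARN configuration) would look like this - the exact value depends on your cluster:
yarn.scheduler.capacity.maximum-am-resource-percent=0.4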
01-10-2017
01:32 PM
@Timo Burmeister I don't believe there are any recommendations written down. Knox itself is more or less just a proxy server. It uses Jetty internally and will comfortably run in 8GB of RAM on most production systems. As Knox is multi-threaded, having more CPU threads will allow you to process more simultaneous requests. I'm not aware of any performance testing having been done, so you'd need to experiment based on your expected load to find out how many CPU threads work best, but in general 8 CPU threads should be a good starting point. The number of NICs needed also depends on the workload you expect. If you're pushing large volumes of data through Knox then obviously you'll need to think about 10GbE or multiple bonded 1GbE NICs. You should probably also have a separate NIC for the external network and the internal cluster network, unless you're using VLANs or virtual IPs on a single NIC. For most starting configurations a single 1GbE NIC should be sufficient.
12-22-2016
11:43 AM
@Tony Hake Which version of Ambari are you using? This should be fixed in Ambari 2.4.0. If you're still seeing it then it could be indicative of a configuration problem, for instance issues with Kerberos if you have it enabled.
12-22-2016
11:20 AM
1 Kudo
@Sagar Shimpi Probably worth pointing out that this will be fixed in Knox 0.10 by the looks of it: KNOX-644
12-21-2016
11:48 AM
3 Kudos
@Jay SenSharma It's worth pointing out that unless you're using Ranger 0.4 or below, that API is obsolete. You should be using the v2 API linked to by @mvaradkar: https://cwiki.apache.org/confluence/display/RANGER/REST+APIs+for+Service+Definition%2C+Service+and+Policy+Management
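For reference, listing the existing policies with the v2 API looks something like this (assuming Ranger Admin on its default port 6080 and admin credentials):
curl -u admin:admin 'http://<ranger-admin-host>:6080/service/public/v2/api/policy'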
12-12-2016
05:25 PM
@Davide Isoardi Only the Spark UI is supported in Knox 0.9, which is included in HDP 2.5.x. Earlier versions of Knox do not support Spark at all. There is a community JIRA to add this support in the future (AMBARI-18610) but there has been no movement on it so far.
12-06-2016
09:36 PM
1 Kudo
@Sami Ahmad These messages are the Ranger plugins for HDFS and Hive connecting to Ranger Admin to check that they have the latest policies. If you want to stop these messages then you'll need to turn the Ranger Admin logging down to WARN, but these messages are normal. If you look at the frequency, you'll see they occur every 30 seconds, which is the default polling interval for the plugins.
12-01-2016
04:42 PM
1 Kudo
@Arpan Rajani Yes, you can use a wildcard certificate - see https://en.wikipedia.org/wiki/Wildcard_certificate If you're using a public CA then most will generate wildcard certificates for you. If you're using an internal CA or self-signed certificates then this link shows you how: https://serversforhackers.com/self-signed-ssl-certificates In terms of using it for Hadoop, it is used in the same way as a regular certificate except you only have one certificate for all the services. The main security issue is that if someone gets hold of the certificate (and its private key) they can install it on any host in your network whose DNS name matches the wildcard (for example *.example.com) and it will be accepted as valid on that host.
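As a rough sketch (the domain, key size and validity period are placeholders), a self-signed wildcard certificate can be generated with OpenSSL like this:
openssl req -x509 -newkey rsa:2048 -nodes -days 365 -subj "/CN=*.example.com" -keyout wildcard.key -out wildcard.crt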
11-30-2016
05:39 PM
@Pradheep Shan Currently modifying the Grafana dashboards is not supported in Ambari.
11-30-2016
10:26 AM
1 Kudo
@Gerd Koenig xasecure.audit.destination.hdfs.dir should be the base path for all audit logs for all plugins. The plugins themselves automatically append their own name - 'hdfs' in this case, Hiveserver2 adds 'hiveserver2', HBase adds 'hbase', etc. - and a daily datestamp. This behaviour is hard-coded and I don't think there's any way to change it. You should just set it to 'hdfs://<Nameservice ID>/ranger/audit'
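With that setting, the audit files end up under per-plugin, per-day directories that look something like this (the datestamp placeholder is illustrative):
hdfs://<Nameservice ID>/ranger/audit/hdfs/<yyyymmdd>/...
hdfs://<Nameservice ID>/ranger/audit/hiveserver2/<yyyymmdd>/...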
11-24-2016
09:29 AM
1 Kudo
@Chris L The Ambari Agent is a Python process, so it uses Python's logging facility. To change the pattern, go to /etc/ambari-agent/conf and copy logging.conf.sample to logging.conf, then edit it and look for the lines:
[formatter_logfileformatter]
format=%(levelname)s %(asctime)s %(filename)s:%(lineno)d - %(message)s
Change the 'format' line to suit your needs. The format string syntax and the LogRecord attributes that can be written from each log record are described in the Python logging module documentation.
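For example (purely illustrative), adding the process ID to each line would be a format line like:
format=%(asctime)s %(levelname)s [%(process)d] %(filename)s:%(lineno)d - %(message)s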
11-11-2016
09:10 PM
Yes, you should copy the id_rsa file from the sandbox to your Windows host. Alternatively you can copy and paste the contents of id_rsa into the edit box that says 'ssh private key', as in the screenshot below.
11-10-2016
11:16 AM
@A. Karray You can specify JARs to use with Livy jobs using livy.spark.jars in the Livy interpreter conf. This should be a comma-separated list of JAR locations, which must be stored on HDFS. Currently local files cannot be used (i.e. they won't be localized on the cluster when the job runs). It is a global setting, so all JARs listed will be available to all Livy jobs run by all users.
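For example (the HDFS paths are placeholders), the interpreter setting would look something like:
livy.spark.jars=hdfs:///libs/my-udfs.jar,hdfs:///libs/extra-lib.jar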
11-03-2016
11:26 AM
1 Kudo
@vamsi valiveti 1) 'show tables;' is the standard SQL way of getting table names. '!tables' is specific to Beeline, so use 'show tables;' to make sure your SQL is portable to other SQL clients. 2) Use '!sh <command>' to run a shell command, e.g.
0: jdbc:hive2://hdp224.local:10000/default> !sh hdfs dfs -ls /
Found 9 items
drwxrwxrwx - yarn hadoop 0 2016-11-01 14:07 /app-logs
drwxr-xr-x - hdfs hdfs 0 2016-11-01 12:41 /apps
drwxr-xr-x - yarn hadoop 0 2016-11-01 15:55 /ats
drwxr-xr-x - usera users 0 2016-11-01 14:29 /data
drwxr-xr-x - hdfs hdfs 0 2016-11-01 12:38 /hdp
drwxr-xr-x - mapred hdfs 0 2016-11-01 12:38 /mapred
drwxrwxrwx - mapred hadoop 0 2016-11-01 12:38 /mr-history
drwxrwxrwx - hdfs hdfs 0 2016-11-01 15:56 /tmp
drwxr-xr-x - hdfs hdfs 0 2016-11-01 14:06 /user
10-31-2016
04:41 PM
1 Kudo
@Roger Young Assuming you are running Ambari as 'root', it will be in ~root/.ssh/id_rsa. If you're running Ambari as a non-root user you will need to set up passwordless SSH for that user, so the file will be ~<username>/.ssh/id_rsa.
10-26-2016
09:47 AM
2 Kudos
@Pooja Kamle Ranger policies are not applied to the Hive CLI, which is older technology and may be phased out in the future. You should be using Beeline or JDBC/ODBC to connect to HiveServer2.
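For example, a Beeline connection to HiveServer2 looks something like this (host, port and user are placeholders; add SSL/Kerberos options as appropriate for your cluster):
beeline -u 'jdbc:hive2://<hiveserver2-host>:10000/default' -n <username>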