Created 01-20-2017 09:40 PM
Hi,
We have a requirement to get custom audit reports out of ranger to get
1) how many times a table is being accessed per month
2) how many times a user accessed a table per month.
Can you please let me know how to get this info from ranger.
I checked the ranger mysql database to check if i can get this info. I am not able to find it
Thanks,
Sree
Created 01-23-2017 11:58 PM
@sreehari takkelapati you can bump the audit data in hdfs into hive table and do a report as needed for you. Please check this out.https://community.hortonworks.com/articles/60802/ranger-audit-in-hive-table-a-sample-approach-1.html
Created 01-20-2017 11:51 PM
From HDP2.5, Ranger does not support audit to DB - so not sure looking at the DB is for audit data is good idea.
Created 01-23-2017 09:26 AM
@sreehari takkelapati Further to apappu's answer, if you're using an HDP version prior to 2.5.0 then the table you want in Ranger's database is xa_access_audit, but as this is now deprecated and no longer used I wouldn't build any processes around that.
Instead you will find that, provided your system is configured correctly, Ranger audit logs will be written to HDFS (under /ranger/audit/<component name>) and/or Solr (in Ambari Infra.) The Solr copy is easy to query to get the results you want provided you know how to write Solr queries, but it only indexes the last 30 days of audit records. The HDFS copy stores all auditing events unless you explicitly delete them.
The audit events are stored in JSON format and the fields are fairly self-explanatory. This is an example from Hiveserver2:
{"repoType":3,"repo":"hdp250_hive","reqUser":"usera","evtTime":"2016-11-24 04:08:10.179","access":"UPDATE","resource":"z_ssbi_hive_tdzm/imei_table","resType":"@table","action":"update","result":1,"policy":19,"enforcer":"ranger-acl","sess":"b87d8c0e-920f-4a62-8c44-82d7521a1b96","cliType":"HIVESERVER2","cliIP":"10.0.2.36","reqData":"INSERT INTO z_ssbi_hive_tdzm.imei_table PARTITION (partkey\u003d\u00271\u0027)\nSELECT COUNT(*) FROM default.imei_staging_table \nUNION ALL \nSELECT COUNT(*) FROM default.imei_staging_table","agentHost":"hdp250.local","logType":"RangerAudit","id":"d27e1496-08cc-4dad-a6ba-f87736b44a13-26","seq_num":53,"event_count":1,"event_dur_ms":0,"tags":[],"additional_info":"{\"remote-ip-address\":10.0.2.36, \"forwarded-ip-addresses\":[]"}
You will need to read these in, parse the JSON and total up the access using a script. It should be fairly easy to write this in something like Perl or Python.
Created 01-23-2017 11:58 PM
@sreehari takkelapati you can bump the audit data in hdfs into hive table and do a report as needed for you. Please check this out.https://community.hortonworks.com/articles/60802/ranger-audit-in-hive-table-a-sample-approach-1.html
Created 01-24-2017 12:49 AM
Thank you Ramesh, Terry and apappu. I am able to fulfill my requirement from hiveServer2 logs of hdfs