Created 03-19-2018 08:19 PM
Hi all,
Currently, all the Ranger audit logs for Hive and HDFS go to HDFS directories.
Is there any way to get a particular user's Hive and HDFS audit logs into a local directory by making changes in Ambari?
Created 03-19-2018 08:26 PM
I assume you are referring to the HDFS audit logs from Ranger. They are stored in the "/ranger/audit/<service>" folder.
If you want to see a particular user's audit records, you may have to load the HDFS audit logs into a Hive table and analyze them there. See https://community.hortonworks.com/content/kbentry/60802/ranger-audit-in-hive-table-a-sample-approach...
Created 03-19-2018 08:41 PM
Thanks for the quick response. Currently, in my cluster, all the Ranger log files are stored under the "/ranger/audit/<service>" directory.
Assume the user tom has access to the cluster and submits the majority of the jobs.
Is there any possibility of getting only tom's audit logs into a local directory?
Created 03-19-2018 09:18 PM
Ranger stores all audit logs in Solr (for showing in Ranger UI) and in HDFS (for long-term archive). Could you please elaborate on why what you are asking is required?
Created 03-20-2018 01:05 AM
{"repoType":4,"repo":"HDP_yarn","reqUser":"tom","evtTime":"2018-03-20 00:52:31.724","access":"SUBMIT_APP","resource":"root.default","resType":"queue","action":"submit-app","result":1,"policy":-1,"enforcer":"yarn-acl","cliIP":"10.142.0.2","reqData":"QuasiMonteCarlo","agentHost":"instance-1","logType":"RangerAudit","id":"7cc84516-68e4-44e5-aab5-ab1c38bdd21b-0","seq_num":1,"event_count":1,"event_dur_ms":0,"tags":[],"additional_info":"{\"remote-ip-address\":10.142.0.2, \"forwarded-ip-addresses\":[]","cluster_name":"HDP"}
{"repoType":4,"repo":"HDP_yarn","reqUser":"hdfs","evtTime":"2018-03-20 00:52:31.724","access":"SUBMIT_APP","resource":"root.default","resType":"queue","action":"submit-app","result":1,"policy":-1,"enforcer":"yarn-acl","cliIP":"10.142.0.2","reqData":"QuasiMonteCarlo","agentHost":"instance-1","logType":"RangerAudit","id":"7cc84516-68e4-44e5-aab5-ab1c38bdd21b-0","seq_num":1,"event_count":1,"event_dur_ms":0,"tags":[],"additional_info":"{\"remote-ip-address\":10.142.0.2, \"forwarded-ip-addresses\":[]","cluster_name":"HDP"}
[hdfs@instance-1 ~]$ hdfs dfs -cat /ranger/audit/yarn/20180320/yarn_ranger_audit_instance-1.log
{"repoType":4,"repo":"HDP_yarn","reqUser":"tom","evtTime":"2018-03-20 00:52:31.724","access":"SUBMIT_APP","resource":"root.default","resType":"queue","action":"submit-app","result":1,"policy":-1,"enforcer":"yarn-acl","cliIP":"10.142.0.2","reqData":"QuasiMonteCarlo","agentHost":"instance-1","logType":"RangerAudit","id":"7cc84516-68e4-44e5-aab5-ab1c38bdd21b-0","seq_num":1,"event_count":1,"event_dur_ms":0,"tags":[],"additional_info":"{\"remote-ip-address\":10.142.0.2, \"forwarded-ip-addresses\":[]","cluster_name":"HDP"}
From the above logs, is it possible to get the logs of a particular user, i.e. tom, into a local directory (as a second copy)?
Created 03-20-2018 01:12 AM
This functionality is not available. I am still not sure why this is required, because you can load all the HDFS audit logs into a Hive table and do the analysis there. If you'd like to store only a particular user's activity, you can perform the filtering when loading this data. If you store a second copy, it is just going to take more space.
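To illustrate the "filter when loading" idea, here is a minimal Python sketch, not a Ranger or Ambari feature. It assumes each audit record is one JSON object per line with a `reqUser` field, as in the sample output posted above. The function name `filter_audit_lines` is made up for this example.

```python
import json
from typing import Iterable, Iterator


def filter_audit_lines(lines: Iterable[str], user: str) -> Iterator[str]:
    """Yield only the audit records whose reqUser equals `user`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON record (e.g. a stray shell prompt).
            continue
        if isinstance(record, dict) and record.get("reqUser") == user:
            yield line
```

The lines could come from the output of `hdfs dfs -cat` on an audit file, and the records that are yielded could then be loaded into whatever table or file is used for analysis.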
Created 03-20-2018 01:19 AM
The hosting team needs all the logs related to their user IDs. For example, whenever an audit log is written to /ranger/audit/hiveServer2, that log should automatically be saved to a local directory as a second copy (without deleting it from the HDFS logs).
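Since Ranger does not do this by itself, one possible workaround is a small export script run periodically (for example from cron, which is an assumption here). The sketch below reads an audit file with `hdfs dfs -cat`, keeps only one user's records, and writes them to a local file. The HDFS copy is only read, never changed. The function name `export_user_audit`, its parameters and the local path are hypothetical, and the JSON-per-line format with a `reqUser` field is assumed from the sample output earlier in the thread.

```python
import json
import subprocess
from pathlib import Path
from typing import List, Optional


def export_user_audit(hdfs_path: str, user: str, local_file: str,
                      cat_command: Optional[List[str]] = None) -> int:
    """Write the audit records of `user` found at `hdfs_path` to `local_file`.

    Returns the number of records written.
    """
    # Default to `hdfs dfs -cat <path>`; the command can be overridden for testing.
    command = cat_command if cat_command is not None else ["hdfs", "dfs", "-cat", hdfs_path]
    result = subprocess.run(command, capture_output=True, text=True, check=True)

    kept = []
    for line in result.stdout.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON audit records
        if isinstance(record, dict) and record.get("reqUser") == user:
            kept.append(line)

    # Create the local directory if needed and write the filtered copy.
    destination = Path(local_file)
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text("".join(entry + "\n" for entry in kept))
    return len(kept)
```

A call could look like `export_user_audit("/ranger/audit/yarn/20180320/yarn_ranger_audit_instance-1.log", "tom", "/tmp/audit/tom_yarn_20180320.log")`, where the local path is only a placeholder. This gives a periodic second copy rather than a real-time one.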