Created 01-19-2017 01:37 PM
Created 01-19-2017 01:38 PM
hdfs-audit.log will have that info.
Created 01-19-2017 01:43 PM
@Sankar T Also Ranger audit logs if you have it installed and have the HDFS plugin enabled. In general if you're worried about who does what on your system then you should consider using Ranger at least and possibly Atlas as well.
Created 01-20-2017 02:12 PM
Also, if the deletion was done from the command line, the data should not have been deleted outright; it should have been moved to that user's .Trash folder.
Created 03-09-2017 05:18 PM
On an Ambari-installed cluster, you can check /grid/0/log/hdfs/hdfs/hdfs-audit.log to find out who deleted a directory or file in HDFS. A delete shows up as an entry like this:
2017-03-09 00:04:18,495 INFO FSNamesystem.audit: allowed=true ugi=ambari-qa-cl1@EXAMPLE.COM (auth:KERBEROS) ip=/172.xx.xx.xx cmd=delete src=/tmp/hive/ambari-qa/388b15de-5e3f-4b7a-8069-d939b64e513e dst=null perm=null proto=rpc
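As a rough illustration, here is a minimal Python sketch that pulls the fields out of an entry like the one above and keeps only the deletes. The field names come from that sample line; the function names (`parse_audit_line`, `find_deletes`) are made up for this example, and the parsing assumes every entry has the same layout as the sample, which may not hold on every cluster.

```python
import re
from datetime import datetime

# Field names as they appear in the sample audit entry above.
_KEYS = "allowed|ugi|ip|cmd|src|dst|perm|proto"
# A value runs until the next known "key=" or the end of the line,
# so values with spaces such as "user (auth:KERBEROS)" stay intact.
_FIELD_RE = re.compile(rf"\b({_KEYS})=(.*?)(?=\s+(?:{_KEYS})=|\s*$)")


def parse_audit_line(line):
    """Return the fields of one audit entry as a dict, or None if there is no cmd field."""
    fields = dict(_FIELD_RE.findall(line))
    if "cmd" not in fields:
        return None
    try:
        # Assumes the entry starts with a timestamp like "2017-03-09 00:04:18,495".
        fields["time"] = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        fields["time"] = None
    return fields


def find_deletes(lines):
    """Yield the parsed entries for delete operations that were allowed."""
    for line in lines:
        rec = parse_audit_line(line)
        if rec and rec["cmd"] == "delete" and rec.get("allowed") == "true":
            yield rec
```

To use it, open the audit log and iterate, for example `for rec in find_deletes(open(path_to_audit_log)): print(rec["time"], rec["ugi"], rec["src"])`, where `path_to_audit_log` is whatever location your cluster writes the log to.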
Created 03-13-2017 06:49 PM
Yes, the audit log will serve the purpose. Note that searching the log for a deletion is not always straightforward, because a directory (or file) may not have been deleted directly: it may have been removed as part of the deletion of its parent or another ancestor directory. So first search the log for the full path. If it is not found, search for the parent directory path, and so on up the tree.
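The search order described above (full path first, then parent, then grandparent) could be sketched roughly like this in Python. It assumes the audit entries have already been parsed into dicts with `cmd`, `src` and `ugi` keys; the function names are hypothetical and only meant to show the idea.

```python
import posixpath


def path_and_ancestors(path):
    """Return [path, parent, grandparent, ...], ending at the root."""
    path = posixpath.normpath(path)
    chain = [path]
    while True:
        parent = posixpath.dirname(path)
        if parent == path:  # reached the root, nothing further up
            break
        chain.append(parent)
        path = parent
    return chain


def find_delete_hits(records, target):
    """Look for deletes of the target itself, then of each ancestor in turn.

    Returns (matched_path, matching_records) for the first path that has
    any delete entries, or (None, []) if nothing matches.
    """
    for candidate in path_and_ancestors(target):
        hits = [r for r in records
                if r.get("cmd") == "delete" and r.get("src") == candidate]
        if hits:
            return candidate, hits
    return None, []
```

For example, if the log only contains a delete of /foo, then `find_delete_hits(records, "/foo/bar")` finds nothing for /foo/bar and falls back to the entry for /foo.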
It gets more complicated if deletion and re-creation occurred repeatedly. For example:
1) user A: create /foo
2) user A: create /foo/bar
3) user A: del /foo
4) user B: create /foo
5) user B: del /foo
Who deleted /foo/bar? It is easy to mistakenly answer user B. B is the last user who deleted /foo, but B is not the user who deleted /foo/bar; that was user A, in step 3. In such a case, we should first determine when the target directory/file was created and then trace what happened to it starting from the creation time.
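The five-step example above can be replayed in a small Python sketch. It assumes the entries are already parsed into dicts, listed in time order, and that creations show up with `cmd` set to `mkdirs` or `create` (an assumption about how your log records them). The helper names are hypothetical, and the sketch ignores renames and directories created implicitly along with a deeper path.

```python
def covers(ancestor, path):
    """True if `ancestor` is the same path as `path` or one of its ancestors."""
    return path == ancestor or path.startswith(ancestor.rstrip("/") + "/")


def who_deleted(records, target):
    """Find the user whose delete removed the latest incarnation of `target`.

    Step 1: locate the most recent creation of the target.
    Step 2: starting after it, take the first delete of the target or an ancestor.
    """
    created_at = None
    for i, rec in enumerate(records):
        if rec["cmd"] in ("mkdirs", "create") and rec["src"] == target:
            created_at = i
    if created_at is None:
        return None
    for rec in records[created_at + 1:]:
        if rec["cmd"] == "delete" and covers(rec["src"], target):
            return rec["ugi"]
    return None


# The scenario from the example, in time order.
events = [
    {"ugi": "A", "cmd": "mkdirs", "src": "/foo"},
    {"ugi": "A", "cmd": "mkdirs", "src": "/foo/bar"},
    {"ugi": "A", "cmd": "delete", "src": "/foo"},
    {"ugi": "B", "cmd": "mkdirs", "src": "/foo"},
    {"ugi": "B", "cmd": "delete", "src": "/foo"},
]
print(who_deleted(events, "/foo/bar"))  # A
```

Simply taking the last delete of /foo in the log would point at B, which is exactly the mistake described above.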
It is even harder to find the correct answer if the path or any of its parent/ancestor paths were moved or renamed, so pay extra attention whenever a rename operation is involved.
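One way to cope with renames is to follow the path as it moves. The Python sketch below is only an illustration under several assumptions: the entries are parsed into dicts with `cmd`, `src`, `dst` and `ugi` keys, they are in time order starting from the creation of the target, and renames are logged with a `cmd` value that starts with `rename`. The function names are hypothetical.

```python
def covers(ancestor, path):
    """True if `ancestor` is the same path as `path` or one of its ancestors."""
    return path == ancestor or path.startswith(ancestor.rstrip("/") + "/")


def trace_deletion(records, target):
    """Follow `target` through renames until a delete removes it.

    Returns (user, path_at_deletion), or (None, current_path) if no delete
    of the tracked path or one of its ancestors is found.
    """
    current = target
    for rec in records:
        src = rec["src"]
        if not covers(src, current):
            continue  # this entry does not touch the path being tracked
        if rec["cmd"].startswith("rename"):
            # Swap the renamed prefix for its new location.
            suffix = current[len(src.rstrip("/")):]
            current = rec["dst"].rstrip("/") + suffix
        elif rec["cmd"] == "delete":
            return rec["ugi"], current
    return None, current
```

For example, if /foo is renamed to /baz and someone later deletes /baz, the tracked path /foo/bar becomes /baz/bar and that later delete is the one reported, while an unrelated delete of a new /foo in between is skipped.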