How to track delete operations in HDFS?

Contributor

How can I track or audit delete operations, such as rm -rf, performed on HDFS files and directories?

1 ACCEPTED SOLUTION

Explorer

Hello @hardik desai

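You can get this information from the HDFS audit log, /var/log/hadoop/hdfs/hdfs-audit.log. Keep in mind that the audit log records NameNode operations rather than shell commands: ls shows up as listStatus, and with trash enabled (as on this cluster) rm shows up as a rename into the user's .Trash directory. A delete that bypasses the trash, for example with -skipTrash, is logged as cmd=delete instead. In the example below I create a file /tmp/deleteme.txt, upload it to HDFS, list its parent directory, and delete the file.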

[hdfs@jyoung-hdp234-1 ~]$ echo "delete me" >> /tmp/deleteme.txt
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -put /tmp/deleteme.txt /tmp/
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -ls /tmp
Found 13 items
drwx------   - ambari-qa hdfs          0 2016-12-01 04:19 /tmp/ambari-qa
drwxrwxrwx   - oozie     hdfs          0 2016-12-19 08:32 /tmp/crime
-rw-r--r--   3 hdfs      hdfs         10 2017-01-18 08:29 /tmp/deleteme.txt
drwxr-xr-x   - hdfs      hdfs          0 2016-12-01 04:15 /tmp/entity-file-history
drwx-wx-wx   - ambari-qa hdfs          0 2016-12-11 18:36 /tmp/hive
-rwxr-xr-x   3 hdfs      hdfs       1616 2016-12-01 04:16 /tmp/id1aacdf51_date160116
-rwxr-xr-x   3 hdfs      hdfs       1616 2016-12-01 05:19 /tmp/id1aacdf51_date190116
-rwxr-xr-x   3 ambari-qa hdfs       1616 2016-12-01 04:21 /tmp/idtest.ambari-qa.1480566109.56.in
-rwxr-xr-x   3 ambari-qa hdfs        957 2016-12-01 04:21 /tmp/idtest.ambari-qa.1480566109.56.pig
-rwxr-xr-x   3 ambari-qa hdfs       1616 2016-12-01 05:23 /tmp/idtest.ambari-qa.1480569805.86.in
-rwxr-xr-x   3 ambari-qa hdfs        957 2016-12-01 05:23 /tmp/idtest.ambari-qa.1480569805.86.pig
drwxr-xr-x   - ambari-qa hdfs          0 2016-12-01 04:19 /tmp/tezsmokeinput
drwxr-xr-x   - ambari-qa hdfs          0 2016-12-01 05:21 /tmp/tezsmokeoutput
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -rm /tmp/deleteme.txt
17/01/18 08:29:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://jyoung-hdp234-1.openstacklocal:8020/tmp/deleteme.txt' to trash at: hdfs://jyoung-hdp234-1.openstacklocal:8020/user/hdfs/.Trash/Current

You can see these operations in my hdfs-audit.log file below:

[root@jyoung-hdp234-1 hdfs]# grep "CLI" /var/log/hadoop/hdfs/hdfs-audit.log


2017-01-18 08:28:29,254 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,847 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,942 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,947 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,990 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=create	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
2017-01-18 08:29:14,022 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:14,240 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=rename	src=/tmp/deleteme.txt._COPYING_	dst=/tmp/deleteme.txt	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,129 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,247 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=listStatus	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,466 INFO FSNamesystem.audit: allowed=true	ugi=hbase/jyoung-hdp234-3.openstacklocal@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.225	cmd=listStatus	src=/apps/hbase/data/oldWALs	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,469 INFO FSNamesystem.audit: allowed=true	ugi=hbase/jyoung-hdp234-3.openstacklocal@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.225	cmd=listStatus	src=/apps/hbase/data/archive	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,457 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,552 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,570 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,585 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=mkdirs	src=/user/hdfs/.Trash/Current/tmp	dst=null	perm=hdfs:hdfs:rwx------	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,590 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/user/hdfs/.Trash/Current/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,595 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=rename	src=/tmp/deleteme.txt	dst=/user/hdfs/.Trash/Current/tmp/deleteme.txt	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
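
If you only want the deletions rather than every CLI call, a small filter over the audit log can help. The following is a minimal sketch, not a tested tool. The function name hdfs_deletes is made up for illustration, and it assumes the tab-separated key=value layout shown in the log lines above, that trash-based deletes appear as a rename whose dst contains /.Trash/, and that deletes bypassing the trash appear as cmd=delete.

```shell
# Hypothetical helper (the name is not part of Hadoop): reads audit log
# lines on stdin and prints timestamp, user, command, src, and dst for
# entries that look like deletions. Assumes tab-separated key=value fields.
hdfs_deletes() {
  awk -F'\t' '
    {
      ugi = ""; cmd = ""; src = ""; dst = ""
      # Pick out the fields of interest by their key prefix.
      for (i = 1; i <= NF; i++) {
        if      ($i ~ /^ugi=/) ugi = substr($i, 5)
        else if ($i ~ /^cmd=/) cmd = substr($i, 5)
        else if ($i ~ /^src=/) src = substr($i, 5)
        else if ($i ~ /^dst=/) dst = substr($i, 5)
      }
      # A direct delete, or a rename into a trash directory.
      if (cmd == "delete" || (cmd == "rename" && dst ~ /\/\.Trash\//)) {
        # The first 23 characters of field 1 are the timestamp.
        printf "%s\t%s\t%s\t%s\t%s\n", substr($1, 1, 23), ugi, cmd, src, dst
      }
    }'
}
```

You would run it as hdfs_deletes < /var/log/hadoop/hdfs/hdfs-audit.log. Matching on the dst path rather than on every rename keeps ordinary renames, such as the _COPYING_ rename that -put performs, out of the result.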

Contributor

@jyoung, I can relate to your answer and description. Thanks!