how to track the delete operations in HDFS?

Contributor

How can I track / audit delete operations, such as rm -rf, performed on HDFS files and directories?

Accepted Solutions
Re: how to track the delete operations in HDFS?

New Contributor

Hello @hardik desai

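You can get this information from the /var/log/hadoop/hdfs/hdfs-audit.log log file. Note that the audit log records NameNode operations rather than shell commands: with trash enabled, rm shows up as a rename into the user's .Trash directory, ls shows up as listStatus, and so on. (A delete that bypasses the trash, for example with -skipTrash, is logged as cmd=delete instead.) Here I create a file /tmp/deleteme.txt, upload it to HDFS, list its parent directory, and delete the file.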

[hdfs@jyoung-hdp234-1 ~]$ echo "delete me" >> /tmp/deleteme.txt
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -put /tmp/deleteme.txt /tmp/
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -ls /tmp
Found 13 items
drwx------   - ambari-qa hdfs          0 2016-12-01 04:19 /tmp/ambari-qa
drwxrwxrwx   - oozie     hdfs          0 2016-12-19 08:32 /tmp/crime
-rw-r--r--   3 hdfs      hdfs         10 2017-01-18 08:29 /tmp/deleteme.txt
drwxr-xr-x   - hdfs      hdfs          0 2016-12-01 04:15 /tmp/entity-file-history
drwx-wx-wx   - ambari-qa hdfs          0 2016-12-11 18:36 /tmp/hive
-rwxr-xr-x   3 hdfs      hdfs       1616 2016-12-01 04:16 /tmp/id1aacdf51_date160116
-rwxr-xr-x   3 hdfs      hdfs       1616 2016-12-01 05:19 /tmp/id1aacdf51_date190116
-rwxr-xr-x   3 ambari-qa hdfs       1616 2016-12-01 04:21 /tmp/idtest.ambari-qa.1480566109.56.in
-rwxr-xr-x   3 ambari-qa hdfs        957 2016-12-01 04:21 /tmp/idtest.ambari-qa.1480566109.56.pig
-rwxr-xr-x   3 ambari-qa hdfs       1616 2016-12-01 05:23 /tmp/idtest.ambari-qa.1480569805.86.in
-rwxr-xr-x   3 ambari-qa hdfs        957 2016-12-01 05:23 /tmp/idtest.ambari-qa.1480569805.86.pig
drwxr-xr-x   - ambari-qa hdfs          0 2016-12-01 04:19 /tmp/tezsmokeinput
drwxr-xr-x   - ambari-qa hdfs          0 2016-12-01 05:21 /tmp/tezsmokeoutput
[hdfs@jyoung-hdp234-1 ~]$ hdfs dfs -rm /tmp/deleteme.txt
17/01/18 08:29:46 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://jyoung-hdp234-1.openstacklocal:8020/tmp/deleteme.txt' to trash at: hdfs://jyoung-hdp234-1.openstacklocal:8020/user/hdfs/.Trash/Current

You can see these operations in my hdfs-audit.log file below:

[root@jyoung-hdp234-1 hdfs]# grep "CLI" /var/log/hadoop/hdfs/hdfs-audit.log


2017-01-18 08:28:29,254 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,847 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,942 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,947 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:13,990 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=create	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
2017-01-18 08:29:14,022 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt._COPYING_	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:14,240 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=rename	src=/tmp/deleteme.txt._COPYING_	dst=/tmp/deleteme.txt	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,129 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,247 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=listStatus	src=/tmp	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,466 INFO FSNamesystem.audit: allowed=true	ugi=hbase/jyoung-hdp234-3.openstacklocal@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.225	cmd=listStatus	src=/apps/hbase/data/oldWALs	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:27,469 INFO FSNamesystem.audit: allowed=true	ugi=hbase/jyoung-hdp234-3.openstacklocal@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.225	cmd=listStatus	src=/apps/hbase/data/archive	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,457 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,552 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,570 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,585 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=mkdirs	src=/user/hdfs/.Trash/Current/tmp	dst=null	perm=hdfs:hdfs:rwx------	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,590 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=getfileinfo	src=/user/hdfs/.Trash/Current/tmp/deleteme.txt	dst=null	perm=null	proto=rpc	callerContext=CLI
2017-01-18 08:29:46,595 INFO FSNamesystem.audit: allowed=true	ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)	ip=/172.26.81.223	cmd=rename	src=/tmp/deleteme.txt	dst=/user/hdfs/.Trash/Current/tmp/deleteme.txt	perm=hdfs:hdfs:rw-r--r--	proto=rpc	callerContext=CLI
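
If you need to pull only the delete activity out of a large audit log, the filtering can be scripted. The following is a minimal Python sketch, not an official tool. It assumes the audit fields are tab-separated as in the output above, that any trash destination contains "/.Trash/", and that deletes which bypass the trash are logged with cmd=delete. The names Deletion, parse_line and find_deletions are made up for this illustration.

```python
from typing import Iterable, Iterator, NamedTuple, Optional

AUDIT_MARKER = " INFO FSNamesystem.audit: "


class Deletion(NamedTuple):
    timestamp: str
    user: str
    ip: str
    path: str
    kind: str  # "trash" (moved into .Trash) or "delete" (removed directly)


def parse_line(line: str) -> Optional[Deletion]:
    """Return a Deletion if the audit line records a successful delete, else None."""
    timestamp, marker, rest = line.rstrip("\n").partition(AUDIT_MARKER)
    if not marker:
        return None  # not an audit line

    # Assumption: fields are tab-separated key=value pairs.
    fields = {}
    for part in rest.split("\t"):
        key, eq, value = part.partition("=")
        if eq:
            fields[key] = value

    if fields.get("allowed") != "true":
        return None  # denied operations did not delete anything

    cmd = fields.get("cmd")
    src = fields.get("src", "")
    dst = fields.get("dst", "")

    if cmd == "delete":
        kind = "delete"
    elif cmd == "rename" and "/.Trash/" in dst and "/.Trash/" not in src:
        # Assumption: a rename from outside the trash into it is an rm.
        kind = "trash"
    else:
        return None

    ugi = fields.get("ugi", "")
    user = ugi.split()[0] if ugi else ""  # drop the "(auth:...)" suffix
    ip = fields.get("ip", "").lstrip("/")
    return Deletion(timestamp, user, ip, src, kind)


def find_deletions(lines: Iterable[str]) -> Iterator[Deletion]:
    """Yield one Deletion per matching audit line."""
    for line in lines:
        deletion = parse_line(line)
        if deletion is not None:
            yield deletion


if __name__ == "__main__":
    # The final audit line from the output above, rebuilt with explicit tabs.
    sample = (
        "2017-01-18 08:29:46,595 INFO FSNamesystem.audit: allowed=true\t"
        "ugi=hdfs-cluster1@EXAMPLE.COM (auth:KERBEROS)\tip=/172.26.81.223\t"
        "cmd=rename\tsrc=/tmp/deleteme.txt\t"
        "dst=/user/hdfs/.Trash/Current/tmp/deleteme.txt\t"
        "perm=hdfs:hdfs:rw-r--r--\tproto=rpc\tcallerContext=CLI"
    )
    for deletion in find_deletions([sample]):
        print(deletion)
```

To scan a real log, pass an open file object to find_deletions, for example find_deletions(open("/var/log/hadoop/hdfs/hdfs-audit.log")). The earlier rename from deleteme.txt._COPYING_ to /tmp/deleteme.txt is skipped because its destination is not under a trash directory.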

Re: how to track the delete operations in HDFS?

Contributor

@jyoung, I can relate to your answer and description. Thanks!