Created 07-26-2021 07:49 AM
There are user directories under hdfs:///user/
When a user `foo` retires, I need to delete a root directory for the user: hdfs:///user/foo
However, it occasionally fails because of the snapshottable directories under the user root directory.
Checking every subdirectory for whether it is snapshottable, disallowing snapshots on each snapshottable subdirectory, and then deleting the user's root directory doesn't seem to be the best way.
(Or, if this is the best way, I cannot come up with simple code for it.)
Is there a command to disallow snapshots for all the subdirectories?
How can I effectively delete a directory that might have snapshots?
Created on 07-26-2021 02:53 PM - edited 07-26-2021 02:53 PM
Here is a walkthrough on how to delete a snapshot
Created a directory
$ hdfs dfs -mkdir -p /app/tomtest
Changed the owner
$ hdfs dfs -chown -R tom:developer /app/tomtest
To be able to create a snapshot the directory has to be snapshottable
$ hdfs dfsadmin -allowSnapshot /app/tomtest
Allowing snaphot on /app/tomtest succeeded
Now I created 3 snapshots
$ hdfs dfs -createSnapshot /app/tomtest sipo
Created snapshot /app/tomtest/.snapshot/sipo
$ hdfs dfs -createSnapshot /app/tomtest coo
Created snapshot /app/tomtest/.snapshot/coo
$ hdfs dfs -createSnapshot /app/tomtest tap2
Created snapshot /app/tomtest/.snapshot/tap2
Confirm the directory is snapshottable
$ hdfs lsSnapshottableDir
drwxr-xr-x 0 tom developer 0 2021-07-26 23:14 3 65536 /app/tomtest
List all the snapshots in the directory
$ hdfs dfs -ls /app/tomtest/.snapshot
Found 3 items
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/coo
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/sipo
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/tap2
Now I need to delete the snapshot coo
$ hdfs dfs -deleteSnapshot /app/tomtest/ coo
Confirm the snapshot is gone
$ hdfs dfs -ls /app/tomtest/.snapshot
Found 2 items
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/sipo
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/tap2
Voila
To delete a snapshot, the format is hdfs dfs -deleteSnapshot <path> <snapshotName>, e.g.
hdfs dfs -deleteSnapshot /app/tomtest/ coo
Notice the space between the path and the snapshot name, and the omission of .snapshot. Like all dot files, the .snapshot directory is not shown by a normal hdfs listing.
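If you only have the full path from the .snapshot listing, a small helper can split it into the two arguments that -deleteSnapshot expects. This is a minimal sketch: split_snapshot_path is a hypothetical name of my own, not an HDFS command, and it assumes the path contains no whitespace.

```shell
# Split a full snapshot path such as /app/tomtest/.snapshot/coo into the two
# arguments that -deleteSnapshot expects: "<path> <snapshotName>".
# split_snapshot_path is a hypothetical helper name, not an HDFS command.
split_snapshot_path() {
  case "$1" in
    */.snapshot/?*)
      ssp_dir="${1%/.snapshot/*}"     # everything before /.snapshot/
      ssp_name="${1##*/.snapshot/}"   # everything after /.snapshot/
      ssp_name="${ssp_name%%/*}"      # keep only the snapshot root name
      printf '%s %s\n' "${ssp_dir:-/}" "$ssp_name"
      ;;
    *)
      echo "not a snapshot path: $1" >&2
      return 1
      ;;
  esac
}
```

It could then be used as hdfs dfs -deleteSnapshot $(split_snapshot_path /app/tomtest/.snapshot/coo), where the unquoted substitution is deliberate so that the result splits into two arguments.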
A plain -ls of the directory returns no results
$ hdfs dfs -ls /app/tomtest/
Listing the .snapshot path explicitly shows the 2 remaining snapshots
$ hdfs dfs -ls /app/tomtest/.snapshot
Found 2 items
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/sipo
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/tap2
Is there a command to disallow snapshots for all the subdirectories? Yes, there is -disallowSnapshot, but it only succeeds after you have deleted all the snapshots in the directory, as the demo here shows. Better still, you can disallow snapshots at directory creation time.
$ hdfs dfsadmin -disallowSnapshot /app/tomtest/
disallowSnapshot: The directory /app/tomtest has snapshot(s). Please redo the operation after removing all the snapshots.
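To avoid running into that error, you could check the snapshot count first. The sketch below assumes the lsSnapshottableDir column layout shown earlier (path in the last column, snapshot count two columns before it) and paths without whitespace; snapshot_count and safe_disallow are hypothetical helper names.

```shell
# Print the snapshot count of one snapshottable directory.
# Assumes the lsSnapshottableDir layout shown earlier: the path is the last
# column and the snapshot count is the third column from the end.
snapshot_count() {
  sc_dir="${1%/}"
  hdfs lsSnapshottableDir |
    awk -v d="${sc_dir:-/}" '$NF == d { print $(NF-2); found = 1 } END { exit !found }'
}

# Disallow snapshots only when none remain (hypothetical helper).
safe_disallow() {
  sd_count=$(snapshot_count "$1") || { echo "$1 is not snapshottable" >&2; return 1; }
  if [ "$sd_count" -gt 0 ]; then
    echo "$1 still has $sd_count snapshot(s); delete them first" >&2
    return 1
  fi
  hdfs dfsadmin -disallowSnapshot "$1"
}
```

For example, safe_disallow /app/tomtest would refuse while sipo and tap2 still exist, and would run -disallowSnapshot once they are gone.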
The only way I have found that works for me, and lets me have a cup of coffee, is to first list all the snapshots and then copy-paste the delete commands. Even if there are 60 snapshots it works: I only come back when the snapshots are gone, or better still I do something else while the deletion is going on. It is not automated, though. For example:
Pasted into a shell, the commands below run one after another
hdfs dfs -deleteSnapshot /app/tomtest/ sipo
.....
....
hdfs dfs -deleteSnapshot /app/tomtest/ tap2
-deleteSnapshot skips trash by default!
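The list-then-delete routine could also be scripted. This is only a sketch under assumptions: delete_all_snapshots is a hypothetical helper name, it parses the -ls output format shown above, and it relies on snapshot names containing no whitespace.

```shell
# Delete every snapshot of one snapshottable directory, one after another.
# delete_all_snapshots is a hypothetical helper name; the awk parsing assumes
# snapshot names contain no whitespace.
delete_all_snapshots() {
  das_dir="${1%/}"
  # Directory entries under .snapshot are the snapshots; keep the last path part.
  das_names=$(hdfs dfs -ls "$das_dir/.snapshot" </dev/null |
    awk '$1 ~ /^d/ { n = split($NF, part, "/"); print part[n] }')
  while IFS= read -r das_name; do
    [ -n "$das_name" ] || continue
    # Stop at the first failure so nothing is skipped silently.
    hdfs dfs -deleteSnapshot "${das_dir:-/}" "$das_name" </dev/null || return 1
  done <<EOF
$das_names
EOF
}
```

Calling delete_all_snapshots /app/tomtest would issue the same -deleteSnapshot commands as above for sipo and tap2, in listing order.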
Happy hadooping
Created 07-26-2021 10:13 PM
Thanks for the detailed steps, Shelton.
To make sure I understood properly:
When there are snapshottable directories at
hdfs:///user/foo/1/1-2
hdfs:///user/foo/3/3-2
I expected something like:
hdfs dfs -disallowSnapshot -subDirsIncluded -recursively hdfs:///user/foo
without the need to know where the snapshots are, but you mean that there is no such simple command, right?
Instead, I should list all the snapshottable directories and delete each snapshot under those directories.
Created 07-26-2021 10:51 PM
@sipocootap2
Unfortunately, you cannot disallow snapshots in a snapshottable directory that already has snapshots!
Yes, you will have to list and delete the snapshots. Even if a snapshot contains subdirectories, you only pass the snapshot's root name to the hdfs dfs -deleteSnapshot command. Say you had
$ hdfs dfs -ls /app/tomtest/.snapshot
Found 2 items
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/sipo/work/john
drwxr-xr-x - tom developer 0 2021-07-26 23:14 /app/tomtest/.snapshot/tap2/work/peter
You would simply delete the snapshots like
$ hdfs dfs -deleteSnapshot /app/tomtest/ sipo
$ hdfs dfs -deleteSnapshot /app/tomtest/ tap2
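For the original case of a retiring user, the whole sequence (find the snapshottable directories under the user's root, delete their snapshots, disallow snapshots, then remove the root) might be sketched as follows. remove_user_dir is a hypothetical helper name. The sketch assumes a plain absolute path such as /user/foo, paths and snapshot names without whitespace or glob characters, the output layouts shown earlier in this thread, and a caller who is allowed to run dfsadmin.

```shell
# Remove a user's root directory even when some subdirectories are snapshottable.
# remove_user_dir is a hypothetical helper; it assumes paths and snapshot names
# contain no whitespace or glob characters.
remove_user_dir() {
  rud_root="${1%/}"
  [ -n "$rud_root" ] || { echo "refusing to remove /" >&2; return 1; }
  # Snapshottable directories at or below the root (path is the last column).
  rud_dirs=$(hdfs lsSnapshottableDir |
    awk -v r="$rud_root" '$NF == r || index($NF, r "/") == 1 { print $NF }')
  while IFS= read -r rud_dir; do
    [ -n "$rud_dir" ] || continue
    rud_names=$(hdfs dfs -ls "$rud_dir/.snapshot" </dev/null |
      awk '$1 ~ /^d/ { n = split($NF, part, "/"); print part[n] }')
    for rud_name in $rud_names; do
      hdfs dfs -deleteSnapshot "$rud_dir" "$rud_name" </dev/null || return 1
    done
    # Only succeeds once the directory has no snapshots left.
    hdfs dfsadmin -disallowSnapshot "$rud_dir" </dev/null || return 1
  done <<EOF
$rud_dirs
EOF
  hdfs dfs -rm -r "$rud_root"
}
```

You would call it as remove_user_dir /user/foo. It stops at the first failing step, so the final -rm -r only runs after every snapshottable subdirectory has been cleaned up.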