I am trying to understand the distcp -delete option. Basically, each time I run distcp I would like to overwrite the destination directories.
The -overwrite option only works at the file level, so a file in the destination that has the same content as a file in the source but a different name would not be overwritten. I would like the same thing to happen at the directory level as well.
From the Hadoop docs:
Delete the files existing in the dst but not in src
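To make that description concrete, here is a minimal sketch of a mirroring copy. The paths /source and /destination and the helper name distcp_mirror are placeholders, not anything from a real cluster. The function only prints the command (note the leading echo), so it is safe to paste; remove the echo to actually run it.

```shell
#!/bin/sh
# Sketch only: distcp_mirror is a hypothetical helper name.
distcp_mirror() {
  # $1 = source directory, $2 = destination directory
  # -update copies files that are missing or differ in the destination.
  # -delete then removes anything under $2 that does not exist under $1.
  # DistCp accepts -delete only together with -update or -overwrite.
  echo hadoop distcp -update -delete "$1" "$2"
}

# Placeholder paths; prints the command instead of running it.
distcp_mirror /source /destination
```

Because files are matched by path rather than by content, a destination file with a different name counts as "not in src" even if its content is identical to some source file.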
I think that's the expected behaviour. For your scenario, I would suggest using DistCp with snapshot differences instead.
distcp -update -diff -delete /source /destination
How to Use This Feature
To use this feature, you should first make sure all assumptions are met. Typical steps are described as follows:
Issue a distcp command that copies everything from s0 to the target directory (the command line is like distcp -update <sourceDir>/.snapshot/s0 <targetDir>).
Then issue distcp -update -diff s0 s1 <sourceDir> <targetDir> to copy all changes between s0 and s1 to the target directory.
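The whole cycle might look roughly like the sketch below. The paths and the names snapshot_sync and RUN are placeholders I made up. The snapshot-creation steps are my reading of what -diff requires (a snapshot with the same name on both sides, and no changes on the target since it was taken), so check them against the docs for your Hadoop version. With RUN=echo the script only prints the commands.

```shell
#!/bin/sh
# Placeholder values; none of these come from a real cluster.
SRC=/source
DST=/destination
RUN=echo   # dry run: prints each command; set RUN= (empty) to execute

snapshot_sync() {
  # One-time setup: both directories must allow snapshots.
  $RUN hdfs dfsadmin -allowSnapshot "$SRC"
  $RUN hdfs dfsadmin -allowSnapshot "$DST"

  # Baseline: snapshot the source as s0 and copy that snapshot.
  $RUN hdfs dfs -createSnapshot "$SRC" s0
  $RUN hadoop distcp -update "$SRC/.snapshot/s0" "$DST"
  # Record the same state on the target under the same name.
  $RUN hdfs dfs -createSnapshot "$DST" s0

  # ... changes are made in the source directory here ...

  # Incremental: snapshot again and apply only the s0 -> s1 changes.
  $RUN hdfs dfs -createSnapshot "$SRC" s1
  $RUN hadoop distcp -update -diff s0 s1 "$SRC" "$DST"
  $RUN hdfs dfs -createSnapshot "$DST" s1
}

snapshot_sync
```

For the next round you would repeat the last three commands with s1 and a new snapshot name such as s2.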