@kiranpune
DistCp (distributed copy) is a tool used for large inter/intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into the input to map tasks, each of which will copy a partition of the files specified in the source list. That is the basic description.
But you can use different command-line options when running DistCp; see the official DistCp documentation. Below are a few options for your different use cases.
OPTIONS
-append: Incremental copy of a file with the same name but different length (only valid together with -update)
-update: Overwrite if source and destination differ in size, block size, or checksum
-overwrite: Overwrite destination
-delete: Delete the files existing in the destination but not in the source (only valid together with -update or -overwrite)
I think you can schedule or script a daily copy, for example with cron.
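As a rough sketch of what such a daily script could look like: the NameNode addresses (nn1, nn2), the paths, the script name and the log file below are all placeholders I made up, so replace them with your own. It uses -update together with -delete from the list above so the destination mirrors the source; pick different options if your use case differs. It defaults to a dry run that only prints the command.

```shell
#!/bin/sh
# Sketch of a daily DistCp sync. SRC and DST are hypothetical placeholders.
SRC="${SRC:-hdfs://nn1:8020/data/source}"
DST="${DST:-hdfs://nn2:8020/data/target}"
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# Print the DistCp command line for a given source and destination.
# -update copies only files that differ; -delete removes files from the
# destination that no longer exist in the source.
distcp_cmd() {
    echo "hadoop distcp -update -delete $1 $2"
}

# Run the sync, or just print it in dry-run mode.
# Note: the unquoted $cmd relies on word splitting, so paths must not contain spaces.
daily_sync() {
    cmd=$(distcp_cmd "$SRC" "$DST")
    if [ "$DRY_RUN" = "1" ]; then
        echo "DRY RUN: $cmd"
    else
        $cmd
    fi
}

daily_sync

# Example crontab entry, running at 02:00 every day
# (script path and log path are assumptions):
# 0 2 * * * DRY_RUN=0 /path/to/daily_distcp.sh >> /tmp/daily_distcp.log 2>&1
```

Run it once by hand in dry-run mode to check the printed command before you put it in cron.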