I have a folder in HDFS that receives new files every day. I want to duplicate it so that whenever a new file arrives in the original folder, it is also copied (synced) to the duplicate folder.
Basically, I want to sync one folder with another in HDFS.
How can we achieve that in Hadoop?
@mbigelow I would go with syncing and scheduling the sync on a regular basis, but I am confused about how to use distcp and cron together.
Could you please give me an example of how we can achieve this?
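
For reference, here is a rough sketch of what the distcp plus cron combination could look like. It is a sketch under assumptions rather than a tested setup: the paths `/data/incoming` and `/data/incoming_copy`, the script name `sync_hdfs.sh`, the log location, and the `DRY_RUN` switch are all made up for illustration. It relies on the `-update` option of `hadoop distcp`, which copies only files that are missing from the target or differ from the source.

```shell
#!/bin/sh
# sync_hdfs.sh: one-way sync of an HDFS folder using DistCp.
# Both paths below are hypothetical placeholders; replace them with your own.
SRC="${SRC:-/data/incoming}"
DST="${DST:-/data/incoming_copy}"

# Print the command when DRY_RUN=1 (the default in this sketch),
# and actually run it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -update copies only files that are missing from the target or differ from it.
# Adding -delete would also remove target files that no longer exist in the source.
sync_folder() {
    run hadoop distcp -update "$1" "$2"
}

sync_folder "$SRC" "$DST"

# Example crontab entry (edit with `crontab -e`) to run the script every hour.
# cron uses a minimal PATH, so the hadoop binary may need its full path
# or PATH may need to be set inside the script.
# 0 * * * * DRY_RUN=0 /path/to/sync_hdfs.sh >> /tmp/sync_hdfs.log 2>&1
```

Running the script once by hand with the default `DRY_RUN=1` prints the distcp command without touching HDFS, which is a cheap way to check the paths before handing it to cron.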