Is it possible to copy HDFS snapshots to another cluster and use them there (via distCP, for instance)? What does this add compared to copying the data directly (not from snapshots) via distCP for DR?
The big benefit of using snapshots with distCP is that later runs against the snapshotted directory can be incremental: distCP uses the difference between two snapshots to copy only what changed, rather than comparing the whole directory tree again. Jing provides some context around this in the second answer here. The work to implement this is discussed in HDFS-7535, which provides more context. It was first included in Hadoop 2.7.0.
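As a rough sketch of what such an incremental workflow could look like, the Python below only builds the command lines rather than running them. The cluster URIs, snapshot names, and helper function names are all hypothetical. It assumes both directories have already been made snapshottable (`hdfs dfsadmin -allowSnapshot`), that the target still holds the previous snapshot, and that the target has not been modified since that snapshot was taken.

```python
from typing import List, Optional

Command = List[str]


def create_snapshot(directory: str, name: str) -> Command:
    # Takes a named snapshot of a directory that is already snapshottable.
    return ["hdfs", "dfs", "-createSnapshot", directory, name]


def full_copy(source: str, target: str, snapshot: str) -> Command:
    # First run: copy the read-only snapshot view so the copy is consistent.
    return ["hadoop", "distcp", "-update",
            f"{source}/.snapshot/{snapshot}", target]


def incremental_copy(source: str, target: str,
                     from_snapshot: str, to_snapshot: str) -> Command:
    # Later runs: -diff must be combined with -update, and the target is
    # assumed to be unchanged since from_snapshot.
    return ["hadoop", "distcp", "-update", "-diff",
            from_snapshot, to_snapshot, source, target]


def backup_plan(source: str, target: str,
                previous: Optional[str], current: str) -> List[Command]:
    # Snapshot the source, copy, then snapshot the target under the same
    # name so the next run has a common baseline on both clusters.
    plan = [create_snapshot(source, current)]
    if previous is None:
        plan.append(full_copy(source, target, current))
    else:
        plan.append(incremental_copy(source, target, previous, current))
    plan.append(create_snapshot(target, current))
    return plan


if __name__ == "__main__":
    # Hypothetical cluster addresses and snapshot names.
    src = "hdfs://prod-nn:8020/data"
    dst = "hdfs://dr-nn:8020/data"
    for cmd in backup_plan(src, dst, None, "s1") + backup_plan(src, dst, "s1", "s2"):
        print(" ".join(cmd))
```

Taking the same-named snapshot on the target after each copy is what keeps the two sides aligned for the next `-diff` run. If the target is written to in between, the incremental copy should not be expected to work and a full copy would be needed again.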