HDFS Snapshot

New Contributor

Can someone explain HDFS snapshots to me? I know a snapshot is a point-in-time copy of the file system, but does that mean it keeps another copy of the data? Let's say I have used 10 TB out of 100 TB and taken an HDFS snapshot of the / directory.


1. Does it keep another copy of the 10 TB under /.snapshot?

2. If the snapshot is only metadata, how does the NN reconstruct the data when data blocks are deleted from the DataNodes? Or wouldn't it delete the data blocks at all?


I'm confused. Can someone explain this?





Rising Star
1. No. Creating a snapshot is a metadata-only operation; no data blocks are copied.

2. Once you make a directory snapshottable and take a snapshot of it, the blocks belonging
to the underlying files are not deleted for as long as a snapshot still references them.
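
To make the two points above concrete, here is a toy Python model of the bookkeeping. It is only a sketch under simplifying assumptions, not how the NameNode is actually implemented: `ToyNameNode` and all of its methods are hypothetical names, and copying the whole path-to-block mapping on each snapshot is a shortcut for illustration (real HDFS records changes relative to the snapshot rather than copying the namespace). What it does show is that a snapshot holds block references rather than block data, and that a block only becomes eligible for deletion once neither the live namespace nor any snapshot refers to it.

```python
class ToyNameNode:
    """Hypothetical toy model: each path maps to a tuple of block IDs."""

    def __init__(self):
        self.live = {}       # path -> block IDs in the current namespace
        self.snapshots = {}  # snapshot name -> frozen {path: block IDs}

    def write(self, path, block_ids):
        self.live[path] = tuple(block_ids)

    def create_snapshot(self, name):
        # Only metadata (the path -> block ID mapping) is recorded.
        # No block data is copied.
        self.snapshots[name] = dict(self.live)

    def delete(self, path):
        # Removes the file from the live namespace only.
        del self.live[path]

    def delete_snapshot(self, name):
        del self.snapshots[name]

    def referenced_blocks(self):
        """Blocks that must be kept because something still points at them."""
        refs = set()
        for mapping in [self.live, *self.snapshots.values()]:
            for blocks in mapping.values():
                refs.update(blocks)
        return refs


nn = ToyNameNode()
nn.write("/data/a", [1, 2])
nn.write("/data/b", [3])
nn.create_snapshot("s1")

# Deleting the file does not free its blocks: snapshot s1 still references them.
nn.delete("/data/a")
after_delete = nn.referenced_blocks()

# Once the snapshot is gone, blocks 1 and 2 are unreferenced and can be reclaimed.
nn.delete_snapshot("s1")
after_snapshot_delete = nn.referenced_blocks()
```

In terms of the 10 TB example, taking the snapshot adds no extra 10 TB; extra space is only held when files that the snapshot still references are later deleted or changed.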

Master Guru
In addition to Dice's notes, please also read the design and efficiency overview; it will help you gain a better understanding of the feature.