HDFS snapshots: is the file data backed up as well?
Labels: HDFS
Created ‎07-15-2020 07:36 PM
Can we also back up the data inside the HDFS files, or just the directories?
Created ‎07-15-2020 09:51 PM
HDFS Snapshots are read-only point-in-time copies of the file system. Snapshots can be taken on a subtree of the file system or the entire file system. Some common use cases of snapshots are data backup, protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient:
- Snapshot creation is instantaneous: the cost is O(1) excluding the inode lookup time.
- Additional memory is used only when modifications are made relative to a snapshot: memory usage is O(M), where M is the number of modified files/directories.
- Blocks in datanodes are not copied: the snapshot files record the block list and the file size. There is no data copying.
- Snapshots do not adversely affect regular HDFS operations: modifications are recorded in reverse chronological order so that the current data can be accessed directly. The snapshot data is computed by subtracting the modifications from the current data.
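To illustrate how a snapshot can preserve file contents without copying any blocks, here is a minimal toy model in Python. It is only a sketch of the idea described above, not the actual NameNode implementation: the class `ToyNamespace`, its methods, and the paths are all hypothetical, and it simplifies by treating every write as allocating a fresh block and by recording the old metadata in every existing snapshot.

```python
class ToyNamespace:
    """Toy model of a snapshottable directory (hypothetical, not real HDFS code)."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes, stands in for DataNode storage
        self.files = {}       # path -> tuple of block ids (current metadata)
        self.snapshots = {}   # snapshot name -> {path: metadata at snapshot time}
        self._next_block = 0

    def create_snapshot(self, name):
        # O(1): nothing is copied, only an empty change record is registered.
        self.snapshots[name] = {}

    def write(self, path, data):
        # Simplification: every write stores the data in a brand-new block.
        self._record_old_state(path)
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.files[path] = (block_id,)
        self._collect_garbage()

    def delete(self, path):
        self._record_old_state(path)
        del self.files[path]
        self._collect_garbage()

    def read_snapshot(self, name, path):
        # Snapshot view = current metadata with the recorded changes undone.
        diff = self.snapshots[name]
        meta = diff[path] if path in diff else self.files.get(path)
        if meta is None:
            raise FileNotFoundError(path)
        return b"".join(self.blocks[b] for b in meta)

    def _record_old_state(self, path):
        # Only the first change to a path after a snapshot is recorded, so the
        # extra memory grows with the number of modified paths (the O(M) above).
        old = self.files.get(path)  # None means "did not exist at snapshot time"
        for diff in self.snapshots.values():
            diff.setdefault(path, old)

    def _collect_garbage(self):
        # A block is dropped only when neither the current namespace nor any
        # snapshot still references it.
        live = {b for meta in self.files.values() for b in meta}
        for diff in self.snapshots.values():
            for meta in diff.values():
                if meta is not None:
                    live.update(meta)
        for block_id in list(self.blocks):
            if block_id not in live:
                del self.blocks[block_id]


ns = ToyNamespace()
ns.write("/data/a.txt", b"v1")
ns.create_snapshot("s0")
ns.write("/data/a.txt", b"v2")   # overwrite after the snapshot
ns.delete("/data/a.txt")         # then delete the file entirely
snapshot_copy = ns.read_snapshot("s0", "/data/a.txt")
print(snapshot_copy)             # b'v1'
```

In this sketch the snapshot never duplicates the bytes of `a.txt`. It only keeps a reference to the original block, which is why that block survives the later overwrite and delete while the unreferenced `v2` block is discarded.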
Created ‎07-15-2020 11:50 PM
@Govins28 It says that snapshots can be used for data backup, but look at this point: "Blocks in datanodes are not copied: the snapshot files record the block list and the file size. There is no data copying." It says here that there is no data copying.
Created ‎07-15-2020 11:59 PM
@Mondi: Yes, that point refers to the metadata recorded at the moment the snapshot is taken. No blocks are copied, but the blocks a snapshot references are kept, so you can restore the data from the snapshot at any time while the snapshot exists.
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
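As a rough sketch of what restoring from a snapshot looks like in practice, the Python helper below only builds the `hdfs` command lines and prints them; it does not run anything. The helper name `snapshot_restore_commands`, the directory `/data/sales`, the snapshot name `s0`, and the file `2020/orders.csv` are hypothetical placeholders. The commands themselves (`-allowSnapshot`, `-createSnapshot`, and copying out of the `.snapshot` directory with `-cp -ptopax`) follow the Apache documentation linked above.

```python
import posixpath
import shlex


def snapshot_restore_commands(snapshottable_dir, snapshot_name, relative_path):
    """Return the hdfs command lines (as argv lists) for a snapshot-and-restore cycle."""
    # Snapshots are exposed under the read-only ".snapshot" path of the directory.
    source = posixpath.join(snapshottable_dir, ".snapshot", snapshot_name, relative_path)
    target = posixpath.join(snapshottable_dir, relative_path)
    return [
        # Run once by an administrator, before any data is lost:
        ["hdfs", "dfsadmin", "-allowSnapshot", snapshottable_dir],
        # Take the point-in-time snapshot:
        ["hdfs", "dfs", "-createSnapshot", snapshottable_dir, snapshot_name],
        # Later, after an accidental delete or overwrite, copy the file back,
        # preserving timestamps, ownership, permissions, ACLs and XAttrs:
        ["hdfs", "dfs", "-cp", "-ptopax", source, target],
    ]


commands = snapshot_restore_commands("/data/sales", "s0", "2020/orders.csv")
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The last command copies file contents, not just directory entries, which is the sense in which a snapshot holds the data as well as the directory structure.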
