Member since 10-26-2015
10 Posts | 29 Kudos Received | 4 Solutions
My Accepted Solutions
Views | Posted |
---|---|
1629 | 12-02-2016 09:09 PM |
2505 | 06-03-2016 12:05 AM |
15723 | 06-02-2016 10:09 PM |
3063 | 05-28-2016 12:40 AM |
07-06-2021 10:26 AM
@diplompils Getting false from the recoverLease command does not necessarily mean the file is lost. Usually a file that still has a lease acquired is not removed unless it is explicitly deleted with the rm command. You can try the following:
hdfs debug recoverLease -path <file> -retries 10
Or you may check https://issues.apache.org/jira/browse/HDFS-8576
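The retry suggestion above can be wrapped in a small helper. This is only a sketch: `recover_lease`, the `run` wrapper, the `DRY_RUN` switch, and the path `/tmp/example.log` are made-up names for illustration, while `hdfs debug recoverLease` and `hdfs fsck` are the stock HDFS commands. By default it only prints what it would run, so nothing touches the cluster until `DRY_RUN=0` is set.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Ask the NameNode to recover the lease, then inspect the file with fsck.
#   $1 = HDFS path of the file, $2 = number of retries (defaults to 10)
recover_lease() {
  rl_file="$1"
  rl_retries="${2:-10}"
  run hdfs debug recoverLease -path "$rl_file" -retries "$rl_retries"
  # fsck shows the file's blocks and whether it is still open for write.
  run hdfs fsck "$rl_file" -files -blocks -locations -openforwrite
}
```

For example, `recover_lease /tmp/example.log 10` in dry-run mode prints the recoverLease command followed by the fsck command; the fsck output is what tells you whether the blocks are still there.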
05-28-2016 12:22 AM
3 Kudos
Answers by @Sagar Shimpi and @Lester Martin look pretty good to me. Some further explanations:
How do snapshots help with Disaster Recovery? What are the best practices for using snapshots for DR purposes, especially when data is stored directly on HDFS, in Hive, or in HBase?

If you're currently using distcp for DR (i.e., using distcp to copy data from one cluster to your backup cluster), you have the option of using snapshots to do incremental backups, which improves distcp performance and efficiency. More specifically, you can take snapshots in both the source and the backup cluster and use the -diff option of the distcp command. Then, instead of blindly copying all the data, distcp first computes the difference between the given snapshots and copies only that difference to the backup cluster.

As I understand it, no data is copied for snapshots; only metadata is maintained for the blocks added, modified, or deleted. If that's the case, what happens when the command hdfs dfs -rm /data/snapshot-dir/file1 is run? Will the file be moved to the trash? If so, will the snapshot maintain the reference to the entry in the trash? Will trash eviction have any impact in this case?

Yes, if you have not skipped the trash, the file will be moved to the trash, and in the meantime you can still access the file through the corresponding snapshot path.

How do snapshots work with HDFS quotas? For example, assume a directory with a quota of 1 GB and snapshots enabled. Assume the directory is close to its full quota and a user deletes a large file to make room for another dataset. Will the new data be allowed to be saved to the directory, or will the operation be stopped because the quota limit has been exceeded?

No, if the file belongs to a snapshot (i.e., the file was created before a snapshot was taken), deleting it will not release quota. You may have to delete some old snapshots or increase your quota limit.
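To make the last two answers concrete, here is a rough sketch of the commands involved. The function names, the `run` wrapper, the `DRY_RUN` switch, the snapshot name `s1`, and the `2g` quota value are all assumptions for illustration; the underlying `hdfs dfs -count -q`, `hdfs dfs -cat`, `hdfs dfs -deleteSnapshot`, and `hdfs dfsadmin -setSpaceQuota` commands are standard. It prints the commands by default rather than running them.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build the read-only path under which a snapshot still exposes a file.
#   $1 = snapshottable directory, $2 = snapshot name, $3 = relative file path
snapshot_path() {
  printf '%s/.snapshot/%s/%s\n' "$1" "$2" "$3"
}

# Show quota and remaining quota for a directory.
show_quota() {
  run hdfs dfs -count -q -h "$1"
}

# Read a file that was removed from the live directory but is kept by a snapshot.
read_deleted_file() {
  run hdfs dfs -cat "$(snapshot_path "$1" "$2" "$3")"
}

# Option 1 for getting quota back: drop an old snapshot.
drop_snapshot() {
  run hdfs dfs -deleteSnapshot "$1" "$2"
}

# Option 2: raise the space quota ($2 is the new quota, e.g. 2g).
raise_space_quota() {
  run hdfs dfsadmin -setSpaceQuota "$2" "$1"
}
```

With the directory from the question, `read_deleted_file /data/snapshot-dir s1 file1` reads `/data/snapshot-dir/.snapshot/s1/file1`, and running `show_quota /data/snapshot-dir` before and after `drop_snapshot /data/snapshot-dir s1` is a simple way to see whether the snapshot was what kept the space accounted for.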
Also, in some old Hadoop versions you may find that snapshots affect the namespace quota usage in a strange way, i.e., sometimes deleting a file can increase the quota usage. This has been fixed in the latest version of HDP.
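For the snapshot-based incremental backup described in the first answer, one round could look roughly like the sketch below. The function name `incremental_backup`, the `run` wrapper, the `DRY_RUN` switch, the cluster URIs, and the snapshot names `s1`/`s2` are assumptions for illustration. As far as I know from the DistCp documentation, -diff has to be combined with -update, the "from" snapshot must already exist on both the source and the backup directory, and the backup directory must not have been modified since that snapshot was taken.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# One incremental backup round.
#   $1 = source dir, $2 = backup dir,
#   $3 = snapshot already present on both sides, $4 = new snapshot name
incremental_backup() {
  ib_src="$1"; ib_dst="$2"; ib_from="$3"; ib_to="$4"
  # 1. Freeze the current state of the source.
  run hdfs dfs -createSnapshot "$ib_src" "$ib_to" &&
  # 2. Copy only what changed between the two source snapshots.
  run hadoop distcp -update -diff "$ib_from" "$ib_to" "$ib_src" "$ib_dst" &&
  # 3. Snapshot the backup so it can serve as the base for the next round.
  run hdfs dfs -createSnapshot "$ib_dst" "$ib_to"
}
```

A dry run of `incremental_backup hdfs://src/data hdfs://dst/data s1 s2` prints the three commands in order. The steps are chained with `&&` so that the backup side is not snapshotted if the copy fails.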