If I delete a file in HDFS that has a replication factor of 3, will all three copies be deleted?

1 ACCEPTED SOLUTION

@Sudharsan Ganeshkumar

Any file stored in HDFS is split into blocks (chunks of data), and each block is replicated 3 times by default. When you delete a file, you remove the metadata in the NameNode that points to its blocks. Blocks are deleted once nothing in the NameNode metadata references them, at which point all replicas are removed. This matters because snapshots or files in Trash folders may still reference the blocks; in that case, the blocks won't be deleted until those snapshots and the files under Trash are also removed.
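As a rough sketch of the sequence described above, the commands below inspect a file's blocks, delete the file while bypassing Trash, and then drop a snapshot that may still reference its blocks. The file path, directory, and snapshot name are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: FILE, SNAP_DIR and SNAP_NAME are hypothetical placeholders.
FILE="${FILE:-/user/root/testsnaps/data.txt}"
SNAP_DIR="${SNAP_DIR:-/user/root/testsnaps}"
SNAP_NAME="${SNAP_NAME:-snap1}"

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Show the file's blocks and where their replicas live.
run hdfs fsck "$FILE" -files -blocks -locations

# Remove the file's metadata and bypass Trash, so no Trash entry keeps a reference.
run hdfs dfs -rm -skipTrash "$FILE"

# Remove a snapshot that may still reference the file's blocks.
run hdfs dfs -deleteSnapshot "$SNAP_DIR" "$SNAP_NAME"
```

Once the last reference is gone, the NameNode schedules every replica of each block for removal on the DataNodes. That cleanup is asynchronous, so disk space may take a little while to be reclaimed.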

HTH

*** If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.

6 REPLIES

@Sudharsan Ganeshkumar Perhaps a snapshot or a copy of the file is pointing to the same blocks. Please remember to log in and accept the answer if you think it has addressed your question.

@Felix Albani

If I delete my files from Trash, is there any possibility that a reference to them still exists anywhere?

Expert Contributor

@Sudharsan Ganeshkumar

No. If you delete your files from Trash and no snapshot exists that contains the same file, the file will have no references left in the NameNode metadata and will be removed completely.
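To check the two places a reference could still live, a small sketch like the one below builds the paths to look at. It assumes the default Trash layout of /user/<user>/.Trash/Current/<original path>; the function names and the file data.txt are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the default Trash layout
# /user/<user>/.Trash/Current/<original path>. data.txt is hypothetical.

# Path where a deleted file would sit in a user's current Trash.
trash_path() {
    # $1 = user name, $2 = absolute HDFS path of the deleted file
    echo "/user/$1/.Trash/Current$2"
}

# Path of the same file inside a named snapshot of a snapshottable directory.
snapshot_path() {
    # $1 = snapshottable directory, $2 = snapshot name,
    # $3 = absolute path of a file under $1
    echo "$1/.snapshot/$2/${3#"$1"/}"
}

trash_path gulshad /user/gulshad/data.txt
snapshot_path /user/root/testsnaps snap1 /user/root/testsnaps/data.txt
```

Passing either printed path to hdfs dfs -ls shows whether that reference still exists. Trash contents that have already been checkpointed sit under a timestamped directory in .Trash rather than under Current, so it may be worth listing the whole .Trash directory as well.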

@gulshad.ansari

Where can we find the snapshot? In exactly which location?

Expert Contributor

@Sudharsan Ganeshkumar

Snapshots are stored under the .snapshot directory of the path that was snapshotted.

For example, if you take a snapshot of /user/root, it is stored in the /user/root/.snapshot directory. An example is given below.

[hdfs@sandbox ~]$ hdfs dfsadmin -allowSnapshot /user/root/testsnaps
Allowing snaphot on /user/root/testsnaps succeeded
[root@sandbox ~]# hdfs dfs -createSnapshot /user/root/testsnaps snap1
Created snapshot /user/root/testsnaps/.snapshot/snap1
[gulshad@sandbox ~]$ hdfs dfs -createSnapshot /user/gulshad
Created snapshot /user/gulshad/.snapshot/s20180831-145829.441

To list all snapshottable directories, run the command below.

[root@sandbox ~]# sudo -su hdfs hdfs lsSnapshottableDir
drwxr-xr-x 0 root    hdfs 0 2018-07-26 05:29 3 65536 /user/root/testsnaps
drwxr-xr-x 0 hdfs    hdfs 0 2018-08-01 14:58 1 65536 /proj/testsnap
drwxr-xr-x 0 gulshad hdfs 0 2018-08-01 14:58 1 65536 /user/gulshad
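
Once you know the snapshottable directory, a sketch along these lines shows how its snapshots might be browsed. The directory and snapshot name follow the example above; data.txt is a hypothetical file, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: data.txt is a hypothetical file; SNAP_DIR and SNAP_NAME
# follow the example in this reply.
SNAP_DIR="${SNAP_DIR:-/user/root/testsnaps}"
SNAP_NAME="${SNAP_NAME:-snap1}"

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# List the snapshots taken of the directory.
run hdfs dfs -ls "$SNAP_DIR/.snapshot"

# Report what changed between the snapshot and the current state (".").
run hdfs snapshotDiff "$SNAP_DIR" "$SNAP_NAME" .

# Copy a file back out of the read-only snapshot into the live directory.
run hdfs dfs -cp "$SNAP_DIR/.snapshot/$SNAP_NAME/data.txt" "$SNAP_DIR/"
```

The .snapshot directory does not show up in a normal listing of the parent directory, so it has to be addressed by its full path as shown here.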