How to recover missing HDFS blocks after deleting a DataNode data directory by mistake
- Labels: HDFS
Created on ‎04-11-2016 07:35 PM - edited ‎09-16-2022 03:13 AM
Hi,
The DataNode has three directories for storing data: /data/1, /data/2, and /data/3. I deleted /data/1 on the DataNode by mistake, and HDFS now shows missing blocks. I copied the data from /data/3 to /data/1, but it didn't work.
Thanks and regards,
leezy
Created ‎04-11-2016 09:33 PM
From the context, I'm assuming you have set up a single-node test cluster?
HDFS replicates data between different nodes; /data/1, /data/2, and /data/3 are just different drives on one node. HDFS uses each of those drives to store blocks and replicates those blocks to other nodes in the cluster, not to the other drives on the same node.
Deleting /data/1 deleted the blocks on that drive, and /data/2 and /data/3 won't have those blocks, which is why copying /data/3 into /data/1 did not help. If you have more than one node, the other nodes hold replicas of the blocks that were stored on /data/1. In that case, HDFS detects that those replicas are gone the next time the DataNode checks in and automatically starts repairing the under-replicated blocks from the surviving copies, likely spreading the new replicas across the available drives.
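To make the difference concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not HDFS code: the block names, node names, and the `classify` helper are all made up for illustration, and real block placement is decided by the NameNode.

```python
def classify(placement, lost):
    """Label each block after some (node, drive) locations are wiped.

    placement: block id -> set of (node, drive) pairs holding a replica.
    lost:      set of (node, drive) pairs whose contents were deleted.
    """
    status = {}
    for block, locations in placement.items():
        survivors = locations - lost
        if not survivors:
            # No replica left anywhere: nothing to copy from.
            status[block] = "missing"
        elif len(survivors) < len(locations):
            # At least one replica survives, so it can be re-replicated.
            status[block] = "under-replicated"
        else:
            status[block] = "healthy"
    return status


# Hypothetical single-node cluster: every block has exactly one replica.
single_node = {
    "blk_1": {("node1", "/data/1")},
    "blk_2": {("node1", "/data/2")},
    "blk_3": {("node1", "/data/3")},
}

# Hypothetical two-node cluster: each block also has a replica on node2.
two_nodes = {
    "blk_1": {("node1", "/data/1"), ("node2", "/data/1")},
    "blk_2": {("node1", "/data/2"), ("node2", "/data/3")},
    "blk_3": {("node1", "/data/3"), ("node2", "/data/2")},
}

# The accidental delete from the question.
lost = {("node1", "/data/1")}

print(classify(single_node, lost))
print(classify(two_nodes, lost))
```

In the single-node case the block that lived on /data/1 has no surviving replica, and nothing on /data/2 or /data/3 can stand in for it. In the two-node case the same block is only short one replica.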
"Missing blocks" implies that the only copy of a block is gone, so in that case the only way to recover it is to attempt drive recovery on that drive. This is what happens on a single-node test cluster, hence the assumption above.
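If you want to see which files are affected, `hdfs fsck` with the `-list-corruptfileblocks` option reports them. Below is a minimal Python wrapper around that command, assuming the `hdfs` client is on your PATH. The function names are my own, and the sketch returns the raw fsck output rather than parsing it.

```python
import shutil
import subprocess


def fsck_command(path="/"):
    """Build the fsck invocation that lists files with missing or corrupt blocks."""
    return ["hdfs", "fsck", path, "-list-corruptfileblocks"]


def list_corrupt_file_blocks(path="/"):
    """Run fsck and return its raw output, or None if no hdfs client is installed."""
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(
        fsck_command(path),
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves instead of raising on a non-zero exit
    )
    return result.stdout
```

Calling `list_corrupt_file_blocks("/")` on a machine with a configured client should return the report as a string. This only tells you what was lost; it does not bring the blocks back.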
