What are BP and blk in fsck output? Can you explain what each part of the output means?

Contributor

BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316 len=2 repl=3 [DatanodeInfoWithStorage[192.0.0.9:1000,DS-730a75d3-046c-4254-990a-4eee9520424f,DISK], DatanodeInfoWithStorage[192.0.0.1:1000,DS-fc6ee5c7-e76b-4faa-b663-58a60240de4c,DISK], DatanodeInfoWithStorage[192.0.0.3:1000,DS-8ab81b26-309e-42d6-ae14-26eb88387cad,DISK]

 

What do BP and blk store, and why are they displayed in the output of my fsck command?

 

 

1 ACCEPTED SOLUTION

Mentor
fsck prints the full identifier of each block, which can be useful depending on what you're troubleshooting or investigating. Here's a breakdown:

BP-929597290-192.0.0.2-1439573305237 = This is a BlockPool (BP) ID. It marks which NameNode owns the block in question. You might recall that HDFS now supports federated namespaces, wherein a single DataNode may serve multiple NameNodes. The block pool ID is how each NameNode is uniquely identified as the owner of a block held on a DataNode. Even if you do not explicitly use federation, the block-pool concept is built into the identifier design of HDFS by default. See http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/Federation.html#Multip...

blk_1074084574_344316 = This is the block ID (blk_X_Y). Each block of every file is uniquely identified by a number X and a sub-number Y (the generation stamp, which acts as a version number for the block). More on block IDs and HDFS architecture can be read in the AOSA book: http://aosabook.org/en/hdfs.html
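To make the two parts above concrete, here is a minimal Python sketch that splits the leading identifier from the fsck line into its block pool ID, block ID, and generation stamp. The function name `parse_block` and the dictionary keys are made up for illustration. The reading of the three numbers inside the BP ID (a random number, the NameNode's IP, and a creation time in milliseconds since the epoch) is my understanding of how the ID is generated rather than something fsck reports, so treat it as an assumption.

```python
import re
from datetime import datetime, timezone

# Assumed layout (not stated by fsck itself):
#   BP-<random number>-<NameNode IP>-<creation time, ms since epoch>:blk_<block ID>_<generation stamp>
BLOCK_RE = re.compile(
    r"BP-(?P<rand>\d+)-(?P<nn_ip>[\d.]+)-(?P<created_ms>\d+)"
    r":blk_(?P<block_id>-?\d+)_(?P<gen_stamp>\d+)"
)

def parse_block(identifier):
    """Split a 'BP-...:blk_X_Y' identifier into its parts (hypothetical helper)."""
    m = BLOCK_RE.search(identifier)
    if m is None:
        raise ValueError("not a recognisable block identifier: %r" % identifier)
    created_ms = int(m.group("created_ms"))
    return {
        # The whole BP-... prefix is the block pool ID.
        "block_pool_id": "BP-{}-{}-{}".format(
            m.group("rand"), m.group("nn_ip"), m.group("created_ms")
        ),
        "namenode_ip": m.group("nn_ip"),
        # Assumption: the last BP field is a millisecond timestamp.
        "pool_created_utc": datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc),
        "block_id": int(m.group("block_id")),           # X in blk_X_Y
        "generation_stamp": int(m.group("gen_stamp")),  # Y in blk_X_Y
    }

if __name__ == "__main__":
    line = "BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316 len=2 repl=3"
    for key, value in parse_block(line).items():
        print(key, "=", value)
```

The regular expression tolerates a negative block ID because some older clusters carry blocks with negative IDs; that allowance is also an assumption and harmless if your IDs are all positive.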

DS-730a75d3-046c-4254-990a-4eee9520424f,DISK = This is a storage ID followed by a storage type. For the specified DN IP:PORT, it tells you which disk (hashed identifier) actually holds the data, and what type of storage that disk is (DISK). HDFS now supports tiered storage, where this comes in useful (among other things): http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
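If you need to work with the replica locations programmatically, a rough Python sketch like the one below can pull the DataNode address, storage ID, and storage type out of each `DatanodeInfoWithStorage[...]` entry. The helper name `parse_locations` is hypothetical, and the pattern is based only on the sample line in the question, so it may need adjusting for other Hadoop versions.

```python
import re

# Pattern inferred from the sample output in the question:
#   DatanodeInfoWithStorage[<host>:<port>,<storage ID>,<storage type>]
STORAGE_RE = re.compile(
    r"DatanodeInfoWithStorage\["
    r"(?P<host>[^:,\]]+):(?P<port>\d+),"
    r"(?P<storage_id>DS-[0-9a-fA-F-]+),"
    r"(?P<storage_type>\w+)\]"
)

def parse_locations(fsck_line):
    """Return one dict per replica location found in an fsck block line."""
    return [
        {
            "host": m.group("host"),
            "port": int(m.group("port")),
            "storage_id": m.group("storage_id"),      # which disk on that DataNode
            "storage_type": m.group("storage_type"),  # e.g. DISK
        }
        for m in STORAGE_RE.finditer(fsck_line)
    ]

if __name__ == "__main__":
    line = (
        "BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316 len=2 repl=3 "
        "[DatanodeInfoWithStorage[192.0.0.9:1000,DS-730a75d3-046c-4254-990a-4eee9520424f,DISK], "
        "DatanodeInfoWithStorage[192.0.0.1:1000,DS-fc6ee5c7-e76b-4faa-b663-58a60240de4c,DISK], "
        "DatanodeInfoWithStorage[192.0.0.3:1000,DS-8ab81b26-309e-42d6-ae14-26eb88387cad,DISK]"
    )
    for replica in parse_locations(line):
        print(replica)
```

For the line in the question this yields three entries, one per replica, which lines up with repl=3.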
