What are BP and blk in fsck output? Can you explain what each part of the output means?


BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316 len=2 repl=3 [DatanodeInfoWithStorage[192.0.0.9:1000,DS-730a75d3-046c-4254-990a-4eee9520424f,DISK], DatanodeInfoWithStorage[192.0.0.1:1000,DS-fc6ee5c7-e76b-4faa-b663-58a60240de4c,DISK], DatanodeInfoWithStorage[192.0.0.3:1000,DS-8ab81b26-309e-42d6-ae14-26eb88387cad,DISK]

 

What do BP and blk store, and why are they displayed by my fsck command?

Accepted Solutions

Re: what is BP, Blk in fsck output? Can you explain what each thing means in the output?

fsck prints the full identifier of each block, which can be useful depending on what you are troubleshooting or investigating. Here's a breakdown:

BP-929597290-192.0.0.2-1439573305237 = This is the block pool (BP) ID. It marks which NameNode owns the block in question. You might recall that HDFS now supports federated namespaces, in which a single DataNode may serve multiple NameNodes. The block pool ID is how the NameNode that owns a given block is uniquely identified. Even if you do not explicitly use federation, the block pool concept is built into the identifier design of HDFS by default. See http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/Federation.html#Multip...

blk_1074084574_344316 = This is the block ID (blk_X_Y). Each block of every file is uniquely identified by a number X and a sub-number Y (the generation stamp). You can read more about block IDs and HDFS architecture in the AOSA book: http://aosabook.org/en/hdfs.html
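
If you want to pull these two parts apart in a script, here is a minimal Python sketch. The function name parse_block is hypothetical. Reading the block pool ID as BP-<random number>-<NameNode IP>-<creation time in milliseconds> is my understanding of how the NameNode generates it, not something the fsck output itself states, so treat those sub-fields as an assumption.

```python
import re
from datetime import datetime, timezone

# Assumed layout (see the note above):
#   BP-<random>-<NameNode IP>-<creation time in ms>:blk_<block id>_<generation stamp>
BLOCK_RE = re.compile(
    r"(?P<pool>BP-(?P<rand>\d+)-(?P<nn_ip>[\d.]+)-(?P<created_ms>\d+))"
    r":blk_(?P<block_id>-?\d+)_(?P<gen_stamp>\d+)"
)


def parse_block(identifier):
    """Split a 'BP-...:blk_X_Y' string from fsck output into its parts."""
    m = BLOCK_RE.match(identifier)
    if m is None:
        raise ValueError("not a block identifier: %r" % identifier)
    created = datetime.fromtimestamp(int(m.group("created_ms")) / 1000, tz=timezone.utc)
    return {
        "block_pool_id": m.group("pool"),
        "namenode_ip": m.group("nn_ip"),       # assumption: IP of the formatting NameNode
        "pool_created": created,               # assumption: block pool creation time
        "block_id": int(m.group("block_id")),  # X in blk_X_Y
        "generation_stamp": int(m.group("gen_stamp")),  # Y in blk_X_Y
    }


info = parse_block("BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316")
for key, value in info.items():
    print(key, "=", value)
```

The block ID pattern allows a leading minus sign because, as far as I know, older HDFS releases generated block IDs randomly and they could be negative.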

DS-730a75d3-046c-4254-990a-4eee9520424f,DISK = This is the storage ID and storage type. It tells you which disk (by its opaque identifier) on the specified DataNode IP:PORT actually holds the data, and what type of storage it is (DISK). HDFS now supports tiered storage, where this is useful (among other things): http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
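
To list where each replica lives, something like the following Python sketch could work. It only relies on the DatanodeInfoWithStorage[IP:PORT,DS-...,TYPE] shape visible in the output from the question. The function name parse_replicas is hypothetical, and other HDFS versions may print this part differently.

```python
import re

# One entry looks like: DatanodeInfoWithStorage[192.0.0.9:1000,DS-<id>,DISK]
REPLICA_RE = re.compile(
    r"DatanodeInfoWithStorage\["
    r"(?P<host>[^:\]]+):(?P<port>\d+),"
    r"(?P<storage_id>DS-[0-9a-fA-F-]+),"
    r"(?P<storage_type>[A-Z_]+)\]"
)


def parse_replicas(line):
    """Return one dict per replica location found in an fsck block line."""
    return [
        {
            "host": m.group("host"),
            "port": int(m.group("port")),
            "storage_id": m.group("storage_id"),
            "storage_type": m.group("storage_type"),
        }
        for m in REPLICA_RE.finditer(line)
    ]


# The line from the question, copied as posted.
line = (
    "BP-929597290-192.0.0.2-1439573305237:blk_1074084574_344316 len=2 repl=3 "
    "[DatanodeInfoWithStorage[192.0.0.9:1000,DS-730a75d3-046c-4254-990a-4eee9520424f,DISK], "
    "DatanodeInfoWithStorage[192.0.0.1:1000,DS-fc6ee5c7-e76b-4faa-b663-58a60240de4c,DISK], "
    "DatanodeInfoWithStorage[192.0.0.3:1000,DS-8ab81b26-309e-42d6-ae14-26eb88387cad,DISK]"
)

for replica in parse_replicas(line):
    print(replica["host"], replica["port"], replica["storage_id"], replica["storage_type"])
```

Using finditer rather than matching the whole bracketed list means the sketch still works when the closing bracket is missing, as it is in the pasted output.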