How does the number of pipelines relate to the number of blocks in a file?

Rising Star

Hi, dear community!

I'd really appreciate it if someone could explain how the number of pipelines relates to the number of blocks in a file.

I mean the pipelines that are created for writing HDFS blocks (https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf).

Is the number of pipelines equal to the number of blocks in the file? In other words, is one pipeline created per file block, or one pipeline per file (covering multiple blocks)?

Thanks!



Mentor
A pipeline is created to write the replicas of each under-construction
block. When a block is finished, its pipeline is completed and closed,
and a new pipeline is opened for the next block (if there is one).
Since this is a sequential operation, there's no real notion of a
"number of pipelines". Why are you looking for such a number?
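
To make the "one pipeline per block, one after another" point concrete, here is a rough Python sketch. It is a toy model, not HDFS code: the function name `pipelines_for_write` and the byte counts are made up for illustration, and it ignores pipeline recovery after a DataNode failure, which can rebuild a pipeline for the same block.

```python
def pipelines_for_write(file_size, block_size):
    """Yield (block_index, bytes_in_block), one entry per pipeline.

    Toy model (an assumption, not the HDFS implementation): each block
    gets its own pipeline, and the next pipeline is opened only after
    the previous block has been completed and its pipeline closed.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    remaining = file_size
    index = 0
    while remaining > 0:
        # The last block may be shorter than the configured block size.
        chunk = min(block_size, remaining)
        yield index, chunk
        remaining -= chunk
        index += 1


# Hypothetical example: a 300-unit file written with 128-unit blocks.
blocks = list(pipelines_for_write(300, 128))
```

Under these assumptions, the number of pipelines opened over the life of the write equals the number of blocks in the file, but only one is open at any given moment.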

Rising Star

Thanks for your reply!

Let me clarify my question:

Does each file block get its own unique set of DataNodes?

Or are all blocks written to a single set of DataNodes?

Thanks!

Mentor
Replica locations are allocated separately for each block, so each
block generally gets a different set of DataNodes (DNs). The blocks may
all share one common DataNode if the writer client is running on a host
that also runs a DataNode, but the other replicas will be randomly
selected (within or outside of racks, depending on topology).
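
As a rough illustration of the above, here is a small Python sketch of per-block target selection. It is deliberately simplified and is not the real HDFS placement policy: rack awareness is left out entirely, and the function `choose_pipeline` and the node names (`dn1`, `dn2`, ...) are hypothetical.

```python
import random


def choose_pipeline(datanodes, replication, local_node=None, rng=random):
    """Pick the DataNodes for one block's pipeline.

    Simplified assumption: the first replica goes to the writer's local
    DataNode when there is one, and the remaining replicas are picked at
    random from the other nodes. Rack topology is NOT modeled here.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    targets = []
    if local_node is not None and local_node in datanodes:
        targets.append(local_node)
    others = [dn for dn in datanodes if dn not in targets]
    # sample() never repeats a node, so every replica lands on a distinct DN.
    targets.extend(rng.sample(others, replication - len(targets)))
    return targets


# Hypothetical 8-node cluster; the writer runs on the same host as dn1.
cluster = ["dn%d" % i for i in range(1, 9)]
rng = random.Random(42)  # seeded only to make the example repeatable
per_block = [choose_pipeline(cluster, 3, local_node="dn1", rng=rng)
             for _ in range(4)]
```

In this sketch every entry of `per_block` starts with `dn1`, while the second and third nodes are drawn independently for each block, which is why consecutive blocks of the same file usually end up on different sets of DataNodes.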

Rising Star
Thanks! Everything is clear now!