Support Questions

fil · ‎04-08-2015

Hi dear community!

I'd really appreciate it if someone could explain how the number of pipelines relates to the number of blocks in a file.

I mean the pipelines that are created for writing an HDFS block (https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf).

Is the number of pipelines equal to the number of blocks in the file? In other words, is one pipeline created for each file block, or is there one pipeline per file (spanning multiple blocks)?

thanks!

Harsh J · ‎04-08-2015

A pipeline is created to write the replicas of a single under-construction block. When the block is finished, its pipeline is completed and closed, and a new pipeline is opened for the next block (if there is one). There is no fixed "number of pipelines", given that it's a sequential operation. Why do you seek such a number?
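To make the sequential behaviour concrete, here is a toy Python model (not HDFS code). The names `count_blocks` and `pipelines_for_file` are made up for illustration, and the 128 MiB block size is only an assumed value, since block size is configurable. Under those assumptions it shows that only one pipeline is open at a time, and that over the whole write the client opens one pipeline per block.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MiB); configurable in a real cluster


def count_blocks(file_size, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold file_size bytes (0 for an empty file)."""
    # Integer ceiling division, avoids floating point for very large files
    return (file_size + block_size - 1) // block_size


def pipelines_for_file(file_size, block_size=BLOCK_SIZE):
    """Yield (block_index, bytes_in_block), one entry per pipeline opened in turn."""
    remaining = file_size
    index = 0
    while remaining > 0:
        chunk = min(block_size, remaining)
        # Toy model: a pipeline is opened here, carries `chunk` bytes, then is closed
        yield index, chunk
        remaining -= chunk
        index += 1


if __name__ == "__main__":
    size = 300 * 1024 * 1024  # a hypothetical 300 MiB file
    for idx, nbytes in pipelines_for_file(size):
        print(f"pipeline {idx}: block of {nbytes} bytes")
    print("total pipelines opened:", count_blocks(size))
```

In this sketch the total number of pipelines opened equals the number of blocks, but they never exist at the same time, which is why asking for a "number of pipelines" at a given moment is not very meaningful.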

fil · ‎04-08-2015

thanks for your reply!

let me clarify my question:

Does each file block have its own unique set of DataNodes?

Or are all blocks written to a single set of DataNodes?

thanks!

Harsh J · ‎04-08-2015

Replica locations are allocated separately for every block, so each block gets a different set of DataNodes (DNs). The blocks may all have one DataNode in common if the writer client is running on a host that also runs a DataNode, but the other replicas will be randomly selected (within or outside of racks, depending on topology).
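As a rough illustration of the answer above, the following Python sketch picks replica targets for a few blocks. It is a simplified model and not the actual HDFS placement policy: it ignores racks entirely, and `choose_replicas`, the `dn1`...`dn10` host names, and the replication factor of 3 are all assumptions made for the example.

```python
import random


def choose_replicas(datanodes, replication=3, local_node=None, rng=random):
    """Pick `replication` distinct DataNodes for one block (toy model, ignores racks)."""
    if not 1 <= replication <= len(datanodes):
        raise ValueError("replication must be between 1 and the number of DataNodes")
    chosen = []
    if local_node is not None and local_node in datanodes:
        # Writer runs on a DataNode host: the first replica goes to that node
        chosen.append(local_node)
    # Remaining replicas are drawn at random from the other nodes
    candidates = [dn for dn in datanodes if dn not in chosen]
    chosen.extend(rng.sample(candidates, replication - len(chosen)))
    return chosen


if __name__ == "__main__":
    nodes = [f"dn{i}" for i in range(1, 11)]  # hypothetical host names
    rng = random.Random(42)  # seeded only so repeated runs look the same
    for block in range(3):
        # Each block gets its own independent selection
        print(f"block {block}:", choose_replicas(nodes, local_node="dn1", rng=rng))
```

Every block's list starts with the writer's local node (here `dn1`), while the remaining entries are drawn independently per block, so they usually differ from one block to the next.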

fil · ‎04-08-2015

Thanks! Everything is clear now!

Cloudera Community

How to match the number of pipelines with the number of blocks in a file