How does the number of pipelines match the number of blocks in a file?
Labels: HDFS
Created on 04-08-2015 04:56 AM - edited 09-16-2022 02:26 AM
Hi dear community!
I would really appreciate it if someone could tell me how the number of pipelines relates to the number of blocks in a file.
I mean the pipelines that are created for writing an HDFS block (https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf).
Is the number of pipelines equal to the number of blocks in the file? In other words, is one pipeline created per file block, or is there one pipeline per file (covering multiple blocks)?
Thanks!
Created 04-08-2015 05:49 AM
A write pipeline exists per under-construction block. When a block is finished, its pipeline is completed and closed, and a new pipeline is opened for the next block (if there is one). There's no notion of a "number of pipelines", given that it's a sequential operation - why do you seek such a number?
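To make that concrete, here is a minimal sketch (not from the original thread; the path is a hypothetical example) that estimates how many blocks, and therefore how many sequential write pipelines, a file required, using the standard Hadoop FileSystem API:

```java
// Sketch only: estimates block count for a file; the path is hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockCount {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        FileStatus status = fs.getFileStatus(new Path("/user/example/data.txt"));
        long length = status.getLen();          // file size in bytes
        long blockSize = status.getBlockSize(); // block size used for this file

        // Each block is written through its own pipeline, one after another,
        // so the block count is also the count of pipelines that were opened.
        long blocks = (length + blockSize - 1) / blockSize;
        System.out.println("Blocks (and write pipelines used): " + blocks);
    }
}
```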
Created 04-08-2015 05:55 AM
Thanks for your reply!
Let me clarify my question:
Does each file block get its own set of DataNodes?
Or are all blocks written to a single set of DataNodes?
Thanks!
Created 04-08-2015 05:59 AM
They may all have one common DataNode if the writer client is running on a host that also runs a DataNode, but the other replicas will be randomly selected (within or outside of racks, depending on topology).
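For illustration, a minimal sketch (again with a hypothetical path) that lists the replica hosts of each block via FileSystem#getFileBlockLocations, showing that placement is decided per block rather than once per file:

```java
// Sketch only: prints the DataNode hosts holding each block of a file.
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocations {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/user/example/data.txt"));

        // One BlockLocation per block; each carries its own replica hosts.
        BlockLocation[] locations = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation loc : locations) {
            System.out.println("offset=" + loc.getOffset()
                    + " length=" + loc.getLength()
                    + " hosts=" + Arrays.toString(loc.getHosts()));
        }
    }
}
```

Running this against a multi-block file typically shows a different host set per block, apart from the local DataNode that the blocks share when the writer runs on a cluster node.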