Hadoop read IO size
Labels: Apache Hadoop, HDFS
Created on 09-08-2015 06:20 PM - edited 09-16-2022 02:40 AM
Hi dear experts!
I'm curious how to control the read I/O size in my MR jobs.
For example, I have a file in HDFS; under the hood it is stored as block files in the Linux filesystem, such as /disk1/hadoop/.../.../blkXXX.
In the ideal case, each of these files should be equal in size to the HDFS block size (128-256 MB).
My question is: how can I set the I/O size for read operations?
Thank you!
Created 09-09-2015 10:36 PM
Created 09-08-2015 09:28 PM
Note that HDFS readers do not read whole blocks of data at a time; instead they stream the data via a buffered read (typically 64-128 KB). A block size of X MB does not translate into a memory requirement of X MB unless you explicitly hold the entire block in memory while streaming the read.
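To make the point about memory concrete, here is a minimal sketch of a buffered streaming read. It uses only plain `java.io`, with a `ByteArrayInputStream` standing in for an HDFS input stream, so it is an illustration of the pattern and not the HDFS client code itself. The class name `StreamingReadSketch`, the helper `drain`, and the 8 MB "block" are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration: stream through data with a fixed-size buffer.
final class StreamingReadSketch {

    /** Total bytes seen and number of read() calls that returned data. */
    static final class ReadStats {
        final long totalBytes;
        final long readCalls;

        ReadStats(long totalBytes, long readCalls) {
            this.totalBytes = totalBytes;
            this.readCalls = readCalls;
        }
    }

    /** Reads the stream to its end, reusing one buffer of bufferSize bytes. */
    static ReadStats drain(InputStream in, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        // The only allocation that depends on the I/O size; it is reused for every read.
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        long calls = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                // A real reader would process buffer[0..n) here before the next read.
                total += n;
                calls++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ReadStats(total, calls);
    }

    public static void main(String[] args) {
        // Stand-in for one block of data (kept small so the demo runs anywhere).
        byte[] fakeBlock = new byte[8 * 1024 * 1024];
        ReadStats stats = drain(new ByteArrayInputStream(fakeBlock), 64 * 1024);
        System.out.println(stats.totalBytes + " bytes in " + stats.readCalls + " reads");
    }
}
```

The working memory of `drain` is the 64 KB buffer regardless of how large the input is. A network or disk stream may return fewer bytes per call than the buffer holds, so the number of reads against a real stream can be higher than with the in-memory stand-in.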
Created 09-09-2015 10:01 AM
Just to clarify:
> stream the data via a buffered read
Is the size of this buffer defined by the io.file.buffer.size parameter?
Thanks!
Created 09-09-2015 04:44 PM
Yes (io.file.buffer.size), but note that if you're doing short-circuit reads,
then another property also applies:
dfs.client.read.shortcircuit.buffer.size (1 MB by default, specified in bytes).
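For a rough feel of what these buffer sizes mean, the sketch below counts how many buffer fills it takes to cover one 128 MB block at a few buffer sizes. It is back-of-the-envelope arithmetic only: it assumes every fill returns a full buffer, which a real HDFS client does not guarantee. The class `BufferRefillMath` and the method `refills` are hypothetical names, and the 4096-byte value is what I believe io.file.buffer.size defaults to in core-default.xml, so check the defaults for your Hadoop version.

```java
// Hypothetical illustration: buffer fills needed to stream one block.
final class BufferRefillMath {

    /** Ceiling of blockBytes / bufferBytes: fills needed if each fill is complete. */
    static long refills(long blockBytes, long bufferBytes) {
        if (blockBytes < 0 || bufferBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (blockBytes + bufferBytes - 1) / bufferBytes;
    }

    public static void main(String[] args) {
        long block = 128L * 1024 * 1024; // 128 MB, the lower block size mentioned in the question
        long[] bufferSizes = {
            4096,        // assumed io.file.buffer.size default; verify for your version
            64 * 1024,   // lower end of the "typical" range mentioned in the thread
            128 * 1024,  // upper end of that range
            1024 * 1024  // dfs.client.read.shortcircuit.buffer.size default per the thread
        };
        for (long size : bufferSizes) {
            System.out.println(size + " byte buffer -> " + refills(block, size) + " fills");
        }
    }
}
```

Under that assumption a 4 KB buffer needs 32768 fills per 128 MB block, while a 1 MB buffer needs 128.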
Created 09-09-2015 06:03 PM
Could you point me to the source class where I can read about this in more detail?
Thanks!
Created 09-09-2015 10:36 PM
