dfs.data.dir question
Labels: Apache Hadoop
Created on ‎10-18-2013 08:38 AM - edited ‎09-16-2022 01:49 AM
If I have several slave nodes with varying numbers of drives mounted in each of them, how does the dfs.data.dir property work? If this property includes a directory that exists on one datanode but not on another, does Hadoop simply skip that value/drive on the datanode where the directory is missing?
Thanks in advance!
Created ‎10-18-2013 08:49 AM
No, that isn't how it works. If you use the same configuration on every node:
/mnt, /mnt2, /mnt3, /mnt4, /mnt5
then a host that has only 3 of those drives (/mnt, /mnt2, /mnt3) will fail to start, depending on the value of dfs.datanode.failed.volumes.tolerated (default 0). You need to configure each server with the value of dfs.data.dir that matches its actual drives. For this (and other) reasons, a homogeneous cluster setup is usually preferred.
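As a sketch of what this looks like in practice, a node's hdfs-site.xml would list only the mount points that actually exist on that machine. The mount paths below are illustrative, not a recommended layout:

```xml
<!-- hdfs-site.xml on a datanode with three data drives -->
<configuration>
  <property>
    <!-- Comma-separated list of local directories for block storage.
         List only directories that exist on THIS host. -->
    <name>dfs.data.dir</name>
    <value>/mnt/dfs/data,/mnt2/dfs/data,/mnt3/dfs/data</value>
  </property>
  <property>
    <!-- Number of volumes allowed to fail before the datanode
         shuts down; default is 0 (any failed volume is fatal). -->
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>0</value>
  </property>
</configuration>
```

Raising dfs.datanode.failed.volumes.tolerated can mask a missing directory, but it is meant for tolerating genuine disk failures, not for papering over per-host configuration differences.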
