Created 06-17-2016 01:09 AM
I'm looking for general guidelines and best practices from the field on the following two properties in hdfs-site.xml, beyond the descriptions in hdfs-default.xml. What are people seeing, and what are some production values for these two configuration properties?
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
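For context, both properties only take effect under AvailableSpaceVolumeChoosingPolicy, and the shipped defaults from hdfs-default.xml look like this in hdfs-site.xml form (defaults shown for reference, not as field recommendations):

<!-- Only consulted when dfs.datanode.fsdataset.volume.choosing.policy is set to
     org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <!-- 10737418240 bytes = 10 GB: volumes whose free space is within this much of each
       other are considered balanced and receive blocks round-robin -->
  <value>10737418240</value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <!-- fraction of new block allocations sent to the volumes with more available space;
       values above 0.5 bias writes toward emptier volumes -->
  <value>0.75</value>
</property>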
Created 06-17-2016 03:58 AM
Hi Artem, we do not recommend using AvailableSpaceVolumeChoosingPolicy. It can cause a subset of disk drives to become a bottleneck for writes. See HDFS-8538 for some more discussion on this.
A new HDFS tool called the DiskBalancer is under active development (HDFS-1312). It will allow administrators to recover from skewed distribution caused by replacing failed disks or just adding new disks.
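For a rough idea of the intended workflow once it lands (command syntax as sketched in HDFS-1312; the hostname below is a placeholder): you generate a per-DataNode move plan, execute it, and query progress.

# prerequisite on DataNodes (hdfs-site.xml): dfs.disk.balancer.enabled = true
hdfs diskbalancer -plan dn1.example.com          # compute a JSON plan of block moves for that node
hdfs diskbalancer -execute <path-to-plan.json>   # run the plan on the DataNode
hdfs diskbalancer -query dn1.example.com         # check the status of an executing plan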
Created 06-17-2016 02:47 PM
Hi @Arpit Agarwal I don't know the intricacies of this, but I'm trying to understand which is the better option: running the balancer as a recovery mechanism at regular intervals, or using a better placement policy when writing the blocks in the first place. I presume the default block placement policy is round-robin (RR). If the placement is round-robin, then the smaller disks fill up faster. If instead the placement policy took into account both available space and IO throughput for each disk, wouldn't that be a better choice?
Also, as documented, these two properties are only applicable when dfs.datanode.fsdataset.volume.choosing.policy is set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy (https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml). But I couldn't find any property named dfs.datanode.fsdataset.volume.choosing.policy. Please let me know where this is set.
Please correct me if I am wrong in my understanding.
Created 06-17-2016 06:17 PM
Hi @Greenhorn Techie, yes, I agree the ideal placement policy would factor in both available space and IO load. However, there is currently no implementation that does that.
The property dfs.datanode.fsdataset.volume.choosing.policy is defined in hdfs-default.xml:
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value></value>
  <description>
    The class name of the policy for choosing volumes in the list of directories.
    Defaults to org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy.
    If you would like to take into account available disk space, set the value to
    "org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy".
  </description>
</property>
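For completeness, opting in to the space-aware policy would mean adding something like the following to hdfs-site.xml (a sketch only; leaving the value empty keeps the RoundRobin default):

<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <!-- switches volume selection from the RoundRobin default to the space-aware policy -->
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>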
Created 06-17-2016 07:46 PM
Thanks @Arpit Agarwal for your response. So it finally boils down to choosing between the RR and AvailableSpace policies: Hortonworks recommends the RR policy with DiskBalancer, while Cloudera recommends the AvailableSpace policy? Am I correct in saying that? 🙂
Created 06-17-2016 09:23 PM
Hortonworks recommends using the default RoundRobin policy.
Created 01-11-2017 07:04 PM
I have the exact same question. @Artem Ervits have you come to any conclusion since this thread died last July?
Created 01-11-2017 11:34 PM
@Anant Rathi I have some verified answers in this thread from engineering, and also another answer from @Chris Nauroth. There's a reference blog: http://gbif.blogspot.com/2015/05/dont-fill-your-hdfs-disks-upgrading-to.html. We don't have field agreement on one policy or the other.
AvailableSpaceVolumeChoosingPolicy is not something we have ever formally tested or certified. It was developed at Cloudera, and we do not certify it under our support.