Support Questions

transient1 · ‎11-01-2016

Hello,

I am planning to enable the HDFS Storage-Balancer as per this article: https://www.cloudera.com/documentation/enterprise/5-5-x/topics/admin_dn_storage_balancing.html. I plan to use the defaults, but change "dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction" to 1.0 (understanding that this could result in some bottlenecks).

My questions about this process are:

1) We are also running HBase in this cluster, with RegionServers colocated with the Datanodes. Are there any gotchas that should be considered given this?

2) We don't have CM Enterprise, so the official rolling restart is not available to us. Is it possible to restart the Datanode role on nodes individually?

Thank you.

Harsh J · ‎11-09-2016

> 1) We are also running HBase in this cluster, with RegionServers colocated with the Datanodes. Are there any gotchas that should be considered given this?

The DN typically writes blocks round-robin across its disk list, but with your configuration, once a disk is found to cross the free-space threshold, the DN will keep selecting that disk over and over until the imbalance is back within the threshold.

If the threshold is very large (many thousands of full blocks required to close the gap), then RS performance can suffer a bit when it's trying to flush, replay or compact in parallel. This goes away once disk selection falls back to round-robin because no volume is in violation of the space threshold.

If you're OK with observing some small slowness (assuming a small difference threshold, and disks not being cleaned and reinserted too often), then you should be good. If, however, your HBase usage is very latency-bound, consider using a smaller preference fraction so that the DN does not focus on pumping all work onto a single disk or a specific set of disks when the threshold is found to be crossed.
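
To make the effect of a 1.0 preference fraction concrete, here is a rough Python model of the behaviour described above. It is only a sketch under assumptions, not the DataNode's actual code: the class and function names are made up for illustration, sizes are in arbitrary units, and a single shared round-robin cursor is used for simplicity.

```python
import random


class VolumeChooser:
    """Simplified, hypothetical model of available-space volume choosing."""

    def __init__(self, threshold, preference_fraction, rng=None):
        self.threshold = threshold                    # allowed free-space gap
        self.preference_fraction = preference_fraction
        self.rng = rng or random.Random(0)
        self._next = 0                                # shared round-robin cursor

    def _round_robin(self, candidates):
        idx = candidates[self._next % len(candidates)]
        self._next += 1
        return idx

    def choose(self, free):
        """Return the index of the volume that receives the next block."""
        volumes = list(range(len(free)))
        lowest = min(free)
        # No volume violates the threshold: plain round-robin over all disks.
        if max(free) - lowest <= self.threshold:
            return self._round_robin(volumes)
        # Otherwise split the disks into "more free" and "less free" groups.
        high = [i for i in volumes if free[i] > lowest + self.threshold]
        low = [i for i in volumes if free[i] <= lowest + self.threshold]
        # With fraction 1.0 this always picks from the emptier disks.
        prefer_high = self.rng.random() < self.preference_fraction
        return self._round_robin(high if prefer_high else low)


def simulate(free, block_size, n_blocks, chooser):
    """Write n_blocks blocks; return (writes per volume, remaining free space)."""
    free = list(free)
    writes = [0] * len(free)
    for _ in range(n_blocks):
        i = chooser.choose(free)
        free[i] -= block_size
        writes[i] += 1
    return writes, free
```

In this model, `simulate([100, 100, 1000], 10, 20, VolumeChooser(50, 1.0))` sends all 20 blocks to the third (emptier) volume, which is the "pumping all work onto a single disk" situation; lowering the fraction lets some writes land on the other disks while the gap closes.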

> 2) We don't have CM Enterprise, so the official rolling restart is not available to us. Is it possible to restart the Datanode role on nodes individually?

You can do the restarts one by one from the HDFS -> Instances page or the API, but you'll need to manually ensure that a DN has come back up in a functional, connected state before moving on to another (by checking the DN's metrics or its logs). The enterprise rolling restart does that check automatically as it progresses.

CM API is documented at http://cloudera.github.io/cm_api/
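
If you script this, the one-at-a-time loop could look roughly like the sketch below. It is an illustration only: `restart_role` and `is_healthy` are hypothetical callables that you would implement yourself (for example, on top of the CM API) to restart a single DN role and to check that it is back up and connected. They are not CM API functions, and the timeout values are arbitrary.

```python
import time


def restart_one_by_one(role_names, restart_role, is_healthy,
                       timeout_s=600, poll_s=10, sleep=time.sleep):
    """Restart each role in turn, waiting for it to recover before the next.

    restart_role(name) and is_healthy(name) are caller-supplied (hypothetical)
    callables. Raises TimeoutError and stops at the first role that does not
    become healthy within timeout_s seconds.
    """
    done = []
    for name in role_names:
        restart_role(name)
        deadline = time.monotonic() + timeout_s
        # Do not touch the next DN until this one reports healthy again.
        while not is_healthy(name):
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{name} did not become healthy; restarted so far: {done}"
                )
            sleep(poll_s)
        done.append(name)
    return done
```

Stopping at the first DN that fails to recover is deliberate: it keeps at most one DataNode down at a time, which is the same guarantee you would otherwise be checking by hand.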

transient1 · ‎11-11-2016

Thank you!

Cloudera Community

Enable HDFS Storage-Balancer and Role Restart