Created on 03-28-2021 09:20 AM - edited 09-16-2022 07:41 AM
Hello,
How do we balance data stored across the individual disks of a particular DataNode? We have 5 disks on a single node and one of them is 90% full; running the balancer is not fixing the issue.
I would like to get some suggestions or comments on how to fix this.
I was going through this article, and it says it is not possible to balance disks within a single node. What other options can we use to fix this until we upgrade to CDH 6.3?
CM & CDH - 5.16.3
Appreciate all inputs.
Thanks
Wert
Created 03-30-2021 01:30 AM
Hi @wert_1311, I see you mentioned that you are currently running CM & CDH 5.16.3. If that is correct, you can follow the article below to fix the disk usage.
https://blog.cloudera.com/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/
Let me know if you have any questions or comments.
Created 03-29-2021 10:48 PM
Any assistance on this, please?
Created 03-30-2021 05:35 AM
Hello @wert_1311, you can balance the disk usage of the DataNode storage volumes using the intra-DataNode disk balancer feature, which is available in CDH 5.8.2 and later.
You need to enable the feature by setting the "dfs.disk.balancer.enabled" configuration to true in HDFS via the HDFS safety valve snippet in Cloudera Manager, following the blog here - https://blog.cloudera.com/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/
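As a minimal sketch, the safety valve entry would look something like the fragment below. The property name comes from this thread; the exact name of the safety valve field in your Cloudera Manager version is not shown here, so check it against the linked blog before applying.

```xml
<!-- Goes into the hdfs-site.xml safety valve snippet; restart of the affected HDFS roles is assumed to be needed afterwards -->
<property>
  <name>dfs.disk.balancer.enabled</name>
  <value>true</value>
</property>
```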
A typical disk-balancer task involves three steps (implemented via the "hdfs diskbalancer" command): plan, execute, and query.
Including enabling the feature and verifying the result, the full workflow is as follows:
1. Enable the intra-DataNode disk balancer configuration in HDFS
2. "Plan" the disk balancer run for the DataNode
3. Execute the created plan
4. Query the running/executed plan
5. Verify the balancer report
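Steps 2 to 5 can be sketched roughly as below. This is an illustration under assumptions, not a tested procedure for your cluster: the hostname and plan file path are hypothetical placeholders (the real plan path is printed by the plan step), the `run` helper is only a wrapper added for this example, and `df -h` is just one simple way to check per-disk usage on the DataNode. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the plan / execute / query workflow for one DataNode.
set -eu

# Hypothetical placeholders: replace with your DataNode host and with the
# plan file path that the -plan step prints.
DN_HOST="${DN_HOST:-datanode1.example.com}"
PLAN_FILE="${PLAN_FILE:-/path/to/${DN_HOST}.plan.json}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 2: generate a plan for the DataNode
run hdfs diskbalancer -plan "$DN_HOST"

# Step 3: execute the generated plan
run hdfs diskbalancer -execute "$PLAN_FILE"

# Step 4: check the status of the running or finished plan
run hdfs diskbalancer -query "$DN_HOST"

# Step 5: on the DataNode itself, compare per-disk usage before and after
run df -h
```

Run it once in dry-run mode to review the commands, then set DRY_RUN=0 and PLAN_FILE to the path reported by the plan step.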
For more information, refer to the Apache documentation here - https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html
Thanks and Regards,
Pabitra Das
Created 03-30-2021 08:43 AM