Created 10-09-2015 02:47 AM
We run Cloudera with HBase on a cluster with 13 DataNodes (+ 2 non-DataNode nodes).
Each DataNode has 3.6 TiB of space (2 equally sized disks in RAID 0), and we have 24 TiB of data used by HDFS.
However, the data is very unevenly distributed: two servers are now out of disk space, while the other 11 servers use about 50% of their disk space.
We have about 16 "red lights" in Cloudera right now that are all caused by running out of disk space on these two servers (mostly about missing disk space for logs).
When we try to run the balancer, it exits after a few minutes with this error:
No block has been moved for 5 iterations. Exiting.
Is this related to the following bug?
https://issues.apache.org/jira/browse/HDFS-6621
The fix version of that bug is 2.6.0 and we're running HDFS 2.6.0-cdh5.4.2.
And what can we do to fix it?
Created 10-09-2015 03:02 AM
Created 10-09-2015 03:41 AM
Thank you for the answer. It's all DFS usage:
$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/md2 3,6T 3,4T 4,0G 100% /
$ du -hs /dfs/dn/
3,4T /dfs/dn/
And it's not balanced at all, unfortunately. Here is the output from dfsadmin -report; hdfs-8 and hdfs-9 are the affected hosts:
Configured Capacity: 49343113617408 (44.88 TB)
Present Capacity: 47369416204288 (43.08 TB)
DFS Remaining: 21149683597312 (19.24 TB)
DFS Used: 26219732606976 (23.85 TB)
DFS Used%: 55.35%
Under replicated blocks: 16418
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (13):

Name: x.x.x.x:50010 (hdfs-8.xxx)
Hostname: hdfs-8.xxx
Rack: /dc19
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3685917872128 (3.35 TB)
Non DFS Used: 105435455488 (98.19 GB)
DFS Remaining: 4270796800 (3.98 GB)
DFS Used%: 97.11%
DFS Remaining%: 0.11%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 33
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-2.xxx)
Hostname: hdfs-2.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1767240470528 (1.61 TB)
Non DFS Used: 108024397824 (100.61 GB)
DFS Remaining: 1920359256064 (1.75 TB)
DFS Used%: 46.56%
DFS Remaining%: 50.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 208
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-6.xxx)
Hostname: hdfs-6.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1631343329280 (1.48 TB)
Non DFS Used: 107747889152 (100.35 GB)
DFS Remaining: 2056532905984 (1.87 TB)
DFS Used%: 42.98%
DFS Remaining%: 54.18%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 216
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-15.xxx)
Hostname: hdfs-15.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1888278859776 (1.72 TB)
Non DFS Used: 101076938752 (94.14 GB)
DFS Remaining: 1806268325888 (1.64 TB)
DFS Used%: 49.75%
DFS Remaining%: 47.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 190
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-12.xxx)
Hostname: hdfs-12.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1545568223232 (1.41 TB)
Non DFS Used: 100874694656 (93.95 GB)
DFS Remaining: 2149181206528 (1.95 TB)
DFS Used%: 40.72%
DFS Remaining%: 56.62%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 228
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-13.xxx)
Hostname: hdfs-13.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1879598047232 (1.71 TB)
Non DFS Used: 100941463552 (94.01 GB)
DFS Remaining: 1815084613632 (1.65 TB)
DFS Used%: 49.52%
DFS Remaining%: 47.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 206
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-9.xxx)
Hostname: hdfs-9.xxx
Rack: /dc13
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3690058039296 (3.36 TB)
Non DFS Used: 105563381760 (98.31 GB)
DFS Remaining: 2703360 (2.58 MB)
DFS Used%: 97.22%
DFS Remaining%: 0.00%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 31
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-1.xxx)
Hostname: hdfs-1.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1514972078080 (1.38 TB)
Non DFS Used: 728471805952 (678.44 GB)
DFS Remaining: 1552180240384 (1.41 TB)
DFS Used%: 39.91%
DFS Remaining%: 40.89%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-10.xxx)
Hostname: hdfs-10.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1584416079872 (1.44 TB)
Non DFS Used: 101046075392 (94.11 GB)
DFS Remaining: 2110161969152 (1.92 TB)
DFS Used%: 41.74%
DFS Remaining%: 55.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 195
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-5.xxx)
Hostname: hdfs-5.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1900345761792 (1.73 TB)
Non DFS Used: 108638838784 (101.18 GB)
DFS Remaining: 1786639523840 (1.62 TB)
DFS Used%: 50.07%
DFS Remaining%: 47.07%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-14.xxx)
Hostname: hdfs-14.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1457428815872 (1.33 TB)
Non DFS Used: 101012418560 (94.08 GB)
DFS Remaining: 2237182889984 (2.03 TB)
DFS Used%: 38.40%
DFS Remaining%: 58.94%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 200
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-11.xxx)
Hostname: hdfs-11.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1721947557888 (1.57 TB)
Non DFS Used: 100971548672 (94.04 GB)
DFS Remaining: 1972705017856 (1.79 TB)
DFS Used%: 45.37%
DFS Remaining%: 51.97%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 202
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-7.xxx)
Hostname: hdfs-7.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1952617472000 (1.78 TB)
Non DFS Used: 103892504576 (96.76 GB)
DFS Remaining: 1739114147840 (1.58 TB)
DFS Used%: 51.44%
DFS Remaining%: 45.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 193
Last contact: Fri Oct 09 12:28:54 CEST 2015
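A report this long is easier to compare once it is reduced to one line per node. The sketch below is only an illustration: it assumes the native one-field-per-line layout that `hdfs dfsadmin -report` prints, and the names `SAMPLE`, `parse_datanodes` and `mean_used_by_rack` are hypothetical helpers, not part of any Hadoop tooling. It pulls out each node's hostname, rack and DFS Used%, then averages usage per rack. In the report above, hdfs-8 and hdfs-9 are the only nodes whose Rack is not /default (/dc19 and /dc13), so grouping by rack makes the two outliers easy to spot; whether the rack assignment is actually the cause is not something this script can tell you.

```python
from collections import defaultdict

# Shortened sample in the one-field-per-line layout; the values are copied
# from the hdfs-8 and hdfs-2 entries of the report above.
SAMPLE = """\
DFS Used%: 55.35%

Name: x.x.x.x:50010 (hdfs-8.xxx)
Hostname: hdfs-8.xxx
Rack: /dc19
Decommission Status : Normal
DFS Used%: 97.11%
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-2.xxx)
Hostname: hdfs-2.xxx
Rack: /default
Decommission Status : Normal
DFS Used%: 46.56%
Last contact: Fri Oct 09 12:28:55 CEST 2015
"""


def parse_datanodes(report):
    """Return one dict per DataNode with its hostname, rack and DFS Used%."""
    nodes = []
    current = None  # stays None while we are still in the cluster-wide summary
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or separator, nothing to parse
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            current = {"hostname": value}
            nodes.append(current)
        elif current is not None and key == "Rack":
            current["rack"] = value
        elif current is not None and key == "DFS Used%":
            current["used_pct"] = float(value.rstrip("%"))
    return nodes


def mean_used_by_rack(nodes):
    """Average DFS Used% per rack, skipping nodes without a usage figure."""
    per_rack = defaultdict(list)
    for node in nodes:
        if "used_pct" in node:
            per_rack[node.get("rack", "unknown")].append(node["used_pct"])
    return {rack: sum(vals) / len(vals) for rack, vals in per_rack.items()}


if __name__ == "__main__":
    nodes = parse_datanodes(SAMPLE)
    for node in sorted(nodes, key=lambda n: n.get("used_pct", 0.0), reverse=True):
        print(node["hostname"], node.get("rack"), node.get("used_pct"))
    print(mean_used_by_rack(nodes))
```

To use it on a live cluster, save the output of `hdfs dfsadmin -report` to a file and pass the file's contents to `parse_datanodes` instead of `SAMPLE`.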
Created 10-09-2015 08:02 AM
David Wilder, Community Manager
Created 10-09-2015 08:12 AM
Thank you! I actually came across that solution just an hour ago while reading the slides linked below, so I had already implemented what you suggested, but I was waiting to see whether the cluster would balance out evenly before posting:
http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera
Regardless, thank you very much for your help in resolving this issue.