Balancer: No block has been moved for 5 iterations. Exiting.
Labels: HDFS
Created ‎10-09-2015 02:47 AM
We run Cloudera with HBase on a cluster with 13 DataNodes (+ 2 non-DataNode nodes).
Each DataNode has 3.6 TiB of space (2 equally sized disks in raid 0), and we have 24 TiB of data used by HDFS.
However, the data is very unevenly distributed, with two servers that are now out of disk space, and the 11 other servers using about 50% of their disk space.
We currently have about 16 "red lights" in Cloudera, all caused by these two servers running out of disk space (mostly warnings about missing disk space for logs).
When we try to run the balancer, it exits after a few minutes with this error:
No block has been moved for 5 iterations. Exiting.
Is this related to the following bug?
https://issues.apache.org/jira/browse/HDFS-6621
The fix version of that bug is 2.6.0 and we're running HDFS 2.6.0-cdh5.4.2.
And what can we do to fix it?
Created ‎10-09-2015 08:02 AM
hdfs-8.xxx belongs to rack /dc19
hdfs-9.xxx belongs to rack /dc13
The rest of the hosts are in rack /default.
This is an incorrect use of racks. To avoid breaking rack placement, the
balancer cannot move blocks off these two hosts.
Consider moving these two hosts to the /default rack.
David Wilder, Community Manager
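To illustrate why the rack layout blocks the balancer, here is a minimal sketch of the kind of check involved. It is a simplified model, not the actual Balancer code: the function name `move_reduces_racks` and the replica layouts are hypothetical, and it assumes the balancer refuses any move that would leave a block's replicas spread over fewer racks than before.

```python
def move_reduces_racks(replica_racks, source_index, target_rack):
    """Return True if moving one replica would shrink the set of racks
    holding the block (simplified model, hypothetical helper)."""
    before = set(replica_racks)
    moved = list(replica_racks)
    moved[source_index] = target_rack
    after = set(moved)
    return len(after) < len(before)


# A block with one replica on hdfs-8 (/dc19) and two replicas in /default.
block = ["/dc19", "/default", "/default"]

# Moving the /dc19 replica into /default drops the block from 2 racks to 1,
# so in this model the move is rejected.
print(move_reduces_racks(block, 0, "/default"))

# Once every host is in /default, the same move no longer changes the rack
# count, so it is allowed.
print(move_reduces_racks(["/default"] * 3, 0, "/default"))
```

In this model the only rack-preserving target for a replica on hdfs-8 is the other single-host rack, and that host is full as well, which would be consistent with the balancer giving up after five iterations.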
Created ‎10-09-2015 03:02 AM
space fillup is for reasons other than DFS utilised space, then that tool
will not resolve it. You can look at the 'dfsadmin -report' to see the DFS
Used % values per DN, and if they are all within ±10% of each other, then
the Balancer is not at fault here.
Have you tried checking what is consuming space on the affected hosts? Is
it the DN dirs holding the space, or is it something from YARN/etc.?
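For anyone who wants to script the DFS Used % comparison, here is a minimal sketch in Python. It assumes the report text is in the one-field-per-line layout that `hdfs dfsadmin -report` prints, and it takes the cluster-wide DFS Used% from the report summary as an approximation of the cluster utilization the balancer compares against. The function names and the trimmed sample are hypothetical; the 10-point threshold mirrors the figure mentioned above.

```python
# Trimmed sample in the one-field-per-line layout of `hdfs dfsadmin -report`.
SAMPLE_REPORT = """\
DFS Used%: 55.35%

Live datanodes (13):

Name: x.x.x.x:50010 (hdfs-8.xxx)
Hostname: hdfs-8.xxx
Rack: /dc19
DFS Used%: 97.11%

Name: x.x.x.x:50010 (hdfs-2.xxx)
Hostname: hdfs-2.xxx
Rack: /default
DFS Used%: 46.56%

Name: x.x.x.x:50010 (hdfs-9.xxx)
Hostname: hdfs-9.xxx
Rack: /dc13
DFS Used%: 97.22%
"""


def parse_datanodes(report):
    """Collect hostname, rack and DFS Used% for each DataNode section."""
    nodes = []
    current = None
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            # A Hostname line starts a new DataNode record.
            current = {"hostname": value, "rack": None, "used_pct": None}
            nodes.append(current)
        elif current is not None and key == "Rack":
            current["rack"] = value
        elif current is not None and key == "DFS Used%":
            current["used_pct"] = float(value.rstrip("%"))
    return nodes


def outside_threshold(nodes, cluster_used_pct, threshold=10.0):
    """Hostnames whose DFS Used% differs from the cluster figure by more
    than `threshold` percentage points."""
    return [n["hostname"] for n in nodes
            if abs(n["used_pct"] - cluster_used_pct) > threshold]


nodes = parse_datanodes(SAMPLE_REPORT)
print(outside_threshold(nodes, cluster_used_pct=55.35))
```

The cluster-wide summary line is skipped on purpose, because it appears before the first Hostname line; pass that value in explicitly as `cluster_used_pct`.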
Created ‎10-09-2015 03:41 AM
Thank you for the answer. It's all DFS usage:
$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/md2 3,6T 3,4T 4,0G 100% /
$ du -hs /dfs/dn/
3,4T /dfs/dn/
And it's not balanced at all, unfortunately (output from dfsadmin -report, see hdfs-8 and hdfs-9 which are the affected hosts):
Configured Capacity: 49343113617408 (44.88 TB)
Present Capacity: 47369416204288 (43.08 TB)
DFS Remaining: 21149683597312 (19.24 TB)
DFS Used: 26219732606976 (23.85 TB)
DFS Used%: 55.35%
Under replicated blocks: 16418
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (13):

Name: x.x.x.x:50010 (hdfs-8.xxx)
Hostname: hdfs-8.xxx
Rack: /dc19
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3685917872128 (3.35 TB)
Non DFS Used: 105435455488 (98.19 GB)
DFS Remaining: 4270796800 (3.98 GB)
DFS Used%: 97.11%
DFS Remaining%: 0.11%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 33
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-2.xxx)
Hostname: hdfs-2.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1767240470528 (1.61 TB)
Non DFS Used: 108024397824 (100.61 GB)
DFS Remaining: 1920359256064 (1.75 TB)
DFS Used%: 46.56%
DFS Remaining%: 50.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 208
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-6.xxx)
Hostname: hdfs-6.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1631343329280 (1.48 TB)
Non DFS Used: 107747889152 (100.35 GB)
DFS Remaining: 2056532905984 (1.87 TB)
DFS Used%: 42.98%
DFS Remaining%: 54.18%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 216
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-15.xxx)
Hostname: hdfs-15.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1888278859776 (1.72 TB)
Non DFS Used: 101076938752 (94.14 GB)
DFS Remaining: 1806268325888 (1.64 TB)
DFS Used%: 49.75%
DFS Remaining%: 47.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 190
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-12.xxx)
Hostname: hdfs-12.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1545568223232 (1.41 TB)
Non DFS Used: 100874694656 (93.95 GB)
DFS Remaining: 2149181206528 (1.95 TB)
DFS Used%: 40.72%
DFS Remaining%: 56.62%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 228
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-13.xxx)
Hostname: hdfs-13.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1879598047232 (1.71 TB)
Non DFS Used: 100941463552 (94.01 GB)
DFS Remaining: 1815084613632 (1.65 TB)
DFS Used%: 49.52%
DFS Remaining%: 47.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 206
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-9.xxx)
Hostname: hdfs-9.xxx
Rack: /dc13
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3690058039296 (3.36 TB)
Non DFS Used: 105563381760 (98.31 GB)
DFS Remaining: 2703360 (2.58 MB)
DFS Used%: 97.22%
DFS Remaining%: 0.00%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 31
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-1.xxx)
Hostname: hdfs-1.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1514972078080 (1.38 TB)
Non DFS Used: 728471805952 (678.44 GB)
DFS Remaining: 1552180240384 (1.41 TB)
DFS Used%: 39.91%
DFS Remaining%: 40.89%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-10.xxx)
Hostname: hdfs-10.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1584416079872 (1.44 TB)
Non DFS Used: 101046075392 (94.11 GB)
DFS Remaining: 2110161969152 (1.92 TB)
DFS Used%: 41.74%
DFS Remaining%: 55.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 195
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-5.xxx)
Hostname: hdfs-5.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1900345761792 (1.73 TB)
Non DFS Used: 108638838784 (101.18 GB)
DFS Remaining: 1786639523840 (1.62 TB)
DFS Used%: 50.07%
DFS Remaining%: 47.07%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015

Name: x.x.x.x:50010 (hdfs-14.xxx)
Hostname: hdfs-14.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1457428815872 (1.33 TB)
Non DFS Used: 101012418560 (94.08 GB)
DFS Remaining: 2237182889984 (2.03 TB)
DFS Used%: 38.40%
DFS Remaining%: 58.94%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 200
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-11.xxx)
Hostname: hdfs-11.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1721947557888 (1.57 TB)
Non DFS Used: 100971548672 (94.04 GB)
DFS Remaining: 1972705017856 (1.79 TB)
DFS Used%: 45.37%
DFS Remaining%: 51.97%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 202
Last contact: Fri Oct 09 12:28:54 CEST 2015

Name: x.x.x.x:50010 (hdfs-7.xxx)
Hostname: hdfs-7.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1952617472000 (1.78 TB)
Non DFS Used: 103892504576 (96.76 GB)
DFS Remaining: 1739114147840 (1.58 TB)
DFS Used%: 51.44%
DFS Remaining%: 45.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 193
Last contact: Fri Oct 09 12:28:54 CEST 2015
Created ‎10-09-2015 08:12 AM
Thank you! About an hour ago I came across that solution while reading the slides linked below, so I had already implemented what you suggested; I was waiting to see whether the cluster would balance out evenly before posting:
http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera
Regardless, thank you very much for your help in resolving this issue.
