
Balancer: No block has been moved for 5 iterations. Exiting.

Explorer

We run Cloudera with HBase on a cluster with 13 DataNodes (+ 2 non-DataNode nodes). 

 

Each DataNode has 3.6 TiB of space (2 equally sized disks in RAID 0), and HDFS is using 24 TiB in total.

 

However, the data is very unevenly distributed, with two servers that are now out of disk space, and the 11 other servers using about 50% of their disk space.

 

We have about 16 "red lights" in Cloudera right now, all caused by these two servers running out of disk space (mostly a lack of disk space for logs).

 

When we try to run the balancer, it exits after a few minutes with this error:

No block has been moved for 5 iterations. Exiting.

 

 

Is this related to the following bug? 

 

https://issues.apache.org/jira/browse/HDFS-6621

 

The fix version of that bug is 2.6.0 and we're running HDFS 2.6.0-cdh5.4.2.

 

And what can we do to fix it?

 

1 ACCEPTED SOLUTION

Community Manager
You are running with racks defined on the hosts:

hdfs-8.xxx belongs to rack /dc19
hdfs-9.xxx belongs to rack /dc13

The rest of the hosts are in rack /default.

This is an incorrect use of racks: each of these two hosts is the only
member of its rack. To avoid breaking rack placement, the Balancer cannot
move blocks off these two hosts.

Consider moving these two hosts to the /default rack.
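
To illustrate why the rack layout blocks the Balancer, here is a minimal Python sketch of the rule that a move must not reduce the number of racks holding a block's replicas. It is a simplified model, not the HDFS implementation: the function names, the sample block, and its replica locations are made up, and only the rack assignments come from this thread.

```python
# Simplified model (an assumption, not the HDFS source code) of the rule that
# a Balancer move must not reduce the number of racks holding a block.

def racks_after_move(replica_nodes, source, target, rack_of):
    """Racks holding the block if the replica on `source` moves to `target`."""
    moved = [target if node == source else node for node in replica_nodes]
    return {rack_of[node] for node in moved}


def move_keeps_rack_spread(replica_nodes, source, target, rack_of):
    """True if the move leaves the block on at least as many racks as before."""
    before = {rack_of[node] for node in replica_nodes}
    after = racks_after_move(replica_nodes, source, target, rack_of)
    return len(after) >= len(before)


# Rack assignments as reported in this thread; three hosts stand in for the
# eleven hosts in /default.
rack_of = {
    "hdfs-8": "/dc19",
    "hdfs-9": "/dc13",
    "hdfs-2": "/default",
    "hdfs-6": "/default",
    "hdfs-12": "/default",
}

# Hypothetical block: one replica on hdfs-8, two replicas on /default hosts.
replicas = ["hdfs-8", "hdfs-2", "hdfs-6"]

# Moving the hdfs-8 replica into /default would leave the block on one rack.
print(move_keeps_rack_spread(replicas, "hdfs-8", "hdfs-12", rack_of))  # False

# Moving it to hdfs-9 keeps two racks, but hdfs-9 is itself full.
print(move_keeps_rack_spread(replicas, "hdfs-8", "hdfs-9", rack_of))   # True
```

With every host assigned to /default, any move stays inside a single rack, so a check of this kind no longer rules out the emptier hosts as targets.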



David Wilder, Community Manager



4 REPLIES

Mentor
The HDFS Balancer only balances on DFS Used % per DataNode. If your disks
are filling up for reasons other than DFS-utilised space, then that tool
will not resolve it. You can look at 'dfsadmin -report' to see the DFS
Used % values per DN, and if they are all within ±10% of each other, then
the Balancer is not at fault here.

Have you tried checking what is consuming space on the affected hosts? Is
it the DN dirs holding the space, or is it something from YARN/etc.?
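
The comparison described above can be sketched roughly as follows. As far as I understand it, the Balancer compares each DataNode's DFS Used % with the cluster-wide DFS Used % and treats a node as balanced when the difference is within the threshold (10 by default, adjustable with the -threshold option of 'hdfs balancer'). This is a simplified Python model, not the Balancer's code; the function name, node names, and sample figures are invented for illustration.

```python
def classify_datanodes(nodes, threshold=10.0):
    """Classify DataNodes against the cluster-wide DFS Used %.

    `nodes` maps a node name to (dfs_used, capacity), both in the same unit.
    Returns the cluster-wide utilization and a per-node label.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    labels = {}
    for name, (used, capacity) in nodes.items():
        node_pct = 100.0 * used / capacity
        if node_pct > cluster_pct + threshold:
            labels[name] = "over-utilized"
        elif node_pct < cluster_pct - threshold:
            labels[name] = "under-utilized"
        else:
            labels[name] = "within threshold"
    return cluster_pct, labels


# Hypothetical figures in TB: (DFS Used, Configured Capacity).
sample = {
    "node-a": (3.35, 3.45),
    "node-b": (1.61, 3.45),
    "node-c": (1.48, 3.45),
    "node-d": (2.00, 3.45),
}

cluster_pct, labels = classify_datanodes(sample)
print(f"cluster DFS Used%: {cluster_pct:.2f}")
for name, label in labels.items():
    print(name, label)
```

Any node labelled over-utilized or under-utilized here is one the Balancer would be expected to act on; if every node is within the threshold, the full disks have some other cause.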

Explorer

Thank you for the answer. It's all DFS usage:

 

 

$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        3,6T  3,4T  4,0G 100% /

$ du -hs /dfs/dn/
3,4T    /dfs/dn/

 

 

And it's not balanced at all, unfortunately (output from dfsadmin -report, see hdfs-8 and hdfs-9 which are the affected hosts):

 

 

Configured Capacity: 49343113617408 (44.88 TB)
Present Capacity: 47369416204288 (43.08 TB)
DFS Remaining: 21149683597312 (19.24 TB)
DFS Used: 26219732606976 (23.85 TB)
DFS Used%: 55.35%
Under replicated blocks: 16418
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (13):

Name: x.x.x.x:50010 (hdfs-8.xxx)
Hostname: hdfs-8.xxx
Rack: /dc19
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3685917872128 (3.35 TB)
Non DFS Used: 105435455488 (98.19 GB)
DFS Remaining: 4270796800 (3.98 GB)
DFS Used%: 97.11%
DFS Remaining%: 0.11%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 33
Last contact: Fri Oct 09 12:28:54 CEST 2015


Name: x.x.x.x:50010 (hdfs-2.xxx)
Hostname: hdfs-2.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1767240470528 (1.61 TB)
Non DFS Used: 108024397824 (100.61 GB)
DFS Remaining: 1920359256064 (1.75 TB)
DFS Used%: 46.56%
DFS Remaining%: 50.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 208
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-6.xxx)
Hostname: hdfs-6.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1631343329280 (1.48 TB)
Non DFS Used: 107747889152 (100.35 GB)
DFS Remaining: 2056532905984 (1.87 TB)
DFS Used%: 42.98%
DFS Remaining%: 54.18%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 216
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-15.xxx)
Hostname: hdfs-15.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1888278859776 (1.72 TB)
Non DFS Used: 101076938752 (94.14 GB)
DFS Remaining: 1806268325888 (1.64 TB)
DFS Used%: 49.75%
DFS Remaining%: 47.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 190
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-12.xxx)
Hostname: hdfs-12.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1545568223232 (1.41 TB)
Non DFS Used: 100874694656 (93.95 GB)
DFS Remaining: 2149181206528 (1.95 TB)
DFS Used%: 40.72%
DFS Remaining%: 56.62%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 228
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-13.xxx)
Hostname: hdfs-13.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1879598047232 (1.71 TB)
Non DFS Used: 100941463552 (94.01 GB)
DFS Remaining: 1815084613632 (1.65 TB)
DFS Used%: 49.52%
DFS Remaining%: 47.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 206
Last contact: Fri Oct 09 12:28:54 CEST 2015


Name: x.x.x.x:50010 (hdfs-9.xxx)
Hostname: hdfs-9.xxx
Rack: /dc13
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 3690058039296 (3.36 TB)
Non DFS Used: 105563381760 (98.31 GB)
DFS Remaining: 2703360 (2.58 MB)
DFS Used%: 97.22%
DFS Remaining%: 0.00%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 31
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-1.xxx)
Hostname: hdfs-1.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1514972078080 (1.38 TB)
Non DFS Used: 728471805952 (678.44 GB)
DFS Remaining: 1552180240384 (1.41 TB)
DFS Used%: 39.91%
DFS Remaining%: 40.89%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-10.xxx)
Hostname: hdfs-10.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1584416079872 (1.44 TB)
Non DFS Used: 101046075392 (94.11 GB)
DFS Remaining: 2110161969152 (1.92 TB)
DFS Used%: 41.74%
DFS Remaining%: 55.59%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 195
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-5.xxx)
Hostname: hdfs-5.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1900345761792 (1.73 TB)
Non DFS Used: 108638838784 (101.18 GB)
DFS Remaining: 1786639523840 (1.62 TB)
DFS Used%: 50.07%
DFS Remaining%: 47.07%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 205
Last contact: Fri Oct 09 12:28:55 CEST 2015


Name: x.x.x.x:50010 (hdfs-14.xxx)
Hostname: hdfs-14.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1457428815872 (1.33 TB)
Non DFS Used: 101012418560 (94.08 GB)
DFS Remaining: 2237182889984 (2.03 TB)
DFS Used%: 38.40%
DFS Remaining%: 58.94%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 200
Last contact: Fri Oct 09 12:28:54 CEST 2015


Name: x.x.x.x:50010 (hdfs-11.xxx)
Hostname: hdfs-11.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1721947557888 (1.57 TB)
Non DFS Used: 100971548672 (94.04 GB)
DFS Remaining: 1972705017856 (1.79 TB)
DFS Used%: 45.37%
DFS Remaining%: 51.97%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 202
Last contact: Fri Oct 09 12:28:54 CEST 2015


Name: x.x.x.x:50010 (hdfs-7.xxx)
Hostname: hdfs-7.xxx
Rack: /default
Decommission Status : Normal
Configured Capacity: 3795624124416 (3.45 TB)
DFS Used: 1952617472000 (1.78 TB)
Non DFS Used: 103892504576 (96.76 GB)
DFS Remaining: 1739114147840 (1.58 TB)
DFS Used%: 51.44%
DFS Remaining%: 45.82%
Configured Cache Capacity: 3406823424 (3.17 GB)
Cache Used: 0 (0 B)
Cache Remaining: 3406823424 (3.17 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 193
Last contact: Fri Oct 09 12:28:54 CEST 2015

 

 

 


Explorer

Thank you! About an hour ago I came across that same solution in the slides linked below and have already applied it; I was just waiting to see whether the data would balance out evenly before posting:

 

http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera

 

Regardless, thank you very much for your help in resolving this issue.