About wsanders

wsanders · ‎04-30-2015

I also bumped up the bandwidth with dfsadmin (from 10 to 40Mb/sec), and the next run was quite effective.
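For reference, `hdfs dfsadmin -setBalancerBandwidth` takes its argument in bytes per second, so a figure such as 40 has to be converted first. A minimal sketch, assuming the figure means MiB per second (the post does not say which unit was used, and the helper name `balancer_bandwidth_cmd` is made up for illustration):

```python
def balancer_bandwidth_cmd(mib_per_sec):
    """Build the argv for `hdfs dfsadmin -setBalancerBandwidth`.

    The CLI expects bytes per second. Treating the input as MiB/s is an
    assumption made for this sketch, not something the post specifies.
    """
    bytes_per_sec = int(mib_per_sec * 1024 * 1024)
    return ["hdfs", "dfsadmin", "-setBalancerBandwidth", str(bytes_per_sec)]


if __name__ == "__main__":
    # Print the command line rather than running it, so the sketch is
    # safe to try on a machine without a Hadoop client installed.
    print(" ".join(balancer_bandwidth_cmd(40)))
```

As far as I know, a value set this way lasts only until the DataNodes restart; the persistent setting is `dfs.datanode.balance.bandwidthPerSec` in the HDFS configuration.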

wsanders · ‎04-29-2015

It could actually be running, just not aggressively enough. I've discovered that the median size of our HDFS files is something like 6KB, so the balancer is fairly inefficient: given a fairly fast network, its execution time is more or less the same for each block regardless of size. I'm going to have to schedule the balancer to run from cron once a day or so.

So my question is slightly modified: how can I tell when the balancer runs when the Balancer service is configured? There do not seem to be any parameters related to scheduling it.

One related question: how does the balancer choose which blocks to move? Does it favor small files over large ones? I ask because I used the output of the balancer ("moved block blah with size=..."), which includes the size of each block, as a sample of my file sizes. We actually have a slightly less than 1 to 1 blocks-to-file ratio, and of the 32000 files I sampled from the balancer run, only 2000 or so were "full" blocks of 64MB.
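The sampling described above can be sketched roughly as follows. This assumes each relevant balancer output line contains a `size=<bytes>` field, as the quoted fragment suggests; the exact line format, the function name `block_size_stats`, and the sample lines are hypothetical.

```python
import re
import statistics

FULL_BLOCK = 64 * 1024 * 1024  # the 64MB block size mentioned in the post

# Assumed pattern: a "size=<number of bytes>" field somewhere in the line.
SIZE_RE = re.compile(r"size=(\d+)")


def block_size_stats(lines, full_block=FULL_BLOCK):
    """Summarise block sizes found in balancer output lines."""
    sizes = []
    for line in lines:
        match = SIZE_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    if not sizes:
        return None
    return {
        "count": len(sizes),
        "median": statistics.median(sizes),
        "full_blocks": sum(1 for s in sizes if s >= full_block),
    }


if __name__ == "__main__":
    # Hypothetical sample lines, not real balancer output.
    sample = [
        "moved block blk_1 with size=6144",
        "moved block blk_2 with size=67108864",
        "moved block blk_3 with size=2048",
    ]
    print(block_size_stats(sample))
```

Note that such a sample is only representative of file sizes if the balancer picks blocks without regard to size, which is exactly the open question here.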

wsanders · ‎04-29-2015

The Balancer isn't running on its own. I have run it manually from the command line. Some of my HDFS nodes got to 97% full, while others were only 30-something% full. It works fine from the command line. How can I check to see why it is not running on its own? As I understand it, it is supposed to do so. Thanks, w
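To put the 97% versus 30-something% figures in context: as I understand the balancer's documented behaviour, a DataNode counts as balanced when its utilization is within a threshold (10 percentage points by default) of the cluster-wide utilization. A minimal sketch of that check, with a hypothetical function name and made-up node figures:

```python
def classify_nodes(nodes, threshold=10.0):
    """Classify DataNodes relative to cluster-wide utilization.

    nodes maps a node name to (used_bytes, capacity_bytes).
    threshold is in percentage points; 10 is the balancer's default.
    Returns (cluster_utilization_percent, {name: label}).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_util = 100.0 * total_used / total_capacity

    labels = {}
    for name, (used, cap) in nodes.items():
        util = 100.0 * used / cap
        if util > cluster_util + threshold:
            labels[name] = "over-utilized"
        elif util < cluster_util - threshold:
            labels[name] = "under-utilized"
        else:
            labels[name] = "balanced"
    return cluster_util, labels


if __name__ == "__main__":
    # Made-up capacities chosen to mirror the percentages in the post.
    cluster = {"node-a": (97, 100), "node-b": (30, 100), "node-c": (60, 100)}
    print(classify_nodes(cluster))
```

With a spread like the one described, several nodes fall outside the default threshold, so a balancer run would have work to do.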

wsanders · ‎02-06-2015

These errors (in the HDFS logs, not EXT4 errors in syslog) went away after about an hour.

wsanders · ‎02-05-2015

After upgrading from 5.2.1 to 5.3.1 we are seeing EXT4 errors on our HDFS/YARN Manager nodes:

DatanodeRegistration(10.15.129.105, datanodeUuid=d2f178f8-1399-4330-92b9-c627b0a04804, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=cluster15;nsid=601399481;c=0):Got exception while serving BP-447169353-10.15.129.109-1415094554153:blk_1110216188_36477305 to /10.15.129.103:56258
org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-447169353-10.15.129.109-1415094554153:blk_1110216188_36477305

The upgrade did not throw any exceptions and "hdfs fsck /" is clean. How can I flush these out of the system? Maybe these are files that did not have time to replicate properly before the upgrade?
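To see which blocks are affected, the block pool and block ID can be pulled out of each exception line. A minimal sketch, assuming the message always has the `Replica not found for <pool>:<block>_<generation stamp>` shape shown above (the function name `parse_replica_not_found` is made up for illustration):

```python
import re

# Matches the identifier layout seen in the log line quoted in the post:
# BP-<number>-<ip>-<timestamp>:blk_<id>_<generation stamp>
REPLICA_RE = re.compile(
    r"Replica not found for (BP-\d+-[\d.]+-\d+):(blk_\d+)_(\d+)"
)


def parse_replica_not_found(line):
    """Return the block pool, block name and generation stamp, or None."""
    match = REPLICA_RE.search(line)
    if match is None:
        return None
    return {
        "block_pool": match.group(1),
        "block": match.group(2),
        "generation_stamp": int(match.group(3)),
    }


if __name__ == "__main__":
    line = (
        "org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: "
        "Replica not found for "
        "BP-447169353-10.15.129.109-1415094554153:blk_1110216188_36477305"
    )
    print(parse_replica_not_found(line))
```

The extracted block name can then be searched for in the output of `hdfs fsck / -files -blocks -locations` to find out which file, if any, still references it.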

Last Visited	‎06-01-2015 08:37 PM

Member Since	‎01-14-2015 11:13 AM
Posts	18

Cloudera Community

Re: Replica Not Found errors after CDH 5.3.1-1 upg...

Re: HDFS Balancer isn't running by itself

Re: HDFS Balancer isn't running by itself

HDFS Balancer isn't running by itself

Re: Replica Not Found errors after CDH 5.3.1-1 upg...

Replica Not Found errors after CDH 5.3.1-1 upgrade