
Newly added DataNodes won't join the party

New Contributor

Hello,

We're running a cluster with 12 DataNode servers, each with 12 physical disks mounted as follows:

/dev/sda5 on /grid/0 type ext4 (rw,noatime)
/dev/sdb1 on /grid/1 type ext4 (rw,noatime)
/dev/sdc1 on /grid/2 type ext4 (rw,noatime)
/dev/sdd1 on /grid/3 type ext4 (rw,noatime)
/dev/sde1 on /grid/4 type ext4 (rw,noatime)
/dev/sdf1 on /grid/5 type ext4 (rw,noatime)
/dev/sdg1 on /grid/6 type ext4 (rw,noatime)
/dev/sdh1 on /grid/7 type ext4 (rw,noatime)
/dev/sdi1 on /grid/8 type ext4 (rw,noatime)
/dev/sdj1 on /grid/9 type ext4 (rw,noatime)
/dev/sdk1 on /grid/10 type ext4 (rw,noatime)
/dev/sdl1 on /grid/11 type ext4 (rw,noatime)

We've tried adding 5 newly deployed DataNodes, which are more powerful in every respect (capacity, CPU, and RAM), with the following disk layout:

/dev/sda1 on /grid/0 type ext4 (rw,noatime)
/dev/sdb1 on /grid/1 type ext4 (rw,noatime)
/dev/sdc1 on /grid/2 type ext4 (rw,noatime)
/dev/sdd1 on /grid/3 type ext4 (rw,noatime)
/dev/sde1 on /grid/4 type ext4 (rw,noatime)
/dev/sdf1 on /grid/5 type ext4 (rw,noatime)
/dev/sdg1 on /grid/6 type ext4 (rw,noatime)
/dev/sdh1 on /grid/7 type ext4 (rw,noatime)
/dev/sdi1 on /grid/8 type ext4 (rw,noatime)
/dev/sdj1 on /grid/9 type ext4 (rw,noatime)
/dev/sdk1 on /grid/10 type ext4 (rw,noatime)
/dev/sdl1 on /grid/11 type ext4 (rw,noatime)
/dev/sdm1 on /grid/12 type ext4 (rw,noatime)
/dev/sdn1 on /grid/13 type ext4 (rw,noatime)
/dev/sdo1 on /grid/14 type ext4 (rw,noatime)
/dev/sdp1 on /grid/15 type ext4 (rw,noatime)

The new DataNodes have 4 extra disks, so we added /grid/12/hadoop/hdfs/data, /grid/13/hadoop/hdfs/data, /grid/14/hadoop/hdfs/data and /grid/15/hadoop/hdfs/data to the DataNode directories in the HDFS configuration.
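
For reference, the property we extended is dfs.datanode.data.dir. A quick way to double-check what a node actually picked up after the change (a small sketch; assumes the hdfs client is on the PATH):

# Print the data directories this node's configuration resolves to
hdfs getconf -confKey dfs.datanode.data.dir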

Everywhere we searched it was written that directories which do not exist will be ignored (the original DataNodes lack the /grid/12, 13, 14 and 15 mount points).

What actually happened is that on the original DataNodes the folders /grid/12, /grid/13, /grid/14 and /grid/15 were created and are filling up with HDFS data. Since those paths sit on the root filesystem (/) and not on a dedicated block device, space is about to run out, which is probably not a good thing.

How should we proceed? How can we remove the data that landed there to free up space on the root (/) partition?
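
In case it helps anyone hitting the same thing, this is how we confirmed where the data landed (plain coreutils, run on one of the original DataNodes):

# df reports the filesystem backing a path: for /grid/12 it shows /,
# since nothing is mounted there, unlike /grid/0 through /grid/11
df -h /grid/12
# how much root-partition space the stray block data is consuming
du -sh /grid/12 /grid/13 /grid/14 /grid/15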

Thanks,

1 ACCEPTED SOLUTION


@auto gun Look into Host Groups to manage hosts with different configurations.

https://developer.ibm.com/hadoop/blog/2015/11/10/override-component-configurations-with-ambari-confi...

You can create one host group for the hosts with 12 disks and a second host group for the hosts with 16 disks, each with its own list of DataNode directories. Once the groups are correctly applied, HDFS will re-replicate the data that landed under 12/, 13/, 14/ and 15/ on the initial nodes over to the new nodes. At that point you can free your space.
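
A minimal sketch of that cleanup on each original DataNode, once the config groups are applied and the DataNodes restarted (standard hdfs CLI; verify the fsck summary on your version before deleting anything):

# wait until HDFS has re-replicated the blocks that lived under /grid/12..15
hdfs fsck / | grep -E 'Status|Under-replicated|Missing'
# only when fsck reports HEALTHY and no missing blocks, reclaim the space
rm -rf /grid/12 /grid/13 /grid/14 /grid/15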


3 REPLIES


New Contributor

I followed your explanation and it works on our setup.

Thanks @Shishir Saxena!

New Contributor

@Shishir Saxena Thanks for your input.

Right now I don't understand how to remove /grid/1{2..5} from the first 12 DataNodes.

Can I just 'rm -rf' these folders?