Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Impact of growing a Datanode Volume

Rising Star

I think I am asking a slightly different question from the one here:

https://community.hortonworks.com/questions/6796/how-to-increase-datanode-filesystem-size.html

but a solution should help with both.

SAN issues aside!

Is there a way to expand the volume under a DataNode directory and have HDFS recognize the newly allocated space? For instance, if we mounted a virtual file system (say, NetApp) in CentOS and then expanded that filesystem, how would we make the change known to HDFS?

1 ACCEPTED SOLUTION

Super Guru

@wsalazar

I agree with Neeraj.

Yes, you can expand the volume under a DataNode directory and make the extra space available to HDFS.

There are two things to take care of after extending an existing volume:

1. OS side: Make sure the OS sees the volume at its new, extended size. On Linux you can use partprobe or kpartx to re-read the partition table (kpartx for multipath volumes) and resize2fs to grow an ext filesystem on an LVM volume. Once the OS reports the new size, HDFS picks it up for that DataNode automatically; no restart is required.

2. HDFS side: To distribute data evenly across all DataNodes, run the Rebalancer from the cluster UI or the command line.
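
The two steps above could be checked with something like the following Python sketch. It is not part of the original answer. The data directory path and the function names are hypothetical, so substitute the directory configured in dfs.datanode.data.dir on your node. It assumes the hdfs command is on the PATH and that the script runs as a user allowed to call `hdfs dfsadmin -report` and `hdfs balancer`.

```python
import shutil
import subprocess


def os_capacity_gib(path):
    """Return the total size, in GiB, of the filesystem holding `path`, as the OS reports it."""
    return shutil.disk_usage(path).total / 2**30


def balancer_command(threshold_pct=10):
    """Build the HDFS balancer command line.

    The threshold is the allowed deviation, in percent, of each DataNode's
    utilization from the cluster-wide average.
    """
    return ["hdfs", "balancer", "-threshold", str(threshold_pct)]


def verify_and_rebalance(data_dir, threshold_pct=10):
    """Print the OS-side capacity, show the HDFS-side report, then rebalance."""
    # Step 1: confirm the OS sees the extended volume.
    print(f"{data_dir}: {os_capacity_gib(data_dir):.1f} GiB visible to the OS")
    # Compare against the capacity HDFS reports for each DataNode.
    subprocess.run(["hdfs", "dfsadmin", "-report"], check=True)
    # Step 2: spread existing blocks across the cluster.
    subprocess.run(balancer_command(threshold_pct), check=True)
```

On a DataNode you would call, for example, verify_and_rebalance("/hadoop/hdfs/data"), where that path is only a placeholder. If the first figure has grown but the report still shows the old capacity, give the DataNode a little time to refresh its disk usage before concluding that something is wrong.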


2 REPLIES

Master Mentor

@wsalazar

You can increase the size and, to be on the safe side, run the rebalancer afterwards. See https://wiki.apache.org/hadoop/FAQ
