Currently we have /hadoop/hdfs/data and /hadoop/hdfs/data1 as datanode directories.
I have a new mount point (/hadoop/hdfs/datanew) on a faster disk, and I want to keep only this mount point as the datanode directory.
Steps:
1. Stop the cluster.
2. Go to the Ambari HDFS configuration and edit the datanode directory setting: remove /hadoop/hdfs/data and /hadoop/hdfs/data1, add /hadoop/hdfs/datanew, and save.
3. Log in to each datanode VM and copy the contents of /hadoop/hdfs/data and /hadoop/hdfs/data1 into /hadoop/hdfs/datanew.
4. Change the ownership of /hadoop/hdfs/datanew and everything under it to "hdfs".
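The copy and ownership steps above could look roughly like the sketch below on each datanode host. The helper name merge_dn_dir is made up for illustration, and the group "hadoop" in the chown example is an assumption; check what your existing directories use before running anything.

```shell
# Hypothetical helper: merge the contents of one old datanode directory
# into the new one. cp -a preserves permissions and timestamps
# (and ownership, when run as root).
merge_dn_dir() {
  src="$1"; dst="$2"
  mkdir -p "$dst" && cp -a "$src"/. "$dst"/
}

# Example calls on a datanode host (as root, with the DataNode stopped):
#   merge_dn_dir /hadoop/hdfs/data  /hadoop/hdfs/datanew
#   merge_dn_dir /hadoop/hdfs/data1 /hadoop/hdfs/datanew
#   chown -R hdfs:hadoop /hadoop/hdfs/datanew   # group name is an assumption
```

Make sure the new mount has room for both old directories before copying.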
@Ancil McBarnett is there any way to do this without downtime? Could you add a disk drive into a hot-swappable bay, add it to the DataNode's list of directories, force a rebalance, and remove one of the old drives?
@Vladimir Zlatkin That should work as well. You can add a new drive, mount it, and add the new mount point to the list of HDFS directories. If you have a lot of drives or mount points to change, I'd probably decommission the DataNode and recommission it once the changes are finished. Keep in mind that the latter can cause some additional network traffic.
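For the decommission route, a minimal sketch of the manual equivalent is below. On an Ambari-managed cluster the Decommission action in the UI is probably the safer way to do this, since Ambari manages the exclude file itself. The helper name exclude_host and the host dn1.example.com are made up for illustration.

```shell
# Hypothetical helper: add a host to the exclude file, skipping duplicates.
exclude_host() {
  file="$1"; host="$2"
  touch "$file"
  grep -qxF -- "$host" "$file" || echo "$host" >> "$file"
}

# On the NameNode host (not run here). The exclude file is whatever
# dfs.hosts.exclude points to:
#   exclude_host "$(hdfs getconf -excludeFile)" dn1.example.com
#   su - hdfs -c "hdfs dfsadmin -refreshNodes"
```

To recommission, remove the host from the exclude file and run -refreshNodes again.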
The only challenge I encountered was the port in the host:port argument of the reconfig command. It is the port from the dfs.datanode.ipc.address parameter in hdfs-site.xml. My full command looked like this:
su - hdfs -c "hdfs dfsadmin -reconfig datanode sandbox.hortonworks.com:8010 start"
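If you don't want to look the port up by hand, something like the sketch below should work. The helper reconfig_target is made up for illustration. It assumes the configured address has the usual host:port form, such as 0.0.0.0:8010.

```shell
# Hypothetical helper: build the host:port argument for "dfsadmin -reconfig"
# from a hostname and the value of dfs.datanode.ipc.address.
reconfig_target() {
  host="$1"; ipc_addr="$2"              # ipc_addr looks like 0.0.0.0:8010
  printf '%s:%s\n' "$host" "${ipc_addr##*:}"
}

# On a cluster node (not run here):
#   addr=$(hdfs getconf -confKey dfs.datanode.ipc.address)
#   target=$(reconfig_target sandbox.hortonworks.com "$addr")
#   su - hdfs -c "hdfs dfsadmin -reconfig datanode $target start"
#   su - hdfs -c "hdfs dfsadmin -reconfig datanode $target status"
```

The status subcommand lets you check whether the reconfiguration task started by the first command has finished.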