Moving data to its own mountpoint


OK, I've been reading up on this. I created a small Ambari cluster and did not have a separate disk for the HDFS data at install time. It appears that the data is stored in /hadoop/hdfs/data on the system disk. I now have a second disk that can be dedicated to HDFS. It is mounted (ext4) on /HDFS.

The first question is: is it better to just add the new filesystem and keep the existing system directory?

Second: can I just put the node in maintenance mode, rsync the /hadoop/hdfs/data directory to the new disk, and then (after deleting the original) mount the new disk on /hadoop/hdfs/data?

Third: if I added the new filesystem, how would I go about removing /hadoop/hdfs/data from the storage path?


Theoretically, yes, that should work. I'd stop HDFS as you are thinking, then copy the contents of /hadoop/hdfs/data into /HDFS (you might just leave the original there as a fallback!) and update the DataNode directories property (dfs.datanode.data.dir) to point to /HDFS instead of the default /hadoop/hdfs/data location. In Ambari, you can find it where the red arrows point in the attached screenshot.
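Once the contents have been copied (with HDFS stopped), it may be worth checking that nothing was missed before changing the property. Here is a minimal Python sketch of such a check. The function names are hypothetical, and it compares only relative paths and file sizes, not checksums, ownership, or permissions, so treat it as a quick sanity check rather than proof of a good copy.

```python
import os

def tree_manifest(root):
    """Map each file's path (relative to root) to its size in bytes."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat does not follow symlinks, so a dangling link cannot raise here
            manifest[os.path.relpath(full, root)] = os.lstat(full).st_size
    return manifest

def compare_trees(src, dst):
    """Return (missing, mismatched): files absent from dst, and files whose size differs."""
    src_files = tree_manifest(src)
    dst_files = tree_manifest(dst)
    missing = sorted(p for p in src_files if p not in dst_files)
    mismatched = sorted(
        p for p in src_files
        if p in dst_files and src_files[p] != dst_files[p]
    )
    return missing, mismatched
```

For the layout in this thread, you would call compare_trees("/hadoop/hdfs/data", "/HDFS") and expect two empty lists. The files under /HDFS also need to be owned by whichever user runs the DataNode, which this sketch does not check.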


After the Ambari change is made and pushed to the DataNodes, you can start HDFS back up and see whether it worked. Again, this should work in theory, but if this were your production system, I'd do a dry run on another cluster (a single-node pseudo-cluster would do) to gain some confidence that everything will work.
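To confirm that the pushed configuration actually reached a DataNode before restarting, you could read the property straight out of that node's hdfs-site.xml. The sketch below is only an illustration. The helper name is hypothetical, and the file location is an assumption (on many installs it is /etc/hadoop/conf/hdfs-site.xml, but yours may differ).

```python
import xml.etree.ElementTree as ET

def datanode_data_dirs(hdfs_site_path):
    """Return the entries of dfs.datanode.data.dir exactly as written, in order."""
    root = ET.parse(hdfs_site_path).getroot()
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        if name == "dfs.datanode.data.dir":
            value = prop.findtext("value", default="")
            # The value is a comma-separated list of directories
            return [d.strip() for d in value.split(",") if d.strip()]
    return []  # property not set in this file
```

If the change went through, the returned list should contain /HDFS and no longer contain /hadoop/hdfs/data.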

Good luck and happy Hadooping!
