Support Questions

Moving data to its own mountpoint

New Contributor

OK, I've been reading up on this. I created a small Ambari cluster and did not have an individual disk for the HDFS data to go on at install time. It appears that the data is stored in /hadoop/hdfs/data on the system disk. I now have a second disk that can be dedicated to HDFS. It is mounted (ext4) on /HDFS.

The first question is: is it better to just add the new filesystem and keep the existing system directory?

Second: can I just put the node in maintenance mode, rsync the /hadoop/hdfs/data directory to the new disk, delete the original, and then mount the new disk on /hadoop/hdfs/data?

Third: if I added the new filesystem, how would I go about removing /hadoop/hdfs/data from the storage path?

1 REPLY

Re: Moving data to its own mountpoint

Theoretically... yes, that should work. I'd stop HDFS as you're thinking, then get the contents of /hadoop/hdfs/data into /HDFS (you might just leave the original there as a fallback!), and then update the dfs.datanode.data.dir property to point to /HDFS instead of the default /hadoop/hdfs/data location. In Ambari, you can find it where the red arrows point in the attached screenshot.

[Screenshot 56739-dndir.jpg: Ambari HDFS config page, red arrows marking the DataNode directories setting]

After the Ambari change is made and pushed to the DataNodes, you can start HDFS back up and see if it worked. Again, it should theoretically work, but if this were your production system, I'd do a dry run on another cluster (a single-node pseudo-cluster would do) to gain some confidence that all would go well.
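On the third question: dfs.datanode.data.dir takes a comma-separated list of directories, so "removing" /hadoop/hdfs/data just means editing it out of the list. Ambari manages this file for you; as a hand-edited hdfs-site.xml fragment (the /HDFS value is from this thread) it would look roughly like:

```xml
<!-- hdfs-site.xml: after migration, list only the new mount point.
     To run both disks side by side instead, use a comma-separated list,
     e.g. /HDFS,/hadoop/hdfs/data -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/HDFS</value>
</property>
```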

Good luck and happy Hadooping!
