Detected data dir(s) that became unmounted and are now writing to the root partition

Expert Contributor

So, I was installing a new cluster for our development QA testing and failed to adjust the default configuration that Ambari gave me for dfs.datanode.data.dir. It ended up including every partition it found, when I really only wanted one - /grid/1, a dedicated disk partition for HDFS block storage. I discovered this blunder only after the installation completed. Rather than redo the whole install, I opted to fix it manually as a good deep-dive learning experience. I got it all worked out, and it was a good learning exercise, but I have one lingering issue that I cannot solve. I am getting 4 Ambari errors (one for each DataNode) that state:

Detected data dir(s) that became unmounted and are now writing to the root partition: /grid/1/hadoop/hdfs/data .

I figured out that the Ambari agents monitor (and remember) which mount point each HDFS data directory was previously on, check whether that mounted disk has gone away for some reason, and display an error if so. That's all fine - I get that. However, my setup seems to be correct, yet Ambari is still complaining.
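The check behind this alert can be sketched as a small shell function. This is an illustrative reconstruction, not the agent's actual code (the real agent is written in Python); it just compares each cached mount point against where the data dir currently lives, using the hist-file format shown below:

```shell
#!/bin/sh
# Sketch of the comparison the Ambari agent makes for this alert.
# Usage: check_mounts /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist
check_mounts() {
  # Skip comment lines; each remaining line is "data_dir,mount_point"
  grep -v '^#' "$1" | while IFS=, read -r data_dir last_mount; do
    [ -z "$data_dir" ] && continue
    # Ask the kernel which filesystem actually backs the data dir now
    # (df --output is GNU coreutils)
    current=$(df --output=target "$data_dir" 2>/dev/null | tail -1)
    if [ "$current" != "$last_mount" ]; then
      echo "WARN: $data_dir was on $last_mount, now writing to $current"
    fi
  done
}
```

If a disk fails to mount at boot, the data dir path still exists on the root filesystem, so the current mount resolves to `/` instead of the cached mount point - which is exactly the condition the alert text describes.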

As part of correcting the configuration issue I laid upon myself, I edited the following file (on each DataNode) to remove all the previously cached mount points I did not want, leaving only the one I did. I ended up stopping HDFS, removing all the unwanted data directories (/opt/hadoop/hdfs, /tmp/hadoop/hdfs, etc.), removing the NameNode metadata directories, reformatting the NameNode, and starting HDFS back up. The file system is up and working.
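Roughly, the cleanup looked like this (a hedged reconstruction - the exact directory list depends on what Ambari had configured, and HDFS stop/start was done through Ambari):

```shell
# On each DataNode, after stopping HDFS via Ambari:
# remove the unwanted data dirs that were created on the wrong partitions
rm -rf /opt/hadoop/hdfs /tmp/hadoop/hdfs

# On the NameNode: wipe the metadata dirs and reformat
# (destroys all HDFS metadata - only safe on a fresh/empty cluster like this one)
sudo -u hdfs hdfs namenode -format

# Then start HDFS again from Ambari.
```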

But, can anyone tell me why I cannot get rid of this Ambari error?

Here are the contents of one of the dfs_data_dir_mount.hist files; all 4 are exactly the same. Below it is the mount showing the disk I have for HDFS data storage. It all looks good, so I must be missing something obvious. I did restart everything - nothing clears this error.

Thanks in advance...

[root@vmwqsrqadn01 ~]# cat /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist
# This file keeps track of the last known mount-point for each DFS data dir.
# It is safe to delete, since it will get regenerated the next time that the DataNode starts.
# However, it is not advised to delete this file since Ambari may
# re-create a DFS data dir that used to be mounted on a drive but is now mounted on the root.
# Comments begin with a hash (#) symbol
# data_dir,mount_point
/grid/1/hadoop/hdfs/data,/grid/1

[root@vmwqsrqadn01 ~]# mount -l | grep grid
/dev/sdb1 on /grid/1 type ext4 (rw,noatime,nodiratime,seclabel,data=ordered)
1 ACCEPTED SOLUTION

Expert Contributor

Well, I would still love to understand how this worked, but a reboot of the 4 DataNodes made this error go away. Never would have thunk it! That's pretty strange... Anyway, maybe this will help some other poor soul who hits this same condition. 🙂


5 REPLIES

Expert Contributor

Well, I would still love to understand how this worked, but a reboot of the 4 DataNodes made this error go away. Never would have thunk it! That's pretty strange... Anyway, maybe this will help some other poor soul who hits this same condition. 🙂

Master Guru

Not sure, but I would guess restarting the ambari-agents did it.

Expert Contributor

Actually, I tried restarting them before the reboot - I restarted everything and still had the errors. Then I did the reboot and they cleared. Oh well, it is what it is!

Contributor

I ran into this same issue, but unlike the original poster, restarting the Ambari agents on the data nodes was sufficient to clear the alarm.
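For anyone else landing here, the restart (and, if it comes to it, regenerating the cached mount history) can be done like this - the hist-file path is the one from the thread above, and the file's own header says it is safe to delete since it regenerates on the next DataNode start:

```shell
# On each affected DataNode, as root, restart the Ambari agent
# (ambari-agent is the standard control script shipped with the agent package)
ambari-agent restart

# If the alert still persists, remove the cached mount history and
# restart the DataNode so the file is regenerated from the current mounts:
# rm /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist
```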

Contributor

That said, I actually restarted Ambari as well - so I can't say for certain that the agent restart alone was sufficient; it may well have been the agents plus Ambari that did the trick.