NNStorageRetentionManager not purging fsimages from all "dfs.namenode.name.dir" directories

Hello,

I am seeing an issue where fsimage files are not being cleaned up in one of the "dfs.namenode.name.dir" directories. In our cluster, "dfs.namenode.name.dir" is set to "/tmp/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/mnt/data/hadoop/hdfs/namenode". As a result, the /tmp partition on the NameNode host fills up.

Listing the contents of these directories shows that the /tmp directory contains far more fsimage files than the other two:

[me@node ~]$ ls -la /tmp/hadoop/hdfs/namenode/current | grep fsimage | wc -l
94
[me@node ~]$ ls -la /var/hadoop/hdfs/namenode/current | grep fsimage | wc -l
9
[me@node ~]$ ls -la /mnt/data/hadoop/hdfs/namenode/current | grep fsimage | wc -l
9
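
The same comparison can be made in one pass over every configured directory. The following is only a sketch: the function name `count_fsimages` is made up for illustration, and it assumes each storage directory keeps its images under `current/` with names starting with `fsimage_`, as in the listings above. Unlike `grep fsimage`, it skips the `.md5` checksum files, so its numbers can be lower than the counts shown above.

```shell
# count_fsimages (hypothetical helper): print "<count> <dir>" for every
# directory in a comma-separated list such as the dfs.namenode.name.dir value.
count_fsimages() (
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r dir; do
    [ -n "$dir" ] || continue
    # Name directories may be written as file:// URIs; strip the scheme.
    dir=${dir#file://}
    # Count image files only, ignoring the .md5 checksum files next to them.
    count=$(find "$dir/current" -maxdepth 1 -type f -name 'fsimage_*' \
              ! -name '*.md5' 2>/dev/null | wc -l | tr -d ' ')
    printf '%s %s\n' "$count" "$dir"
  done
)

# Example call with the value from this thread:
# count_fsimages "/tmp/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/mnt/data/hadoop/hdfs/namenode"
```

On a node with the HDFS client installed, the list could instead be taken from `hdfs getconf -confKey dfs.namenode.name.dir`.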

Looking at the namenode logs confirms that the purging seems to only happen for /var and /mnt:

[me@node ~]$ grep NNStorageRetentionManager /var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log* | grep fsimage
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.7:2016-06-27 19:50:25,462 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002281385227, cpktTxId=0000000002281385227)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.7:2016-06-27 19:50:25,640 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002281385227, cpktTxId=0000000002281385227)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.8:2016-06-27 18:38:58,921 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002280372072, cpktTxId=0000000002280372072)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.8:2016-06-27 18:38:59,102 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002280372072, cpktTxId=0000000002280372072)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.9:2016-06-27 17:34:31,800 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002279353884, cpktTxId=0000000002279353884)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.9:2016-06-27 17:34:31,992 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002279353884, cpktTxId=0000000002279353884)
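
To see at a glance which storage directories the purge messages mention, the log lines can be grouped by directory. This is a rough sketch that assumes the "Purging old image FSImageFile(file=...)" message format shown in the output above; `purges_per_dir` is a made-up helper name, and paths containing spaces are not handled.

```shell
# purges_per_dir (hypothetical helper): read NameNode log lines on stdin and
# print "<number of purge messages> <storage directory>", sorted by directory.
purges_per_dir() {
  grep 'Purging old image FSImageFile(file=' |
    # Keep only the part of the path before /current/fsimage_<txid>.
    sed -n 's|.*FSImageFile(file=\(.*\)/current/fsimage_[0-9]*,.*|\1|p' |
    sort | uniq -c | awk '{print $1, $2}'
}

# Example call against the log files from this thread:
# cat /var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log* | purges_per_dir
```

A configured directory that does not appear in the output had no purge messages in the logs that were scanned.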

Can anyone explain why only two directories are purged?

I should mention that we are running namenode HA.

Best Regards

/Thomas

1 ACCEPTED SOLUTION

2 REPLIES

Hi Artem.

I agree that /tmp is just plain wrong for this. I think Ambari chose these directories for us during cluster installation and we haven't noticed. We will remove /tmp from this configuration.
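
As a small illustration of the planned change, the new property value can be derived from the old one by dropping every entry that lives under /tmp. This is a sketch only: `drop_tmp_dirs` is a made-up helper, it does nothing more than compute the string, and how the new value is rolled out through Ambari is not covered here.

```shell
# drop_tmp_dirs (hypothetical helper): print the comma-separated directory
# list with every entry under /tmp removed (plain paths or file:// URIs).
drop_tmp_dirs() (
  printf '%s\n' "$1" | tr ',' '\n' |
    grep -v -E '^(file://)?/tmp(/|$)' |
    paste -sd, -
)

# Example call with the value from this thread:
# drop_tmp_dirs "/tmp/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/mnt/data/hadoop/hdfs/namenode"
```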