NNStorageRetentionManager not purging fsimages from all "dfs.namenode.name.dir" directories

Rising Star

Hello,

I am seeing an issue where fsimage files are not being cleaned up in one of the "dfs.namenode.name.dir" directories. The setting of "dfs.namenode.name.dir" in our cluster is "/tmp/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/mnt/data/hadoop/hdfs/namenode". As a result, the /tmp partition on the NameNode host fills up.

Listing the contents of these folders shows that the /tmp folder contains a lot more fsimage files than the other two folders:

[me@node ~]$ ls -la /tmp/hadoop/hdfs/namenode/current | grep fsimage | wc -l
94
[me@node ~]$ ls -la /var/hadoop/hdfs/namenode/current | grep fsimage | wc -l
9
[me@node ~]$ ls -la /mnt/data/hadoop/hdfs/namenode/current | grep fsimage | wc -l
9
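
For anyone repeating this check, the three commands above can be folded into one loop. This is only a sketch: the function name count_fsimages is made up for illustration, it takes the comma-separated value of "dfs.namenode.name.dir" as its argument, and it assumes plain local paths (no file:// prefixes). Similar to the grep above, it counts every file whose name starts with fsimage, including the .md5 companions.

```shell
# count_fsimages is a hypothetical helper name, not part of Hadoop.
# Argument: the comma-separated value of dfs.namenode.name.dir.
# Prints one line per storage directory: <directory><TAB><file count>.
count_fsimages() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r dir; do
    # A missing or unreadable directory yields a count of 0.
    n=$(find "$dir/current" -maxdepth 1 -type f -name 'fsimage*' 2>/dev/null | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$dir" "$n"
  done
}

# Example call with the value from this cluster:
# count_fsimages "/tmp/hadoop/hdfs/namenode,/var/hadoop/hdfs/namenode,/mnt/data/hadoop/hdfs/namenode"
```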

The namenode logs suggest that purging only happens for /var and /mnt:

[me@node ~]$ grep NNStorageRetentionManager /var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log* | grep fsimage
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.7:2016-06-27 19:50:25,462 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002281385227, cpktTxId=0000000002281385227)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.7:2016-06-27 19:50:25,640 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002281385227, cpktTxId=0000000002281385227)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.8:2016-06-27 18:38:58,921 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002280372072, cpktTxId=0000000002280372072)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.8:2016-06-27 18:38:59,102 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002280372072, cpktTxId=0000000002280372072)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.9:2016-06-27 17:34:31,800 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/var/hadoop/hdfs/namenode/current/fsimage_0000000002279353884, cpktTxId=0000000002279353884)
/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log.9:2016-06-27 17:34:31,992 INFO  namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:purgeImage(225)) - Purging old image FSImageFile(file=/mnt/data/hadoop/hdfs/namenode/current/fsimage_0000000002279353884, cpktTxId=0000000002279353884)
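
To see at a glance which storage directories the retention manager touches, the purge messages can be tallied per directory. A minimal sketch, assuming the log line format shown above; the function name purges_per_dir is hypothetical, and it reads log lines on stdin.

```shell
# purges_per_dir is a hypothetical helper name, not part of Hadoop.
# Reads NameNode log lines on stdin and prints, for each storage directory
# that appears in a "Purging old image" message: <directory><TAB><count>.
# Assumes storage paths contain no whitespace.
purges_per_dir() {
  sed -n 's|.*Purging old image FSImageFile(file=\(.*\)/current/fsimage_[0-9]*,.*|\1|p' \
    | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}'
}

# Example call:
# cat /var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.log* | purges_per_dir
```

A directory that never shows up in the output had no purge messages in the logs that were fed in.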

Can anyone explain why only two directories are purged?

I should mention that we are running namenode HA.

Best Regards

/Thomas

1 ACCEPTED SOLUTION

Master Mentor

Storing the fsimage in /tmp makes no sense; I would remove that directory from your hdfs-site.xml. You need multiple directories for redundancy, whereas anything in /tmp can disappear as soon as the machine reboots. The /tmp directory does not behave the same way as other directories, so the fact that files there are not purged the same way as in the others is beside the point.
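
A quick way to spot this kind of misconfiguration is to check the configured value for entries under /tmp. This is a sketch under assumptions: the function name tmp_name_dirs is hypothetical, it takes the comma-separated value as its argument, and it only strips a leading file:// prefix before comparing paths. The value itself can be read with "hdfs getconf -confKey dfs.namenode.name.dir" on a host with the cluster configuration.

```shell
# tmp_name_dirs is a hypothetical helper name, not part of Hadoop.
# Argument: the comma-separated value of dfs.namenode.name.dir.
# Prints every entry located at or under /tmp, one per line.
# Exit status is non-zero when no such entry exists.
tmp_name_dirs() {
  printf '%s\n' "$1" | tr ',' '\n' | sed 's|^file://||' | grep -E '^/tmp(/|$)'
}

# Example call:
# tmp_name_dirs "$(hdfs getconf -confKey dfs.namenode.name.dir)"
```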

Rising Star

Hi Artem.

I agree that /tmp is just plain wrong for this. I think Ambari chose these directories for us during cluster installation and we did not notice. We will remove /tmp from this configuration.