Could not determine the age of the last HDFS checkpoint while shutting down services using Ambari

Contributor

Could not determine the age of the last HDFS checkpoint. Please ensure that you have a recent checkpoint. Otherwise, the NameNode(s) can take a very long time to start up.

1 ACCEPTED SOLUTION

Re: Could not determine the age of the last HDFS checkpoint while shutting down services using Ambari

I saw this happening on a relatively idle cluster. You can create a checkpoint manually; I think the instructions are also given in the dialog showing the warning, but here they are. Log in to the active NameNode and run:

su - hdfs                        # become the hdfs superuser
hdfs dfsadmin -safemode enter    # make the namespace read-only so it can be saved consistently
hdfs dfsadmin -saveNamespace     # force a checkpoint: merge the edit log into a new fsimage
hdfs dfsadmin -safemode leave    # return to normal operation
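
As an optional sanity check after leaving safe mode, you can pull the most recent fsimage and confirm it is fresh (fetchImage is a standard dfsadmin subcommand; /tmp is just an example target directory):

hdfs dfsadmin -fetchImage /tmp   # downloads the latest fsimage from the NameNode
ls -l /tmp/fsimage_*             # its timestamp/transaction ID should reflect the new checkpoint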
3 REPLIES


Re: Could not determine the age of the last HDFS checkpoint while shutting down services using Ambari

HDFS metadata consists of two parts:

  • The base filesystem table, stored in a file called fsimage.
  • The edit log, which lists changes made to the base table, stored in files called edits.

Checkpointing is the process of reconciling fsimage with edits to produce a new version of fsimage. Two benefits arise out of this:

  • A more recent version of fsimage.
  • A truncated edit log.

You can see both kinds of files in the NameNode's metadata directory, as sketched below.
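
For illustration, this is roughly what the current directory under the NameNode's metadata directory looks like (the /hadoop/hdfs/namenode path is an assumption; use whatever dfs.namenode.name.dir points to on your cluster, and the transaction IDs are made up for the example):

ls -lt /hadoop/hdfs/namenode/current | head
# fsimage_0000000000000421231                     <- written by the last checkpoint
# edits_0000000000000421232-0000000000000421298   <- finalized edits since that checkpoint
# edits_inprogress_0000000000000421299            <- the segment currently being written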

The following properties control how often checkpointing happens (see the getconf example after this list for checking the values in effect):

  • dfs.namenode.checkpoint.period - Controls how often this reconciliation is triggered: the number of seconds between two periodic checkpoints, after which fsimage is updated and the edit log truncated. Checkpointing is not cheap, so there is a balance between running it too often and letting the edit log grow too large; set this parameter to strike a good balance for typical filesystem use in your cluster.
  • dfs.namenode.checkpoint.edits.dir - Determines where on the local filesystem the DFS secondary NameNode should store the temporary edits to merge. If this is a comma-delimited list of directories, the edits are replicated in all of the directories for redundancy. The default value is the same as dfs.namenode.checkpoint.dir.
  • dfs.namenode.checkpoint.txns - The Secondary NameNode or CheckpointNode will create a checkpoint of the namespace every dfs.namenode.checkpoint.txns transactions, regardless of whether dfs.namenode.checkpoint.period has expired.
  • dfs.ha.standby.checkpoints - If true, a NameNode in Standby state periodically takes a checkpoint of the namespace, saves it to its local storage, and then uploads it to the remote NameNode.
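
If you want to confirm which values are actually in effect on your cluster, the stock hdfs getconf command can print them from any node:

hdfs getconf -confKey dfs.namenode.checkpoint.period     # seconds between periodic checkpoints
hdfs getconf -confKey dfs.namenode.checkpoint.txns       # transaction-count trigger
hdfs getconf -confKey dfs.namenode.checkpoint.edits.dir  # where the secondary stores edits to merge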

Also, if you would like to checkpoint manually, you can follow:

https://community.hortonworks.com/content/supportkb/49438/how-to-manually-checkpoint.html

Re: Could not determine the age of the last HDFS checkpoint while shutting down services using Ambari

Expert Contributor
@Sachin Ambardekar

From the HDFS perspective, in some rare circumstances it was noticed that the secondary (or standby) NameNode fails to consume the edit log. This results in more complicated situations if the active NameNode is restarted in the meantime (the unconsumed edit logs will have to be ignored). The simpler way to handle such a scenario gracefully is to always make sure that fsimage is updated before stopping the NameNode.

So, as a precautionary measure, work was done in Ambari to check and warn the user when they try to stop a NameNode whose last checkpoint is older than 12 hours. [1]
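
For reference, you can do the same age check by hand against the NameNode's JMX endpoint. This is a sketch assuming the HDP 2.x default web UI port 50070 and the LastCheckpointTime metric (milliseconds since the epoch) exposed by the FSNamesystem bean; replace <active-nn-host> with your NameNode host:

curl -s 'http://<active-nn-host>:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep -o '"LastCheckpointTime"[^,]*'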

HDFS 3.0.0 has implemented this check natively, so going forward Ambari might skip this warning. [2]

The following JIRAs and their descriptions were used as references for this answer:

[1] https://issues.apache.org/jira/browse/AMBARI-12951

[2] https://issues.apache.org/jira/browse/HDFS-6353