Checkpointing in a HDFS HA environment

New Contributor

I've recently set up an HA cluster in my test environment.

I was trying to take a checkpoint on my HDFS filesystem using the "hdfs namenode -checkpoint" command.

It failed, reporting that the directory "/tmp/hadoop-hdfs/dfs/name" is "in an inconsistent state". I checked and the directory did not exist. After I created it, I received the following error:

java.lang.IllegalStateException: Unexpected state: OPEN_FOR_READING
	at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:245)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1235)
	at org.apache.hadoop.hdfs.server.namenode.BackupNode$BNHAContext.startActiveServices(BackupNode.java:471)
	at org.apache.hadoop.hdfs.server.namenode.BackupState.enterState(BackupState.java:51)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:847)
	at org.apache.hadoop.hdfs.server.namenode.BackupNode.<init>(BackupNode.java:89)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1523)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1610)

I did not find any documentation as to what the solution to the problem might be.

I suspect that manual checkpointing is disabled in an HA environment.

Any help with this topic, and with HDFS metadata backups in general (specifically in an HA cluster), would be appreciated.

Thanks!

 

5 REPLIES

Re: Checkpointing in a HDFS HA environment

Explorer

Are your cluster and HDFS working fine? Are you able to ingest data and run jobs?

Check the fsimage and edit log locations in the NameNode configuration.

Re: Checkpointing in a HDFS HA environment

New Contributor
Yes, the cluster is working fine.
The fsimage location is defined as /dfs/nn.
The edit log location variable was empty, but its description says it should point to the same path as the "NameNode Data Directories" variable.
Just to make sure, I set the edit log directory path explicitly instead of relying on the default, and the error I listed earlier still persists.

Re: Checkpointing in a HDFS HA environment

Champion

@DataK1ng

I would recommend taking a backup copy of your NameNode before any manual activity on it.

You may try this, using the active/standby NameNodes in place of the primary/secondary NameNodes mentioned in the article:

https://community.hortonworks.com/content/supportkb/49438/how-to-manually-checkpoint.html
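
For reference, the safemode-based manual checkpoint is commonly done with the `hdfs dfsadmin` commands below. This is only a minimal sketch and not a copy of the linked article: the `run` helper and the `DRY_RUN` switch are hypothetical additions so the script prints the commands by default instead of touching the cluster. How `dfsadmin` addresses the two NameNodes in an HA pair can vary by version, so verify it on a test cluster first.

```shell
#!/bin/sh
# Sketch: safemode-based manual checkpoint, to be run as the HDFS superuser.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

manual_checkpoint() {
    # The namespace becomes read-only; client writes fail until safemode is left.
    run hdfs dfsadmin -safemode enter || return 1
    # Save the in-memory namespace as a new fsimage.
    run hdfs dfsadmin -saveNamespace
    status=$?
    # Leave safemode even if the save failed, so the cluster is not left read-only.
    run hdfs dfsadmin -safemode leave || return 1
    return "$status"
}

manual_checkpoint
```

Because the namespace is read-only between the first and last command, this only suits clusters that can tolerate a short write outage.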

Re: Checkpointing in a HDFS HA environment

New Contributor
Thanks for your response. However, it won't work, since the "hdfs secondarynamenode" command doesn't work when you have an HA cluster.

Putting the active NameNode in safemode in order to checkpoint it, as described in the first part of the article you linked, is not a viable option for the cluster I'm working on.
It might be realistic in non-production environments that can tolerate downtime.

Re: Checkpointing in a HDFS HA environment

New Contributor

As the official documentation says:

Note that, in an HA cluster, the Standby NameNode also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.

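You should not deploy a CheckpointNode or BackupNode in an HA cluster, because the Standby NameNode already performs checkpoints. The "hdfs namenode -checkpoint" command starts a CheckpointNode, which is why it fails here.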
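
For the metadata backup part of the question, one option that does not need a CheckpointNode is `hdfs dfsadmin -fetchImage`, which downloads the most recent fsimage from the NameNode into a local directory. The sketch below also prints the two settings that control how often checkpoints are triggered (`dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns`). The backup path `/backup/hdfs-meta`, the `run` helper, and the `DRY_RUN` switch are assumptions made for illustration, not part of Hadoop.

```shell
#!/bin/sh
# Sketch: inspect checkpoint settings and pull a copy of the latest fsimage.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
# Hypothetical backup location; change it to suit your environment.
BACKUP_DIR="${BACKUP_DIR:-/backup/hdfs-meta}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

show_checkpoint_settings() {
    # Seconds between checkpoints.
    run hdfs getconf -confKey dfs.namenode.checkpoint.period
    # Number of uncheckpointed transactions that forces a checkpoint.
    run hdfs getconf -confKey dfs.namenode.checkpoint.txns
}

backup_fsimage() {
    dest="$1"
    run mkdir -p "$dest" || return 1
    # Download the most recent fsimage from the NameNode into $dest.
    run hdfs dfsadmin -fetchImage "$dest"
}

show_checkpoint_settings
backup_fsimage "$BACKUP_DIR"
```

Running `backup_fsimage` from a scheduled job on an edge node would give periodic off-cluster copies without putting the NameNode into safemode.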