New Contributor
Posts: 5
Registered: ‎07-20-2017

Checkpointing in a HDFS HA environment

I've recently set up an HA cluster in my test environment.

I was trying to take a checkpoint on my HDFS filesystem using the "hdfs namenode -checkpoint" command.

It did not work: it reported that the directory "/tmp/hadoop-hdfs/dfs/name" is "in an inconsistent state". I checked and the directory did not exist. After I created it, I received the following error:

java.lang.IllegalStateException: Unexpected state: OPEN_FOR_READING
	at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:245)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1235)
	at org.apache.hadoop.hdfs.server.namenode.BackupNode$BNHAContext.startActiveServices(BackupNode.java:471)
	at org.apache.hadoop.hdfs.server.namenode.BackupState.enterState(BackupState.java:51)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:847)
	at org.apache.hadoop.hdfs.server.namenode.BackupNode.<init>(BackupNode.java:89)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1523)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1610)

I did not find any documentation as to what the solution to the problem might be.

My assumption is that manual checkpointing might be disabled in an HA environment.

Any help regarding this topic and HDFS metadata backups (specifically in an HA cluster) would be much appreciated.

Thanks!

Explorer
Posts: 22
Registered: ‎01-08-2016

Re: Checkpointing in a HDFS HA environment

Are your cluster and HDFS working fine? Are you able to ingest data and run jobs?

Check the fsimage and edit log locations in the NameNode configuration.

New Contributor
Posts: 5
Registered: ‎07-20-2017

Re: Checkpointing in a HDFS HA environment

Yes, the cluster is working fine.
The fsimage location is set to /dfs/nn.
The edit log location variable was empty, but its description says it should point to the same path as the "NameNode Data Directories" variable.
Just to make sure, I set the edit log directory path explicitly instead of relying on the default, and the error I listed earlier still persists.
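
To confirm which directories the NameNode actually resolves, rather than what the management UI displays, the effective values can be read with `hdfs getconf`. This is a minimal sketch, assuming the standard property names `dfs.namenode.name.dir` (fsimage) and `dfs.namenode.edits.dir` (edit log). The `show_nn_dirs` helper and the `HDFS_BIN` variable are made up for this example.

```shell
#!/usr/bin/env bash
# Print the effective fsimage and edit log directories.
# HDFS_BIN is a hypothetical override so the function can be exercised
# without a Hadoop installation; it defaults to the real `hdfs` CLI.
HDFS_BIN="${HDFS_BIN:-hdfs}"

show_nn_dirs() {
  local key
  for key in dfs.namenode.name.dir dfs.namenode.edits.dir; do
    # `hdfs getconf -confKey <key>` prints the resolved value of one property.
    printf '%s = %s\n' "$key" "$("$HDFS_BIN" getconf -confKey "$key")"
  done
}
```

Calling `show_nn_dirs` on a NameNode host should print one line per property. If both lines show the same path, the edit log is simply following the fsimage directory.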

Posts: 519
Topics: 14
Kudos: 92
Solutions: 45
Registered: ‎09-02-2016

Re: Checkpointing in a HDFS HA environment

@DataK1ng

I would recommend taking a backup copy of your NameNode before doing any manual activity on it.

You may try the procedure below, using the active/standby nodes instead of the primary/secondary NameNodes mentioned in the article:

https://community.hortonworks.com/content/supportkb/49438/how-to-manually-checkpoint.html
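
For readers who cannot open the link, a manual checkpoint through safe mode is commonly done with the `dfsadmin` sequence sketched below. This is an assumption about what the article describes, not a copy of it. The `manual_checkpoint` helper and `HDFS_BIN` variable are hypothetical names for this example.

```shell
#!/usr/bin/env bash
# Sketch of a manual checkpoint through safe mode. Run as the HDFS superuser.
# HDFS_BIN is a hypothetical override for dry runs; default is the real CLI.
HDFS_BIN="${HDFS_BIN:-hdfs}"

manual_checkpoint() {
  # Safe mode makes the namespace read-only, so writes fail until it is left.
  "$HDFS_BIN" dfsadmin -safemode enter || return 1
  # saveNamespace writes a fresh fsimage and resets the edit log.
  "$HDFS_BIN" dfsadmin -saveNamespace
  local rc=$?
  # Always try to leave safe mode, even if the save failed.
  "$HDFS_BIN" dfsadmin -safemode leave
  return "$rc"
}
```

Because clients cannot write while safe mode is on, this amounts to a short outage for writers, which matters on a production cluster.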

New Contributor
Posts: 5
Registered: ‎07-20-2017

Re: Checkpointing in a HDFS HA environment

Thanks for your response; however, it won't work, since the "hdfs secondarynamenode" command doesn't work when you have an HA cluster.

Putting the active NameNode in safe mode in order to checkpoint it, as described in the first part of the link you gave me, is not a viable solution for the cluster I'm working on.
It might be realistic in non-production environments that can tolerate downtime.

New Contributor
Posts: 3
Registered: ‎04-23-2017

Re: Checkpointing in a HDFS HA environment

As the official documentation states:

Note that, in an HA cluster, the Standby NameNode also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.

You should not deploy a Checkpoint node or Backup node in an HA cluster; the Standby NameNode already performs the checkpoints. Note that "hdfs namenode -checkpoint" starts a Checkpoint node (hence the BackupNode frames in the stack trace above), which is exactly what the documentation warns against.
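
Since the Standby NameNode handles checkpointing on its own, a metadata backup in an HA cluster can be reduced to downloading the latest fsimage. Below is a minimal sketch, assuming `hdfs dfsadmin -fetchImage` is available in your Hadoop version. `BACKUP_ROOT`, `HDFS_BIN`, and `backup_fsimage` are hypothetical names chosen for this example, and the default backup path is only a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: download the newest fsimage for an off-cluster metadata backup.
# BACKUP_ROOT and HDFS_BIN are hypothetical names used only in this example.
HDFS_BIN="${HDFS_BIN:-hdfs}"
BACKUP_ROOT="${BACKUP_ROOT:-/backup/hdfs-meta}"

backup_fsimage() {
  # One timestamped directory per run keeps older images around.
  local dest
  dest="$BACKUP_ROOT/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest" || return 1
  # -fetchImage downloads the most recent fsimage from the NameNode into
  # the given local directory; it does not require safe mode.
  "$HDFS_BIN" dfsadmin -fetchImage "$dest"
}
```

Running `backup_fsimage` from cron on an edge node would be one way to use it. How fresh the downloaded image is depends on how often the Standby checkpoints, which is governed by `dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns`.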