Explorer
Posts: 8
Registered: ‎03-14-2018
Accepted Solution

Process to Start StandBy NameNode

Hi everyone,

 

My system has 2 DataNodes, 2 NameNodes, 3 JournalNodes, and 3 ZooKeeper services.

 

I have configured the NameNode cluster successfully. When browsing the admin page at namenode:50070, I see one NameNode with status active and one with status standby. => OK

 

When I stop the active NameNode, the standby one becomes active. => OK

 

But the problem is: how do I start the stopped NameNode again?

 

I did the following:

 

sudo -u hdfs hdfs namenode -bootstrapStandby -force

/etc/init.d/hadoop-hdfs-namenode start

With the above process the NameNode sometimes starts OK in standby mode, but sometimes it starts in active mode, and then I have 2 active nodes (split brain!).

 

So what did I do wrong? What is the right process to start a stopped NameNode again?

 

Thank you
Posts: 1,754
Kudos: 371
Solutions: 279
Registered: ‎07-31-2013

Re: Process to Start StandBy NameNode


An HA HDFS installation requires you to run a Failover Controller on each of
the NameNodes, along with a ZooKeeper service. These controllers take care
of transitioning the NameNodes so that only one is active while the other
becomes standby.

It appears that you're using a CDH package based (non-CM) installation
here, so please follow the guide starting at
https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_hag_hdfs_ha_intro.html#topic_2_1...,
following instructions that are under the 'Command-Line' parts instead of
Cloudera Manager ones.

 


@phaothu wrote:

But the problem is: how do I start the stopped NameNode again?

 

I did the following:

 

sudo -u hdfs hdfs namenode -bootstrapStandby -force

/etc/init.d/hadoop-hdfs-namenode start

With the above process the NameNode sometimes starts OK in standby mode, but sometimes it starts in active mode, and then I have 2 active nodes (split brain!).

 

So what did I do wrong? What is the right process to start a stopped NameNode again?

 

 


Just start it up. The bootstrap command must only be run for a fresh new NameNode, not on every restart of a previously running NameNode.

 

It's worth noting that Standby and Active are just states of the very same NameNode. The Standby NameNode is not a special daemon; it's just a state of the NameNode.
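For a package-based install like the one described above, a plain restart can be sketched like this (service script and node IDs are taken from this thread's config; treat it as a sketch, not an exact recipe):

```shell
# Restart a previously-initialized NameNode -- no -bootstrapStandby needed.
# With a running ZKFC, the restarted NameNode should come up as standby.
sudo /etc/init.d/hadoop-hdfs-namenode start

# Confirm the HA states (node1/node2 are the IDs from dfs.ha.namenodes.mycluster).
sudo -u hdfs hdfs haadmin -getServiceState node1
sudo -u hdfs hdfs haadmin -getServiceState node2
```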

Posts: 518
Topics: 14
Kudos: 87
Solutions: 45
Registered: ‎09-02-2016

Re: Process to Start StandBy NameNode

@phaothu

 

To do this via CM:

 

Login as admin to CM -> HDFS -> Instances -> 'Federation and high availability' button -> Action -> Manual Failover

Explorer
Posts: 8
Registered: ‎03-14-2018

Re: Process to Start StandBy NameNode

Yes, as @Harsh J said, I am using a CDH package-based (non-CM) installation. I will share more of my config.

I have 3 nodes: node1, node2, node3

 

ZooKeeper on all 3 nodes (all 3 with the same config):

 

maxClientCnxns=50
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
dataLogDir=/var/lib/zookeeper

 

 

hdfs-site.xml on all 3 nodes:

 

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>node1,node2</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.node1</name>
  <value>node1:8020</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.node2</name>
  <value>node2:8020</value>
</property>


<property>
  <name>dfs.namenode.http-address.mycluster.node1</name>
  <value>node1:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.node2</name>
  <value>node2:50070</value>
</property>


<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>

<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/namenode/dfs/jn</value>
</property>

core-site.xml on all 3 nodes:

 

 

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

<property>
  <name>ha.zookeeper.quorum</name>
  <value>node1:2181,node2:2181,node3:2181</value>
</property>

 

The above is the configuration related to the NameNode cluster. I have doubts about the ZooKeeper config; is it enough?

 

Services installed on each node:

Node 1: 

hadoop-hdfs-journalnode  hadoop-hdfs-namenode    hadoop-hdfs-zkfc  zookeeper-server

Node 2:

hadoop-hdfs-journalnode  hadoop-hdfs-namenode    hadoop-hdfs-zkfc  zookeeper-server

Node 3: 

hadoop-hdfs-journalnode  zookeeper-server

 

The first initialization is OK:

Node 1 active, Node 2 standby

 

Stop the NameNode service on Node 1 => Node 2 becomes active => OK

 

But when I start the NameNode service on Node 1 again:

 

Node 1 is active AND Node 2 is active too => fail

 

 

 

Explorer
Posts: 8
Registered: ‎03-14-2018

Re: Process to Start StandBy NameNode

Thanks @saranvisa, but I am not using CM; I installed everything by command line.

Posts: 1,754
Kudos: 371
Solutions: 279
Registered: ‎07-31-2013

Re: Process to Start StandBy NameNode

@phaothu,

> My system has 2 DataNodes, 2 NameNodes, 3 JournalNodes, and 3 ZooKeeper services

To repeat, you need to run the ZKFailoverController daemons in addition to this setup. Please see the guide linked in my previous post and follow it entirely for the command-line setup.

Running just ZK will not give you an HDFS HA solution - you are missing a crucial daemon that interfaces between ZK and HDFS.
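For a CDH package install, setting up the missing controller looks roughly like this (a sketch; the service script name assumes the same package layout as the rest of this thread):

```shell
# One-time only: create the HA state znode in ZooKeeper.
# Run on one NameNode host, with the ZKFCs stopped.
sudo -u hdfs hdfs zkfc -formatZK

# Then start the ZKFailoverController on each NameNode host (node1 and node2):
sudo /etc/init.d/hadoop-hdfs-zkfc start
```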
Explorer
Posts: 8
Registered: ‎03-14-2018

Re: Process to Start StandBy NameNode

Dear @Harsh J,

Does 'Automatic Failover Configuration' require the 'Fencing Configuration' section? Are these two sections independent, or do I need both to set up automatic failover?

 

Because I got this error:

 

You must configure a fencing method before using automatic failover.
org.apache.hadoop.ha.BadFencingConfigurationException: No fencer configured for NameNode at node1/x.x.x.x:8020
	at org.apache.hadoop.hdfs.tools.NNHAServiceTarget.checkFencingConfigured(NNHAServiceTarget.java:132)
	at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:225)
	at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:60)
	at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:171)
	at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:167)
	at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:444)
	at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:167)
	at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:192)
2018-09-10 15:56:53,262 INFO org.apache.zookeeper.ZooKeeper: Session: 0x365c2b22a1e0000 closed
2018-09-10 15:56:53,262 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down

 

If I need both of them, then:

 

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
</property>

What is /path/to/my/script.sh supposed to be? I am not clear about the contents of this script; please explain, and maybe give me an example.

 

Thank you

 

Posts: 1,754
Kudos: 371
Solutions: 279
Registered: ‎07-31-2013

Re: Process to Start StandBy NameNode

The fencing config requirement still exists, and you could configure a valid fencer if you wish, but with JournalNodes involved you can simply use the following as your fencer, since the QJM effectively fences NameNodes by crashing a stale writer under its single-elected-writer model:

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>
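With that fencer in place, a quick failover drill could look like this (a sketch using the node IDs from this thread; run the stop/start on whichever host is currently active):

```shell
# Before: one NameNode active, one standby.
sudo -u hdfs hdfs haadmin -getServiceState node1
sudo -u hdfs hdfs haadmin -getServiceState node2

# Stop the active NameNode; the ZKFC should promote the standby.
sudo /etc/init.d/hadoop-hdfs-namenode stop

# Start it again; it should rejoin as standby, not become a second active.
sudo /etc/init.d/hadoop-hdfs-namenode start
```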
Explorer
Posts: 8
Registered: ‎03-14-2018

Re: Process to Start StandBy NameNode

@Harsh J: yes, with

 

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>

It is working perfectly now.

 

Thank you very much!
