Standby NameNode can't start in Ambari cluster
Labels: Apache Ambari, Apache Hadoop
Created ‎10-23-2017 04:22 PM
We are trying to start the "Standby NameNode (HDFS)" on the master01 machine in an Ambari cluster, version 2.6,
but it fails to start.
We get the following log:
ERROR namenode.NameNode (NameNode.java:main(1774)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 13361263
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:203)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:693)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:289)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1045)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:703)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:688)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:752)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:992)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:976)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1701)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 13361262; expected file to go up to 13361312
What could be the problem, and how can we fix it so that the service will start?
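Editor's note: the "premature end-of-file" message suggests that the edit log segment expected to cover transactions 13361263 to 13361312 may be truncated or empty. A minimal, read-only sketch for locating that segment is below. It assumes finalized segments are named edits_<startTxId>-<endTxId> with zero-padded IDs, and the function name find_segment is hypothetical. The directory to inspect is passed as an argument, because the damaged copy could sit either in the NameNode's local "current" directory or on a JournalNode.

```shell
#!/bin/sh
# Sketch: print every finalized edit-log segment in a directory whose
# transaction-ID range covers the given txid, together with its size.
# Read-only: nothing is modified. find_segment is a hypothetical helper name.

find_segment() {
  dir=$1
  txid=$2
  for f in "$dir"/edits_*-*; do
    [ -e "$f" ] || continue                 # glob matched nothing
    name=${f##*/}                           # strip the directory part
    range=${name#edits_}                    # "<start>-<end>"
    start=${range%%-*}
    end=${range##*-}
    # skip anything whose start or end is not purely numeric
    case "$start" in ''|*[!0-9]*) continue ;; esac
    case "$end" in ''|*[!0-9]*) continue ;; esac
    # awk compares the zero-padded IDs as plain decimal numbers
    if awk -v s="$start" -v e="$end" -v t="$txid" \
         'BEGIN { exit !(s + 0 <= t + 0 && t + 0 <= e + 0) }'; then
      printf '%s\t%s bytes\n' "$name" "$(wc -c < "$f" | tr -d ' ')"
    fi
  done
}

# Example (directory taken from later in this thread, txid from the log):
# find_segment /hadoop/hdfs/namenode/current 13361263
```

A segment that shows up with a size much smaller than its neighbours would be consistent with the exception in the log, but that is only a hint, not a confirmed diagnosis.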
Created ‎10-23-2017 05:05 PM
1. Was this node working fine earlier?
2. Do you have the correct "/etc/hosts" entry? Is the hostname in upper case or mixed case?
3. Is this a kerberized cluster?
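Editor's note: questions 2 and 3 above can be checked from a shell. The sketch below is only an illustration; the function name check_host_entry is hypothetical. It reports whether a hostname contains upper-case letters and whether a hosts file lists it exactly or only when case is ignored.

```shell
#!/bin/sh
# Sketch: check the case of a hostname and look it up in a hosts file.
# check_host_entry is a hypothetical helper name, not part of Ambari or Hadoop.

check_host_entry() {
  hosts_file=$1
  host=$2
  case "$host" in
    *[[:upper:]]*) echo "case: hostname contains upper-case letters" ;;
    *)             echo "case: hostname is all lower case" ;;
  esac
  # Compare each name field (after the IP address) against the hostname,
  # skipping comment lines and trailing comments.
  awk -v h="$host" '
    /^[[:space:]]*#/ { next }
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break
        if ($i == h) exact = 1
        else if (tolower($i) == tolower(h)) loose = 1
      }
    }
    END {
      if (exact)      print "hosts: exact entry found"
      else if (loose) print "hosts: entry found only when ignoring case"
      else            print "hosts: no entry found"
    }' "$hosts_file"
}

# Example for question 2:
# check_host_entry /etc/hosts "$(hostname -f)"
# For question 3, this usually prints "simple" or "kerberos":
# hdfs getconf -confKey hadoop.security.authentication
```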
Created ‎10-23-2017 05:12 PM
Was this node working fine earlier? - Yes.
Created ‎10-23-2017 05:14 PM
Do you have the correct "/etc/hosts" entry? - Yes.
Created ‎10-23-2017 05:15 PM
Is this a kerberized cluster? - What do you mean? We have 3 master machines + 2 worker machines.
Created ‎10-23-2017 05:17 PM
If it is a test cluster, then you may try the following (at your own risk):
1. If your Active NameNode is running fine, then you can try to bring it out of safe mode with forceExit on the Active NN.
2. Then take a backup of the directory "/hadoop/hdfs/namenode/current".
3. After taking the backup mentioned in the previous step, remove the directory contents "/hadoop/hdfs/namenode/current/*".
4. Perform the bootstrapStandby.
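Editor's note: the four steps above can be sketched as the script below. It is a dry run by default and only prints the commands; nothing is executed unless DRY_RUN=0 is set. Several things here are assumptions and not stated in the answer: the backup location in BACKUP is hypothetical, the helper names run, leave_safemode and recover_standby are made up for illustration, the commands are assumed to run as the HDFS service user, and steps 2 to 4 are assumed to apply to the standby host, since that is where bootstrapStandby is run. If forceExit is not accepted by your Hadoop version, `hdfs dfsadmin -safemode leave` is the older form.

```shell
#!/bin/sh
# Sketch of the four recovery steps. Dry run by default: commands are printed,
# not executed, unless DRY_RUN=0 is set in the environment.
# Assumptions: NN_DIR is the NameNode metadata directory from this thread,
# BACKUP is a hypothetical backup location, and the standby NameNode process
# is stopped before recover_standby is run for real.

NN_DIR="${NN_DIR:-/hadoop/hdfs/namenode}"
BACKUP="${BACKUP:-/tmp/namenode-current-backup.tar.gz}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: run on the ACTIVE NameNode host.
leave_safemode() {
  run hdfs dfsadmin -safemode forceExit
}

# Steps 2-4: run on the STANDBY NameNode host (assumption, see note above).
# Each step runs only if the previous one succeeded, so the directory is
# never emptied without a backup.
recover_standby() {
  run tar -czf "$BACKUP" -C "$NN_DIR" current &&
  run find "$NN_DIR/current" -mindepth 1 -delete &&
  run hdfs namenode -bootstrapStandby
}

# Example:
#   leave_safemode        # on the active host
#   recover_standby       # on the standby host
```

Reviewing the printed commands first and only then re-running with DRY_RUN=0 keeps the destructive step 3 from happening by accident.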
Created ‎10-23-2017 06:22 PM
Can you please show me how to bring the Active NN out of safe mode with forceExit, and how to run bootstrapStandby?
Created ‎10-23-2017 06:33 PM
Created ‎10-23-2017 06:50 PM
My active node is on master02, so do I need to do the steps on the master02 machine (including the backup)?
Created ‎10-24-2017 06:08 AM
Hi Jay, I am still waiting for your answer. Do you mean to run hdfs dfsadmin -safemode leave on the NameNode that is running (master02)? (The standby NameNode is on master01.) Second, should the backup of the directory "/hadoop/hdfs/namenode/current" be taken on the active NameNode? I ask because it seems more logical to do this on the standby (master01 machine).
