Created 10-23-2017 04:22 PM
We are trying to start the "Standby NameNode (HDFS)" on the master01 machine in our Ambari cluster (version 2.6), but it will not start.
We get the following logs:
ERROR namenode.NameNode (NameNode.java:main(1774)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 13361263
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:203)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:693)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:289)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1045)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:703)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:688)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:752)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:992)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:976)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1701)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 13361262; expected file to go up to 13361312
What could be the problem, and how do we fix it so the service will start?
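The "premature end-of-file at txid 13361262; expected file to go up to 13361312" line means an edit-log segment is truncated: its file name promises transactions up to 13361312, but the data stops early. Since the segment file names encode their transaction ranges (edits_<first_txid>-<last_txid>), a quick way to narrow down the damage is to check the ranges for gaps. A minimal sketch, assuming the journal directory used later in this thread (adjust to your dfs.journalnode.edits.dir / dfs.namenode.name.dir):

```shell
# Look for a gap in edit-log segment coverage by parsing the
# edits_<first_txid>-<last_txid> file names. Purely read-only.
check_edits_dir() {
  dir="$1"
  prev_end=""
  # zero-padded names sort lexicographically in txid order
  for f in $(ls "$dir" | grep -E '^edits_[0-9]+-[0-9]+$' | sort); do
    start=$(printf '%s\n' "$f" | sed -E 's/^edits_0*([0-9]+)-0*([0-9]+)$/\1/')
    end=$(printf '%s\n' "$f"   | sed -E 's/^edits_0*([0-9]+)-0*([0-9]+)$/\2/')
    if [ -n "$prev_end" ] && [ "$start" -ne $((prev_end + 1)) ]; then
      echo "GAP: expected txid $((prev_end + 1)) but next segment starts at $start ($f)"
    fi
    prev_end="$end"
  done
  echo "last txid covered: $prev_end"
}

# Example (path is an assumption from this thread):
# check_edits_dir /hadoop/hdfs/journal/hdfsha/current
```

A segment that is present but truncated inside will not show up as a name gap; for that, the segment around txid 13361263 can be dumped with the offline edits viewer (`hdfs oev -i <edits_file> -o edits.xml`) to see where replay stops.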
Created 10-23-2017 05:05 PM
1. Was this Node working fine earlier?
2. Do you have the correct "/etc/hosts" entries? Are the hostnames in upper case or mixed case?
3. Is this a kerberized cluster?
Created 10-24-2017 08:27 AM
We found this workaround: hadoop namenode -recover. Is there another solution for our problem?
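For reference, `-recover` walks the edit log and prompts interactively when it hits a corrupt region, so the metadata directory should be backed up first. A hedged sketch of that workaround as a printed plan (nothing is executed; the name-directory path is an assumption, check dfs.namenode.name.dir in hdfs-site.xml for the real one):

```shell
# Print the -recover workaround as a reviewable plan.
# NN_DIR is an assumed dfs.namenode.name.dir; override before use.
NN_DIR="${NN_DIR:-/hadoop/hdfs/namenode}"

plan_recover() {
  echo "su - hdfs"
  echo "cp -rp $NN_DIR/current $NN_DIR/current.orig   # back up metadata first"
  echo "hadoop namenode -recover                      # prompts on corrupt edits"
}
```

Note that on an HA cluster, re-syncing the broken standby from the healthy active NameNode is usually preferable to editing metadata with `-recover`.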
Created 10-24-2017 09:20 AM
Hi Jay, just to clarify your instructions (the standby NameNode is on master01 and the active NameNode is on master02), can you approve these steps?
su - hdfs
hdfs dfsadmin -safemode leave ( on master02 )
cp -rp /hadoop/hdfs/journal/hdfsha/current /hadoop/hdfs/journal/hdfsha/current.orig ( on master02 )
rm -f /hadoop/hdfs/journal/hdfsha/current/* ( on master02 )
hdfs namenode -bootstrapStandby ( on master01 )
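The steps above can be collected into one function for review. Host roles and the journal path are taken from this thread; with DRY_RUN=1 (the default here) the commands are only printed, so the sequence can be checked before anything destructive runs:

```shell
# Dry-run wrapper around the standby recovery steps from the thread.
# JOURNAL is the path used above; confirm it matches your cluster.
JOURNAL="${JOURNAL:-/hadoop/hdfs/journal/hdfsha/current}"

run() {
  if [ "${DRY_RUN:-1}" -eq 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

recover_standby() {
  # on master02 (the active NameNode), as the hdfs user:
  run hdfs dfsadmin -safemode leave
  run cp -rp "$JOURNAL" "$JOURNAL.orig"
  run sh -c "rm -f $JOURNAL/*"
  # on master01 (the failed standby), as the hdfs user:
  run hdfs namenode -bootstrapStandby
}
```

Keeping the `current.orig` copy until the standby is confirmed healthy means the journal can be restored if the bootstrap fails.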
Created 12-29-2017 07:26 AM
I had created a new HA-enabled cluster on EC2 instances. It came up without any issues. I then installed Kerberos on two machines (master and slave) and ran the Ambari Kerberos-enable wizard. At the last step, starting the services, the NameNode failed to start on both nodes with the same error as above. How did this error occur on a fresh setup, and how can I prevent it in my next environment?