
CDH 4.7 HA NameNodes won't start


OK, no laughing.

I have inherited a CDH 4.7 cluster, and last week someone decided it would be a good idea to apply kernel updates and then reboot all nodes at once (bang head here). This of course caused a huge problem, and now I can't start either the active or the standby NameNode. As the log below shows, the NameNode fast-forwards through the edit log streams, finishes loading the FSImage, and then just fails. Any idea how I can go about fixing this?

Note: this cluster was never installed with Cloudera Manager (CM), but it had been running for years until now.

 

========

2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Edits file http://hdpdq05b.int:8480/getJournal?jid=DQcluster&segmentTxId=2516827520&storageInfo=-40%3A162882830..., http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516827520&storageInfo=-40%3A162882830... of size 296750 edits # 2413 loaded in 0 seconds
2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@16f68d93 expecting start txid #2516829933
2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A162882830..., http://hdpdq05b.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A162882830...' to transaction ID 2516478939
2020-08-16 10:34:23,310 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A162882830...' to transaction ID 2516478939
2020-08-16 10:34:23,322 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Edits file http://hdpdq05a.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A162882830..., http://hdpdq05b.int:8480/getJournal?jid=DQcluster&segmentTxId=2516829933&storageInfo=-40%3A162882830... of size 1048576 edits # 407 loaded in 0 seconds
2020-08-16 10:34:23,393 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 530 entries 107888 lookups
2020-08-16 10:34:23,404 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 14078 msecs
2020-08-16 10:34:23,595 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2020-08-16 10:34:23,617 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2020-08-16 10:34:23,619 WARN org.apache.hadoop.hdfs.server.common.Util: Path /disk1/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2020-08-16 10:34:23,639 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.FileNotFoundException: File does not exist: /acs/prod/data/B/2020/08/12/1455/out/_logs/history/job_202008120816_0353_1597244146771_mapred_oozie%3Aaction[...]cs-B-wf1%3AA%3Dphase
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4487)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setBlockTotal(FSNamesystem.java:4458)

=====
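
For triage, the replay lines in a log like the one above can be summarized with a small script. This is only a rough sketch, not a fix: the function names `summarize` and `check_contiguous` are made up for illustration, and the regexes assume the log line format shown in this post. It checks whether each loaded segment's start txid plus its edit count lines up with the next segment's start, and it pulls out any FATAL lines.

```python
import re

# Patterns assume the NameNode log line format quoted in this post.
EXPECT_RE = re.compile(r"expecting start txid #(\d+)")
FAST_FORWARD_RE = re.compile(r"Fast-forwarding stream .* to transaction ID (\d+)")
LOADED_RE = re.compile(r"segmentTxId=(\d+).* edits # (\d+) loaded")
FATAL_RE = re.compile(r" FATAL ")


def summarize(lines):
    """Collect edit-log replay facts and FATAL lines from NameNode log lines."""
    summary = {"expected": [], "fast_forward": [], "segments": [], "fatal": []}
    for line in lines:
        m = EXPECT_RE.search(line)
        if m:
            summary["expected"].append(int(m.group(1)))
        m = FAST_FORWARD_RE.search(line)
        if m:
            summary["fast_forward"].append(int(m.group(1)))
        m = LOADED_RE.search(line)
        if m:
            # (segment start txid, number of edits loaded from that segment)
            summary["segments"].append((int(m.group(1)), int(m.group(2))))
        if FATAL_RE.search(line):
            summary["fatal"].append(line.strip())
    return summary


def check_contiguous(segments):
    """Return (start, count, next_start) triples where start + count != next_start."""
    gaps = []
    for (start, count), (next_start, _) in zip(segments, segments[1:]):
        if start + count != next_start:
            gaps.append((start, count, next_start))
    return gaps


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python nn_log_summary.py namenode.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        result = summarize(fh)
    print("segments (start txid, edits):", result["segments"])
    print("gaps between segments:", check_contiguous(result["segments"]))
    for fatal_line in result["fatal"]:
        print("FATAL:", fatal_line)
```

On the excerpt above, 2516827520 + 2413 = 2516829933, which matches the next expected start txid, so the two segments shown are contiguous. The FATAL line comes after "Finished loading FSImage", with the stack trace pointing at `getCompleteBlocksTotal`.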
