Hi All,
I have a 3-node HDP cluster in which 2 nodes act as both NameNode and DataNode. We recently extended the HDFS disk from 1 TB to 3 TB. After that, the system was rebooted, and since then my NameNodes are not coming up.
The NameNode logs are below:
2022-05-01 14:00:49,272 INFO namenode.FSNamesystem (FSNamesystem.java:initRetryCache(979)) - Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2022-05-01 14:00:49,274 INFO util.GSet (LightWeightGSet.java:computeCapacity(395)) - Computing capacity for map NameNodeRetryCache
2022-05-01 14:00:49,274 INFO util.GSet (LightWeightGSet.java:computeCapacity(396)) - VM type = 64-bit
2022-05-01 14:00:49,275 INFO util.GSet (LightWeightGSet.java:computeCapacity(397)) - 0.029999999329447746% max memory 2.0 GB = 621.3 KB
2022-05-01 14:00:49,275 INFO util.GSet (LightWeightGSet.java:computeCapacity(402)) - capacity = 2^16 = 65536 entries
2022-05-01 14:00:49,299 INFO common.Storage (Storage.java:tryLock(776)) - Lock on /grid1/hadoop/hdfs/namenode/in_use.lock acquired by nodename 3716@digaudanaqamn2.gt.com
2022-05-01 14:00:49,354 INFO common.Storage (Storage.java:tryLock(776)) - Lock on /mnt/resource/hadoop/hdfs/namenode/in_use.lock acquired by nodename 3716@digaudanaqamn2.gt.com
2022-05-01 14:00:49,354 INFO namenode.FSImage (FSImage.java:recoverTransitionRead(277)) - Storage directory /mnt/resource/hadoop/hdfs/namenode is not formatted.
2022-05-01 14:00:49,354 INFO namenode.FSImage (FSImage.java:recoverTransitionRead(278)) - Formatting ...
2022-05-01 14:00:49,354 INFO common.Storage (Storage.java:clearDirectory(340)) - Will remove files: []
2022-05-01 14:00:49,355 WARN namenode.FSImage (NNStorage.java:readAndInspectDirs(1049)) - Storage directory Storage Directory /mnt/resource/hadoop/hdfs/namenode contains no VERSION file. Skipping...
2022-05-01 14:00:49,379 INFO namenode.FSImageTransactionalStorageInspector (FSImageTransactionalStorageInspector.java:inspectDirectory(85)) - No version file in /mnt/resource/hadoop/hdfs/namenode
2022-05-01 14:00:49,773 INFO namenode.FSImage (FSImage.java:loadFSImageFile(745)) - Planning to load image: FSImageFile(file=/grid1/hadoop/hdfs/namenode/current/fsimage_0000000000029731317, cpktTxId=0000000000029731317)
2022-05-01 14:00:49,868 INFO namenode.FSImageFormatPBINode (FSImageFormatPBINode.java:loadINodeSection(257)) - Loading 187768 INodes.
2022-05-01 14:00:50,935 INFO namenode.FSImageFormatProtobuf (FSImageFormatProtobuf.java:load(184)) - Loaded FSImage in 1 seconds.
2022-05-01 14:00:50,935 INFO namenode.FSImage (FSImage.java:loadFSImage(911)) - Loaded image for txid 29731317 from /grid1/hadoop/hdfs/namenode/current/fsimage_0000000000029731317
2022-05-01 14:00:50,940 INFO namenode.FSImage (FSImage.java:loadEdits(849)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@52eacb4b expecting start txid #29731318
2022-05-01 14:00:50,941 INFO namenode.FSImage (FSEditLogLoader.java:loadFSEdits(142)) - Start loading edits file http://digaudanaqamn2.gt.com:8480/getJournal?jid=gtqa&segmentTxId=29606282&storageInfo=-63%3A8977857..., http://digaudanaqamn3.gt.com:8480/getJournal?jid=gtqa&segmentTxId=29606282&storageInfo=-63%3A8977857...
2022-05-01 14:00:50,946 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream 'http://digaudanaqamn2.gt.com:8480/getJournal?jid=gtqa&segmentTxId=29606282&storageInfo=-63%3A8977857..., http://digaudanaqamn3.gt.com:8480/getJournal?jid=gtqa&segmentTxId=29606282&storageInfo=-63%3A8977857...' to transaction ID 29731318
2022-05-01 14:00:50,946 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream 'http://digaudanaqamn2.gt.com:8480/getJournal?jid=gtqa&segmentTxId=29606282&storageInfo=-63%3A8977857...' to transaction ID 29731318
2022-05-01 14:00:51,588 ERROR namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(242)) - Encountered exception on operation MkdirOp [length=0, inodeId=11343504, path=/tmp/hive/nifi/9de63ed4-db2a-4164-b142-2e331cd008e3/hive_2022-04-06_10-25-40_878_569887095825365790-87, timestamp=1649240741042, permissions=nifi:hdfs:rwx------, aclEntries=null, opCode=OP_MKDIR, txid=29731504, xAttrs=[]]
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirForEditLog(FSDirMkdirOp.java:182)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:572)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:852)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:707)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
2022-05-01 14:00:51,695 INFO namenode.FSNamesystem (FSNamesystem.java:writeUnlock(1689)) - FSNamesystem write lock held for 2416 ms via
java.lang.Thread.getStackTrace(Thread.java:1556)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1690)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
Number of suppressed write-lock reports: 0
Longest write-lock held interval: 2416
2022-05-01 14:00:51,695 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(726)) - Encountered exception loading fsimage
java.io.IOException: java.lang.IllegalStateException
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:244)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:852)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:707)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
Caused by: java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirForEditLog(FSDirMkdirOp.java:182)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:572)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
... 12 more
2022-05-01 14:00:51,719 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@digaudanaqamn2.gt.com:50070
2022-05-01 14:00:51,820 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2022-05-01 14:00:51,821 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2022-05-01 14:00:51,824 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2022-05-01 14:00:51,824 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NameNode metrics system shutdown complete.
2022-05-01 14:00:51,824 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: java.lang.IllegalStateException
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:244)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:852)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:707)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
Caused by: java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirForEditLog(FSDirMkdirOp.java:182)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:572)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
... 12 more
2022-05-01 14:00:51,826 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
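To make the failing edit easier to spot in the log above, here is a minimal sketch that pulls the operation, txid, and path out of the FSEditLogLoader error lines. The helper name `failing_ops` and the regex are my own assumptions based only on the log format shown in this post; they are not part of Hadoop.

```python
import re

# Matches the FSEditLogLoader error line and captures the operation name,
# the path, and the txid. The pattern is derived from the log lines above
# and is an assumption, not an official log format.
ERROR_RE = re.compile(
    r"ERROR namenode\.FSEditLogLoader .* Encountered exception on operation "
    r"(?P<op>\w+) \[.*?path=(?P<path>[^,]+),.*?txid=(?P<txid>\d+)"
)


def failing_ops(lines):
    """Return unique (op, txid, path) tuples in order of first appearance."""
    seen = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            entry = (match.group("op"), int(match.group("txid")), match.group("path"))
            if entry not in seen:
                seen.append(entry)
    return seen
```

Passing it the lines of the NameNode log (for example an open file object) should list each failing edit once, even when the same error is logged more than once. For the log above that is the single MkdirOp at txid 29731504.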
Any leads?