Encountered exception loading fsimage, NameNode is not formatted

Contributor

The NameNode crashes when I try to restart it via CM, with the following error:

INFO Lock on /data/disk1/dfs/nn/in_use.lock acquired by nodename 26396@ip-10-2-0-224.ec2.internal
Warning Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:251)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1166)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:757)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:642)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:713)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:931)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1666)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1733)
Failed to start namenode.
java.io.IOException: NameNode is not formatted.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:251)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1166)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:757)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:642)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:713)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:931)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1666)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1733)

And when I try to run "sudo -u hdfs hdfs namenode -format" on the NameNode host, I get this error:

23/11/30 12:21:35 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Running in secure mode, but config doesn't have a keytab
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:306)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1136)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1733)
23/11/30 12:21:35 INFO util.ExitUtil: Exiting with status 1: java.io.IOException: Running in secure mode, but config doesn't have a keytab
23/11/30 12:21:35 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ip-10-2-0-224.ec2.internal/10.2.0.224

Please help me get the HDFS NameNode running.

20 REPLIES

Rising Star

Hi @George-Megre ,

Running a CLI command against a NameNode managed by Cloudera Manager is a little tricky, as you need to set the right environment, the Hadoop config directory path, and so on. Let's make it work through CM itself. Please verify the following:

  1. The "/data/disk1/dfs/nn/current/VERSION" file is present with valid content.
  2. Check whether an old NameNode instance is still running by executing "ps -ef | grep -i namenode", and kill it if so.

If the steps above don't help, please attach a recursive directory listing of /data/disk1/dfs/nn/ along with core-site.xml and hdfs-site.xml.
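For the first check, "valid content" can be approximated by confirming that the VERSION file parses as key=value pairs and contains the keys a NameNode storage directory normally records. The following is a minimal Python sketch under that assumption; the helper names and the sample values are hypothetical and are not taken from this cluster.

```python
# Keys a NameNode VERSION file normally contains.
REQUIRED_KEYS = {
    "namespaceID", "clusterID", "cTime",
    "storageType", "blockpoolID", "layoutVersion",
}


def parse_version(text):
    """Parse the key=value lines of a VERSION file, skipping comments and blanks."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def version_problems(text):
    """Return a list of human-readable problems; an empty list means nothing obvious is wrong."""
    props = parse_version(text)
    problems = ["missing key: " + key for key in sorted(REQUIRED_KEYS - props.keys())]
    storage_type = props.get("storageType")
    if storage_type is not None and storage_type != "NAME_NODE":
        problems.append("unexpected storageType: " + storage_type)
    return problems


# Hypothetical sample content, for illustration only.
SAMPLE = """#Wed Nov 22 10:36:00 UTC 2023
namespaceID=123456789
clusterID=cluster-example
cTime=1600000000000
storageType=NAME_NODE
blockpoolID=BP-example
layoutVersion=-64
"""

print(version_problems(SAMPLE))
```

To run it against the real file, read /data/disk1/dfs/nn/current/VERSION with sufficient privileges and pass its text to version_problems. This only checks the shape of the file, not whether the IDs match the other NameNode or the JournalNodes.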

Contributor

Hey @Majeti,

Unfortunately, the suggested steps didn't help. The VERSION file is present with valid content, and the ps command shows no NameNode process running on the machine (at all 🤔). Here is the directory listing of /data/disk1/dfs/nn/:

/data/disk1/dfs$ sudo ls -la nn/current
total 1588
drwx------ 2 root root 4096 Nov 22 10:36 .
drwx------ 3 hdfs hadoop 4096 Dec 11 09:19 ..
-rw-r--r-- 1 root root 191 Nov 22 10:36 VERSION
-rw-r--r-- 1 root root 2894 Nov 22 10:36 edits_0000000000002786579-0000000000002786605
-rw-r--r-- 1 root root 8466 Nov 22 10:36 edits_0000000000002786606-0000000000002786671
-rw-r--r-- 1 root root 8216 Nov 22 10:36 edits_0000000000002786672-0000000000002786743
-rw-r--r-- 1 root root 4767 Nov 22 10:36 edits_0000000000002786744-0000000000002786772
-rw-r--r-- 1 root root 8207 Nov 22 10:36 edits_0000000000002786773-0000000000002786818
-rw-r--r-- 1 root root 1965 Nov 22 10:36 edits_0000000000002786819-0000000000002786837
-rw-r--r-- 1 root root 15365 Nov 22 10:36 edits_0000000000002786838-0000000000002786940
-rw-r--r-- 1 root root 15786 Nov 22 10:36 edits_0000000000002786941-0000000000002787070
-rw-r--r-- 1 root root 2162 Nov 22 10:36 edits_0000000000002787071-0000000000002787088
-rw-r--r-- 1 root root 4940 Nov 22 10:36 edits_0000000000002787089-0000000000002787116
-rw-r--r-- 1 root root 3316 Nov 22 10:36 edits_0000000000002787117-0000000000002787144
-rw-r--r-- 1 root root 7246 Nov 22 10:36 edits_0000000000002787145-0000000000002787187
-rw-r--r-- 1 root root 1646 Nov 22 10:36 edits_0000000000002787188-0000000000002787202
-rw-r--r-- 1 root root 1589 Nov 22 10:36 edits_0000000000002787203-0000000000002787216
-rw-r--r-- 1 root root 4332 Nov 22 10:36 edits_0000000000002787217-0000000000002787239
-rw-r--r-- 1 root root 1587 Nov 22 10:36 edits_0000000000002787240-0000000000002787253
-rw-r--r-- 1 root root 5611 Nov 22 10:36 edits_0000000000002787254-0000000000002787287
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002787288-0000000000002787301
-rw-r--r-- 1 root root 1251 Nov 22 10:36 edits_0000000000002787302-0000000000002787313
-rw-r--r-- 1 root root 4669 Nov 22 10:36 edits_0000000000002787314-0000000000002787338
-rw-r--r-- 1 root root 1587 Nov 22 10:36 edits_0000000000002787339-0000000000002787352
-rw-r--r-- 1 root root 4330 Nov 22 10:36 edits_0000000000002787353-0000000000002787375
-rw-r--r-- 1 root root 816 Nov 22 10:36 edits_0000000000002787376-0000000000002787383
-rw-r--r-- 1 root root 4330 Nov 22 10:36 edits_0000000000002787384-0000000000002787406
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002787407-0000000000002787420
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002787421-0000000000002787434
-rw-r--r-- 1 root root 4331 Nov 22 10:36 edits_0000000000002787435-0000000000002787457
-rw-r--r-- 1 root root 3695 Nov 22 10:36 edits_0000000000002787458-0000000000002787494
-rw-r--r-- 1 root root 13128 Nov 22 10:36 edits_0000000000002787495-0000000000002787575
-rw-r--r-- 1 root root 8425 Nov 22 10:36 edits_0000000000002787576-0000000000002787639
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002787640-0000000000002787653
-rw-r--r-- 1 root root 4333 Nov 22 10:36 edits_0000000000002787654-0000000000002787676
-rw-r--r-- 1 root root 2808 Nov 22 10:36 edits_0000000000002787677-0000000000002787698
-rw-r--r-- 1 root root 4332 Nov 22 10:36 edits_0000000000002787699-0000000000002787721
-rw-r--r-- 1 root root 2858 Nov 22 10:36 edits_0000000000002787722-0000000000002787744
-rw-r--r-- 1 root root 2855 Nov 22 10:36 edits_0000000000002787745-0000000000002787767
-rw-r--r-- 1 root root 10117 Nov 22 10:36 edits_0000000000002787768-0000000000002787846
-rw-r--r-- 1 root root 9826 Nov 22 10:36 edits_0000000000002787847-0000000000002787916
-rw-r--r-- 1 root root 4940 Nov 22 10:36 edits_0000000000002787917-0000000000002787944
-rw-r--r-- 1 root root 1960 Nov 22 10:36 edits_0000000000002787945-0000000000002787961
-rw-r--r-- 1 root root 813 Nov 22 10:36 edits_0000000000002787962-0000000000002787969
-rw-r--r-- 1 root root 4327 Nov 22 10:36 edits_0000000000002787970-0000000000002787992
-rw-r--r-- 1 root root 1644 Nov 22 10:36 edits_0000000000002787993-0000000000002788007
-rw-r--r-- 1 root root 4334 Nov 22 10:36 edits_0000000000002788008-0000000000002788030
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002788031-0000000000002788044
-rw-r--r-- 1 root root 1589 Nov 22 10:36 edits_0000000000002788045-0000000000002788058
-rw-r--r-- 1 root root 4331 Nov 22 10:36 edits_0000000000002788059-0000000000002788081
-rw-r--r-- 1 root root 1589 Nov 22 10:36 edits_0000000000002788082-0000000000002788095
-rw-r--r-- 1 root root 4332 Nov 22 10:36 edits_0000000000002788096-0000000000002788118
-rw-r--r-- 1 root root 1587 Nov 22 10:36 edits_0000000000002788119-0000000000002788132
-rw-r--r-- 1 root root 1587 Nov 22 10:36 edits_0000000000002788133-0000000000002788146
-rw-r--r-- 1 root root 17814 Nov 22 10:36 edits_0000000000002788147-0000000000002788278
-rw-r--r-- 1 root root 1584 Nov 22 10:36 edits_0000000000002788279-0000000000002788292
-rw-r--r-- 1 root root 4333 Nov 22 10:36 edits_0000000000002788293-0000000000002788315
-rw-r--r-- 1 root root 1588 Nov 22 10:36 edits_0000000000002788316-0000000000002788329
-rw-r--r-- 1 root root 1584 Nov 22 10:36 edits_0000000000002788330-0000000000002788343
-rw-r--r-- 1 root root 4390 Nov 22 10:36 edits_0000000000002788344-0000000000002788367
-rw-r--r-- 1 root root 1583 Nov 22 10:36 edits_0000000000002788368-0000000000002788381
-rw-r--r-- 1 root root 9140 Nov 22 10:36 edits_0000000000002788382-0000000000002788428
-rw-r--r-- 1 root root 4664 Nov 22 10:36 edits_0000000000002788429-0000000000002788458
-rw-r--r-- 1 root root 1590 Nov 22 10:36 edits_0000000000002788459-0000000000002788472
-rw-r--r-- 1 root root 4332 Nov 22 10:36 edits_0000000000002788473-0000000000002788495
-rw-r--r-- 1 root root 1761 Nov 22 10:36 edits_0000000000002788496-0000000000002788509
-rw-r--r-- 1 root root 4333 Nov 22 10:36 edits_0000000000002788510-0000000000002788532
-rw-r--r-- 1 root root 1586 Nov 22 10:36 edits_0000000000002788533-0000000000002788546
-rw-r--r-- 1 root root 1589 Nov 22 10:36 edits_0000000000002788547-0000000000002788560
-rw-r--r-- 1 root root 816 Nov 22 10:36 edits_0000000000002788561-0000000000002788568
-rw-r--r-- 1 root root 563666 Nov 22 10:36 fsimage_0000000000005559643
-rw-r--r-- 1 root root 62 Nov 22 10:36 fsimage_0000000000005559643.md5
-rw-r--r-- 1 root root 563666 Nov 22 10:36 fsimage_0000000000005560209
-rw-r--r-- 1 root root 62 Nov 22 10:36 fsimage_0000000000005560209.md5
-rw-r--r-- 1 root root 8 Nov 22 10:36 seen_txid

And hdfs-site.xml:

cat /etc/hadoop/conf.cloudera.hdfs/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>dfs.nameservices</name>
<value>hdfs-cdp7</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hdfs-cdp7</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.hdfs-cdp7</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>ip-10-2-0-115.ec2.internal:2181,ip-10-2-0-171.ec2.internal:2181,ip-10-2-0-177.ec2.internal:2181</value>
</property>
<property>
<name>dfs.ha.namenodes.hdfs-cdp7</name>
<value>namenode1546334668,namenode1546333642</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfs-cdp7.namenode1546334668</name>
<value>ip-10-2-0-224.ec2.internal:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hdfs-cdp7.namenode1546334668</name>
<value>ip-10-2-0-224.ec2.internal:8022</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cdp7.namenode1546334668</name>
<value>ip-10-2-0-224.ec2.internal:9870</value>
</property>
<property>
<name>dfs.namenode.https-address.hdfs-cdp7.namenode1546334668</name>
<value>ip-10-2-0-224.ec2.internal:9871</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hdfs-cdp7.namenode1546333642</name>
<value>ip-10-2-0-49.ec2.internal:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hdfs-cdp7.namenode1546333642</name>
<value>ip-10-2-0-49.ec2.internal:8022</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cdp7.namenode1546333642</name>
<value>ip-10-2-0-49.ec2.internal:9870</value>
</property>
<property>
<name>dfs.namenode.https-address.hdfs-cdp7.namenode1546333642</name>
<value>ip-10-2-0-49.ec2.internal:9871</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>fs.permissions.umask-mode</name>
<value>022</value>
</property>
<property>
<name>dfs.client.block.write.locateFollowingBlock.retries</name>
<value>7</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.algorithm</name>
<value>3des</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.suites</name>
<value>AES/CTR/NoPadding</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
<value>256</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.streams.cache.size</name>
<value>4096</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hdfs-sockets/dn</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.skip.checksum</name>
<value>false</value>
</property>
<property>
<name>dfs.client.domain.socket.data.traffic</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/_HOST@CDP7.COM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/_HOST@CDP7.COM</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/_HOST@CDP7.COM</value>
</property>
</configuration>

And core-site.xml:

cat /etc/hadoop/conf.cloudera.hdfs/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdfs-cdp7</value>
</property>
<property>
<name>ipc.client.connection.maxidletime</name>
<value>30000</value>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>50</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.rpc.protection</name>
<value>authentication</value>
</property>
<property>
<name>hadoop.ssl.require.client.cert</name>
<value>false</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.keystores.factory.class</name>
<value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.server.conf</name>
<value>ssl-server.xml</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.client.conf</name>
<value>ssl-client.xml</value>
<final>true</final>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>RULE:[2:$1@$0](rangeradmin@CDP7.COM)s/(.*)@CDP7.COM/ranger/
RULE:[2:$1@$0](rangertagsync@CDP7.COM)s/(.*)@CDP7.COM/rangertagsync/
RULE:[2:$1@$0](rangerusersync@CDP7.COM)s/(.*)@CDP7.COM/rangerusersync/
RULE:[1:$1@$0](HTTP.*@CDP7.COM$)s/@CDP7.COM$//
RULE:[1:$1@$0](.*@CDP7.COM$)s/@CDP7.COM$///L
RULE:[2:$1@$0](HTTP.*@CDP7.COM$)s/@CDP7.COM$//
RULE:[2:$1@$0](.*@CDP7.COM$)s/@CDP7.COM$///L
DEFAULT</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.HTTP.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.HTTP.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.knox.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.knox.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.livy.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.livy.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.impala.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.impala.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfs.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.yarn.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.yarn.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.phoenix.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.phoenix.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.security.group.mapping</name>
<value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
<property>
<name>hadoop.security.instrumentation.requires.admin</name>
<value>false</value>
</property>
</configuration>

 Thanks!

Rising Star

Hi @George-Megre ,

It looks like /data/disk1/dfs/nn/ has the correct ownership (hdfs:hadoop), but /data/disk1/dfs/nn/current and everything under it is owned by root:root. That might be the cause as well. Please run "chown -R hdfs:hdfs /data/disk1/dfs/nn/current" and then try to start the NameNode process.
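In the listing above, nn/current is shown as "drwx------ root root", so a NameNode process running as the hdfs user would presumably be unable to read it, which is consistent with the "NameNode is not formatted" error. Before and after the chown, the ownership can be audited with a small Python sketch like the one below. The function names are hypothetical, and the script must run with enough privileges to traverse the directory, since unreadable subdirectories are silently skipped by os.walk.

```python
import os
import pwd


def owner_name(uid):
    """Map a numeric uid to a user name, falling back to the number as a string."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def paths_not_owned_by(root, expected_user):
    """Return every path under root (root included) whose owner is not expected_user."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            # lstat so that symlinks are reported by their own owner, not their target's.
            if owner_name(os.lstat(path).st_uid) != expected_user:
                mismatched.append(path)
    return mismatched
```

For example, paths_not_owned_by("/data/disk1/dfs/nn", "hdfs") should return an empty list once the ownership has been corrected.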

Contributor

Hey @Majeti ,

After executing the 'chown' command on the current directory, the previous error disappeared; however, a new error has emerged:

Encountered exception loading fsimage
java.io.IOException: There appears to be a gap in the edit log.  We expected txid 5560210, but got txid 5687613.
	at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:95)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:268)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:182)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:912)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:760)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1166)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:757)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:642)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:713)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:956)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:931)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1666)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1733)

 Thanks!
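The expected txid in this error is the latest local fsimage txid plus one (fsimage_0000000000005560209 in the NameNode listing above), because on startup the NameNode loads the newest fsimage and then replays edits from the next transaction onward. The local edits_* files in that listing end at txid 2788568, well below the fsimage. A minimal Python sketch of that continuity check over a list of file names follows; the function names are hypothetical, and it only looks at finalized segments named edits_START-END.

```python
import re

EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def parse_listing(names):
    """Split file names into sorted (first, last) edit segments and sorted fsimage txids."""
    segments = sorted(
        (int(m.group(1)), int(m.group(2))) for m in map(EDITS_RE.match, names) if m
    )
    images = sorted(int(m.group(1)) for m in map(FSIMAGE_RE.match, names) if m)
    return segments, images


def first_missing_txid(segments, start_txid):
    """Follow contiguous segments forward from start_txid; return the first txid not covered."""
    next_txid = start_txid
    for first, last in segments:
        if last < next_txid:
            continue  # segment lies entirely before the txid we need
        if first > next_txid:
            break  # gap: nothing covers next_txid
        next_txid = last + 1
    return next_txid


# A few names copied from the NameNode listing earlier in this thread.
nn_names = [
    "VERSION",
    "seen_txid",
    "edits_0000000000002788547-0000000000002788560",
    "edits_0000000000002788561-0000000000002788568",
    "fsimage_0000000000005559643",
    "fsimage_0000000000005559643.md5",
    "fsimage_0000000000005560209",
    "fsimage_0000000000005560209.md5",
]

segments, images = parse_listing(nn_names)
expected = images[-1] + 1
print(expected, first_missing_txid(segments, expected))
```

If first_missing_txid returns the same value as the expected txid, no finalized segment in that directory continues from the fsimage, so the missing edits would have to come from another source such as the JournalNodes.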

Contributor

Hello @Majeti ,

We urgently need your assistance. Our environment is currently down.

Could you please respond at your earliest convenience? Thank you! 🙏

Rising Star

Sorry @George-Megre, I was on vacation. I am back now. If you can provide the listings of both the NN data directory and the JN directories, I can review the edit logs and suggest next steps quickly.

Contributor

Hey @Majeti ,

Thanks for your comment. Regarding the listing of /data/disk1/dfs/nn/, you can find it above in my earlier comment.

Here is the listing of /data/disk1/dfs/jn:

/data/disk1/dfs$ sudo ls jn/hdfs-cdp7/current
-rw-r--r-- 1 hdfs hdfs 7793 Jun 12 2022 edits_0000000000005559461-0000000000005559499
-rw-r--r-- 1 hdfs hdfs 1583 Jun 12 2022 edits_0000000000005559500-0000000000005559513
-rw-r--r-- 1 hdfs hdfs 2067 Jun 12 2022 edits_0000000000005559514-0000000000005559530
-rw-r--r-- 1 hdfs hdfs 4332 Jun 12 2022 edits_0000000000005559531-0000000000005559553
-rw-r--r-- 1 hdfs hdfs 4664 Jun 12 2022 edits_0000000000005559554-0000000000005559583
-rw-r--r-- 1 hdfs hdfs 1776 Jun 12 2022 edits_0000000000005559584-0000000000005559599
-rw-r--r-- 1 hdfs hdfs 4941 Jun 12 2022 edits_0000000000005559600-0000000000005559627
-rw-r--r-- 1 hdfs hdfs 1919 Jun 12 2022 edits_0000000000005559628-0000000000005559643
-rw-r--r-- 1 hdfs hdfs 4329 Jun 12 2022 edits_0000000000005559644-0000000000005559666
-rw-r--r-- 1 hdfs hdfs 1589 Jun 12 2022 edits_0000000000005559667-0000000000005559680
-rw-r--r-- 1 hdfs hdfs 815 Jun 12 2022 edits_0000000000005559681-0000000000005559688
-rw-r--r-- 1 hdfs hdfs 5106 Jun 12 2022 edits_0000000000005559689-0000000000005559717
-rw-r--r-- 1 hdfs hdfs 816 Jun 12 2022 edits_0000000000005559718-0000000000005559725
-rw-r--r-- 1 hdfs hdfs 4331 Jun 12 2022 edits_0000000000005559726-0000000000005559748
-rw-r--r-- 1 hdfs hdfs 1590 Jun 12 2022 edits_0000000000005559749-0000000000005559762
-rw-r--r-- 1 hdfs hdfs 1588 Jun 12 2022 edits_0000000000005559763-0000000000005559776
-rw-r--r-- 1 hdfs hdfs 4332 Jun 12 2022 edits_0000000000005559777-0000000000005559799
-rw-r--r-- 1 hdfs hdfs 1585 Jun 12 2022 edits_0000000000005559800-0000000000005559813
-rw-r--r-- 1 hdfs hdfs 4331 Jun 12 2022 edits_0000000000005559814-0000000000005559836
-rw-r--r-- 1 hdfs hdfs 1582 Jun 12 2022 edits_0000000000005559837-0000000000005559850
-rw-r--r-- 1 hdfs hdfs 1589 Jun 12 2022 edits_0000000000005559851-0000000000005559864
-rw-r--r-- 1 hdfs hdfs 4332 Jun 12 2022 edits_0000000000005559865-0000000000005559887
-rw-r--r-- 1 hdfs hdfs 1589 Jun 12 2022 edits_0000000000005559888-0000000000005559901
-rw-r--r-- 1 hdfs hdfs 4333 Jun 12 2022 edits_0000000000005559902-0000000000005559924
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005559925-0000000000005559938
-rw-r--r-- 1 hdfs hdfs 1588 Jun 12 2022 edits_0000000000005559939-0000000000005559952
-rw-r--r-- 1 hdfs hdfs 4332 Jun 12 2022 edits_0000000000005559953-0000000000005559975
-rw-r--r-- 1 hdfs hdfs 1584 Jun 12 2022 edits_0000000000005559976-0000000000005559989
-rw-r--r-- 1 hdfs hdfs 4187 Jun 12 2022 edits_0000000000005559990-0000000000005560012
-rw-r--r-- 1 hdfs hdfs 1589 Jun 12 2022 edits_0000000000005560013-0000000000005560026
-rw-r--r-- 1 hdfs hdfs 5052 Jun 12 2022 edits_0000000000005560027-0000000000005560056
-rw-r--r-- 1 hdfs hdfs 4811 Jun 12 2022 edits_0000000000005560057-0000000000005560082
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005560083-0000000000005560096
-rw-r--r-- 1 hdfs hdfs 7407 Jun 12 2022 edits_0000000000005560097-0000000000005560135
-rw-r--r-- 1 hdfs hdfs 1585 Jun 12 2022 edits_0000000000005560136-0000000000005560149
-rw-r--r-- 1 hdfs hdfs 1776 Jun 12 2022 edits_0000000000005560150-0000000000005560165
-rw-r--r-- 1 hdfs hdfs 5272 Jun 12 2022 edits_0000000000005560166-0000000000005560195
-rw-r--r-- 1 hdfs hdfs 1586 Jun 12 2022 edits_0000000000005560196-0000000000005560209
-rw-r--r-- 1 hdfs hdfs 4331 Jun 12 2022 edits_0000000000005560210-0000000000005560232
-rw-r--r-- 1 hdfs hdfs 1588 Jun 12 2022 edits_0000000000005560233-0000000000005560246
-rw-r--r-- 1 hdfs hdfs 1590 Jun 12 2022 edits_0000000000005560247-0000000000005560260
-rw-r--r-- 1 hdfs hdfs 4330 Jun 12 2022 edits_0000000000005560261-0000000000005560283
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005560284-0000000000005560297
-rw-r--r-- 1 hdfs hdfs 4334 Jun 12 2022 edits_0000000000005560298-0000000000005560320
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005560321-0000000000005560334
-rw-r--r-- 1 hdfs hdfs 1589 Jun 12 2022 edits_0000000000005560335-0000000000005560348
-rw-r--r-- 1 hdfs hdfs 4332 Jun 12 2022 edits_0000000000005560349-0000000000005560371
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005560372-0000000000005560385
-rw-r--r-- 1 hdfs hdfs 4330 Jun 12 2022 edits_0000000000005560386-0000000000005560408
-rw-r--r-- 1 hdfs hdfs 1588 Jun 12 2022 edits_0000000000005560409-0000000000005560422
-rw-r--r-- 1 hdfs hdfs 814 Jun 12 2022 edits_0000000000005560423-0000000000005560430
-rw-r--r-- 1 hdfs hdfs 4328 Jun 12 2022 edits_0000000000005560431-0000000000005560453
-rw-r--r-- 1 hdfs hdfs 1587 Jun 12 2022 edits_0000000000005560454-0000000000005560467
-rw-r--r-- 1 hdfs hdfs 4329 Jun 12 2022 edits_0000000000005560468-0000000000005560490
-rw-r--r-- 1 hdfs hdfs 1048576 Jul 5 2022 edits_0000000000005880343-0000000000005880373
-rw-r--r-- 1 hdfs hdfs 1048576 Jul 5 2022 edits_0000000000005880374-0000000000005880666
-rw-r--r-- 1 hdfs hdfs 1048576 Jul 5 2022 edits_0000000000005880667-0000000000005880970
-rw-r--r-- 1 hdfs hdfs 1048576 Jul 5 2022 edits_0000000000005880971-0000000000005881003
-rw-r--r-- 1 hdfs hdfs 77594624 Aug 11 2022 edits_0000000000005881004-0000000000006333331
-rw-r--r-- 1 hdfs hdfs 6291456 Aug 14 2022 edits_0000000000006333332-0000000000006368408
-rw-r--r-- 1 hdfs hdfs 99919013 Sep 8 2022 edits_0000000000006368409-0000000000006868412
-rw-r--r-- 1 hdfs hdfs 76350515 Oct 22 2022 edits_0000000000006868413-0000000000007368417
-rw-r--r-- 1 hdfs hdfs 67517391 Nov 2 2022 edits_0000000000007368418-0000000000007868427
-rw-r--r-- 1 hdfs hdfs 65267739 Dec 15 2022 edits_0000000000007868428-0000000000008369431
-rw-r--r-- 1 hdfs hdfs 64536361 Feb 7 2023 edits_0000000000008369432-0000000000008869452
-rw-r--r-- 1 hdfs hdfs 58720256 Apr 2 2023 edits_0000000000008869453-0000000000009321288
-rw-r--r-- 1 hdfs hdfs 64295590 Jul 7 21:15 edits_0000000000009321289-0000000000009821291
-rw-r--r-- 1 hdfs hdfs 4194304 Jul 11 15:17 edits_0000000000009821292-0000000000009852912
-rw-r--r-- 1 hdfs hdfs 70512684 Aug 29 03:24 edits_0000000000009852913-0000000000010352938
-rw-r--r-- 1 hdfs hdfs 66622409 Oct 23 07:09 edits_0000000000010352939-0000000000010852967
-rw-r--r-- 1 hdfs hdfs 28311552 Nov 14 15:42 edits_0000000000010852968-0000000000011059775
-rw-r--r-- 1 hdfs hdfs 14680064 Nov 22 11:53 edits_0000000000011059776-0000000000011140505
-rw-r--r-- 1 hdfs hdfs 1048576 Nov 22 11:59 edits_0000000000011140506-0000000000011140559
-rw-r--r-- 1 hdfs hdfs 1048576 Nov 22 12:01 edits_0000000000011140560-0000000000011140589
-rw-r--r-- 1 hdfs hdfs 3145728 Nov 23 16:17 edits_0000000000011140590-0000000000011153933
-rw-r--r-- 1 hdfs hdfs 11534336 Nov 29 09:32 edits_0000000000011153934-0000000000011218580
-rw-r--r-- 1 hdfs hdfs 1048576 Nov 29 10:21 edits_0000000000011218581-0000000000011219009
-rw-r--r-- 1 hdfs hdfs 1048576 Nov 29 12:00 edits_0000000000011219010-0000000000011219781
-rw-r--r-- 1 hdfs hdfs 23068672 Dec 11 09:13 edits_0000000000011219782-0000000000011353482
-rw-r--r-- 1 hdfs hdfs 1048576 Dec 11 09:17 edits_0000000000011353483-0000000000011353524
-rw-r--r-- 1 hdfs hdfs 1048576 Jun 12 2022 edits_inprogress_0000000000005560491
-rw-r--r-- 1 hdfs hdfs 17825792 Dec 20 12:21 edits_inprogress_0000000000011353525
-rw-r--r-- 1 hdfs hdfs 3 Dec 11 09:19 last-promised-epoch
-rw-r--r-- 1 hdfs hdfs 3 Dec 11 09:19 last-writer-epoch
drwxr-xr-x 2 hdfs hdfs 4096 Dec 11 09:19 paxos

Rising Star

Hi @George-Megre, it looks like there are a lot of mismatches. I can see that the edit file below is being updated by the other NameNode or by some other NameNode:

-rw-r--r-- 1 hdfs hdfs 17825792 Dec 20 12:21 edits_inprogress_0000000000011353525

Could you please verify that no other NN is using these JNs? I would also ask you to attach both NameNodes' data directories along with the data directories of all 3 JNs.
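The JournalNode listing above contains two edits_inprogress_* files, one dated Jun 12 2022 and one dated Dec 20. A journal typically has a single segment open at a time, so an older in-progress file is usually a leftover from an earlier crash or failover. A short Python sketch to flag such files from a list of names is given below; the function names are hypothetical, and it assumes the segment with the highest start txid is the live one.

```python
import re

INPROGRESS_RE = re.compile(r"^edits_inprogress_(\d+)$")


def inprogress_start_txids(names):
    """Return the start txids of all in-progress segments, in ascending order."""
    return sorted(int(m.group(1)) for m in map(INPROGRESS_RE.match, names) if m)


def stale_inprogress(names):
    """Return every in-progress start txid except the highest (assumed to be the live one)."""
    return inprogress_start_txids(names)[:-1]


# Names copied from the JournalNode listing earlier in this thread.
jn_names = [
    "edits_0000000000011353483-0000000000011353524",
    "edits_inprogress_0000000000005560491",
    "edits_inprogress_0000000000011353525",
    "last-promised-epoch",
]

print(inprogress_start_txids(jn_names), stale_inprogress(jn_names))
```

This only inspects file names. Whether the newer in-progress segment is actually being written by another NameNode still has to be confirmed on the hosts themselves.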

Contributor

Hey @Majeti ,

We have only 1 JN directory:

/data/disk1/dfs$ ls
jn nn

whose contents were sent above.