Created on 11-13-2019 05:16 AM - last edited on 11-13-2019 05:54 AM by cjervis
we have two NameNode machines (part of an HDP cluster managed by Ambari)
after a power failure, we noticed the following:
on one NameNode the fsimage_xxxx files are missing
while on the second NameNode they still exist
is it possible to re-create them on the faulty NameNode?
example from the bad NameNode:
ls /hadoop/hdfs/namenode/current | grep fsimage_
no output
on the good NameNode:
ls /hadoop/hdfs/namenode/current | grep fsimage_
fsimage_0000000000044556627
fsimage_0000000000044556627.md5
fsimage_0000000000044577059
fsimage_0000000000044577059.md5
the current status is that the NameNode service does not start up successfully from Ambari
and the logs from the faulty NameNode show:
ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.FileNotFoundException: No valid image files found
Created 11-13-2019 12:08 PM
Are all the NameNodes non-functional? If you have one healthy NameNode then there is a way out.
Please revert.
Created 11-13-2019 01:06 PM
hi
the second NameNode has the fsimage files,
but in Ambari the second NameNode does not appear as standby/active, it just shows as up
Created 11-13-2019 01:23 PM
Yeah, but once you bootstrap, the ZooKeeper election will kick in and one will become the active NameNode.
It's late here and I need to document the process; though I have uploaded it once to HCC, I need to redact some information, so I could do that tomorrow. Meanwhile, on both the dead and the working NameNode, can you back up the following directory, zip all of its content, and copy it to some safe location:
/hadoop/hdfs/journal/<Cluster_name>/current
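For example, a minimal backup sketch to run on both nodes (the /var/tmp target path and archive name are just placeholders, and <Cluster_name> must be replaced with your actual nameservice directory):
# mkdir -p /var/tmp/jn_backup
# tar czf /var/tmp/jn_backup/journal_current_$(hostname -s)_$(date +%F).tar.gz -C /hadoop/hdfs/journal/<Cluster_name> current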
Please revert
Created 11-13-2019 01:47 PM
sure, we can back up both folders in a tar file
but what is your suggestion for the node without the fsimage files?
Created 11-13-2019 01:59 PM
Guess what, the copies from the working node should do! Remember, the files in both nodes' directories are identical in an HA setup 🙂
Cheers
Created on 11-13-2019 02:03 PM - edited 11-13-2019 02:04 PM
you mean the files /hadoop/hdfs/namenode/current/fsimage_* should be exactly the same on both nodes?
in that case it is easy to copy them from the good NameNode to the bad NameNode and restart the bad NameNode
let me know if this is the procedure?
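for example, something like the following, where nn-bad is just a placeholder hostname for the faulty NameNode:
on the good NameNode, verify the image checksum and then copy both files across:
# md5sum /hadoop/hdfs/namenode/current/fsimage_0000000000044577059
# cat /hadoop/hdfs/namenode/current/fsimage_0000000000044577059.md5
# scp /hadoop/hdfs/namenode/current/fsimage_0000000000044577059* nn-bad:/hadoop/hdfs/namenode/current/
then on the bad NameNode fix the ownership:
# chown hdfs:hadoop /hadoop/hdfs/namenode/current/fsimage_0000000000044577059*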
Created 11-14-2019 10:09 AM
In an HA cluster, the Standby and Active NameNodes share edit-log storage managed by the JournalNode service. HA relies on a failover mechanism to swap from Standby to Active NameNode and, like other HA components in Hadoop, it uses ZooKeeper.
So, first thing: all 3 ZooKeepers MUST be online to avoid a split-brain decision. Below are the steps to follow.
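Before starting, a quick way to confirm the ZooKeeper quorum is healthy is to run the following on each ZooKeeper host (the path assumes the standard HDP layout; adjust to your installation). One node should report Mode: leader and the other two Mode: follower.
# /usr/hdp/current/zookeeper-server/bin/zkServer.sh status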
On the Active Namenode
Run the cat command against last-promised-epoch in the same directory as edits_inprogress_000....
# cat last-promised-epoch
31 [example output]
On the standby Namenode
# cat last-promised-epoch
23 [example output]
From the above, you will see that the standby was lagging when the power went off. In your case, you should overwrite the lagging copy on the standby after backing it up as you already did, assuming the NameNode has not been put back online since; if it has, take a fresh backup before you proceed.
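If you want to compare the epochs from one place, a small sketch (nn1 and nn2 are placeholder hostnames for the nodes hosting the journal directories, and <cluster_name> is your nameservice):
# for h in nn1 nn2; do echo -n "$h: "; ssh $h cat /hadoop/hdfs/journal/<cluster_name>/current/last-promised-epoch; done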
SOLUTION
Fix the corrupted JN's edits
Instructions to fix that one journal node.
1) Put both NNs in safemode (NN HA)
$ hdfs dfsadmin -safemode enter
-------output------
Safe mode is ON in Namenode1:8020
Safe mode is ON in Namenode2:8020
2) Save Namespace
$ hdfs dfsadmin -saveNamespace
-------output------
Save namespace successful for Namenode1:8020
Save namespace successful for Namenode2:8020
3) Zip/tar the journal dir from the working JN node and copy it to the failed JN node, into the same path as on the active node; make sure the file permissions are correct
/hadoop/hdfs/journal/<cluster_name>/current
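A minimal sketch of that copy, assuming jn-bad is a placeholder hostname for the failed JN node and <cluster_name> is your nameservice:
on the working JN node:
# tar czf /var/tmp/jn_current.tar.gz -C /hadoop/hdfs/journal/<cluster_name> current
# scp /var/tmp/jn_current.tar.gz jn-bad:/var/tmp/
on the failed JN node (move the old current dir aside first):
# mv /hadoop/hdfs/journal/<cluster_name>/current /hadoop/hdfs/journal/<cluster_name>/current.bad
# tar xzf /var/tmp/jn_current.tar.gz -C /hadoop/hdfs/journal/<cluster_name>
# chown -R hdfs:hadoop /hadoop/hdfs/journal/<cluster_name>/current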
4) Restart HDFS
In your case, you can start only one NameNode first; it will automatically be designated as the active NameNode. Once it is up and running, that is fine: the NameNode failover should now occur transparently and the alerts should gradually disappear
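If you prefer to do this step from the command line instead of Ambari, a sketch following the same daemon-script pattern as the JournalNode commands below (the path assumes the standard HDP layout):
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"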
Stop and restart the journal nodes
This will trigger the syncing of the JournalNodes. If you wait a while, you should see your NameNodes up and running, all "green".
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh stop journalnode"
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"
Start the standby NameNode
After a while, things should be in order
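To verify which NameNode ended up active and which is standby, you can check the HA state (nn1 and nn2 here stand for the NameNode serviceIds defined in hdfs-site.xml, so they may be named differently in your cluster):
$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2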
Please let me know
Created on 11-14-2019 01:01 PM - edited 11-14-2019 01:16 PM
hi
let's say on the faulty NameNode we see only the following:
under - /hadoop/hdfs/journal/hdfsha/current
example:
-rw-r--r-- 1 hdfs hadoop 155 Nov 13 00:50 VERSION
-rw-r--r-- 1 hdfs hadoop 3 Nov 14 21:24 last-promised-epoch
drwxr-xr-x 2 hdfs hadoop 4096 Nov 14 21:24 paxos
-rw-r--r-- 1 hdfs hadoop 3 Nov 14 21:24 last-writer-epoch
-rw-r--r-- 1 hdfs hadoop 8 Nov 14 23:16 committed-txid
as you know, usually we also have files like -
edits_0000000000000066195-0000000000000066242
does this scenario change the picture?
Created 11-27-2019 05:32 AM
Sorry, I just saw your response while checking my backlog; you didn't tag me with @, which explains why I wasn't notified.
That's weird; in fact, the files you backed up should be copied to this faulty node, which is the source of the problems. Some thing/process MUST have deleted those files.
Then follow the steps I laid out.