Support Questions


How to re-create the fsimage_xxxxxxx files on the namenode


We have two namenode machines (part of an HDP cluster managed by Ambari).

After a power failure, we noticed the following:

on one namenode the fsimage_xxxx files are missing, while on the second namenode they exist.

Is it possible to re-create them on the faulty namenode?


Example on the bad node:

ls /hadoop/hdfs/namenode/current | grep fsimage_

(no output)


On the good namenode:

ls /hadoop/hdfs/namenode/current | grep fsimage_
fsimage_0000000000044556627
fsimage_0000000000044556627.md5
fsimage_0000000000044577059
fsimage_0000000000044577059.md5


The current status is that the namenode service does not start up successfully from Ambari,

and the log on the faulty namenode shows the following:

 

ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.FileNotFoundException: No valid image files found

Michael-Bronson
10 REPLIES

Master Mentor

@mike_bronson7 

Are all the namenodes non-functional? If you have one healthy namenode, then there is a way out.

 

Please revert.


Hi,

the second namenode has the fsimage files,

but in Ambari the second namenode does not appear as standby/active; it is just up.

Michael-Bronson

Master Mentor

@mike_bronson7 

 

Yeah, but once you bootstrap it, the ZooKeeper election will kick in and one of them will become the active namenode.
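For example, once the namenodes are up you can check which one ZooKeeper elected with something like this (nn1 and nn2 stand for your namenode IDs from dfs.ha.namenodes.<nameservice> in hdfs-site.xml, so substitute your own):

$ hdfs haadmin -getServiceState nn1
active [example output]
$ hdfs haadmin -getServiceState nn2
standby [example output]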

 

It's late here and I need to document the process. I uploaded it once to HCC, but I need to redact some information, which I can do tomorrow. Meanwhile, can you back up the following directory on both the dead and the working namenode: zip/tar all of its content and copy it to some safe location.

 

/hadoop/hdfs/journal/<cluster_name>/current
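For example, something along these lines on each node (the file name under /var/tmp is just a suggestion):

# tar -czf /var/tmp/journal_backup_$(hostname).tar.gz /hadoop/hdfs/journal/<cluster_name>/current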

 

Please revert


Sure, we can back up both folders in a tar file,

but what is your suggestion for the node without the fsimage files?

Michael-Bronson

Master Mentor

@mike_bronson7 

Guess what the copies on the working node are for! Remember, the files in both nodes' directories are identical in an HA setup 🙂

Cheers


You mean the files /hadoop/hdfs/namenode/current/fsimage_* should be exactly the same on both nodes?

 

In that case it is easy to copy them from the good namenode to the bad namenode and restart the bad namenode.

Is this the procedure? Please let me know.
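Something like this on the bad namenode, I assume (namenode2 stands for the good node's hostname, and we would stop the namenode service on the bad node before copying):

# scp 'namenode2:/hadoop/hdfs/namenode/current/fsimage_*' /hadoop/hdfs/namenode/current/
# chown hdfs:hadoop /hadoop/hdfs/namenode/current/fsimage_*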

Michael-Bronson

Master Mentor

@mike_bronson7 

 

In an HA cluster, the standby and active namenodes have shared edits storage managed by the journal node service. HA relies on a failover mechanism to swap a namenode from standby to active, and like other Hadoop components it uses ZooKeeper for this.

So, first thing: all 3 ZooKeepers MUST be online to avoid a split-brain decision. Below are the steps to follow.


On the Active Namenode

Run the cat command against last-promised-epoch in the same directory as edits_inprogress_000....
# cat last-promised-epoch
31 [example output]


On the standby Namenode

# cat last-promised-epoch
23 [example output]

From the above, you can see that the standby was lagging when the power went off. In your case, you should overwrite the lagging copy on the standby after backing it up, as you already did, assuming the namenode has not been put back online since; if it has, take a fresh backup before you proceed.

 

SOLUTION

Fix the corrupted JN's edits.

Instructions to fix that one journal node:
1) Put both NNs in safemode (NN HA)

$ hdfs dfsadmin -safemode enter

-------output------
Safe mode is ON in Namenode1:8020
Safe mode is ON in Namenode2:8020

2) Save Namespace

$ hdfs dfsadmin -saveNamespace

-------output------
Save namespace successful for Namenode1:8020
Save namespace successful for Namenode2:8020

3) Zip/tar the journal dir from a working JN node and copy it to the failed JN node, into the same path as on the working node, and make sure the file permissions are correct; see the sketch after the path below.

/hadoop/hdfs/journal/<cluster_name>/current
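For example, something along these lines (failed-jn-host is a placeholder for the failed node; adjust the cluster name and re-check ownership afterwards):

On the working JN node:
# tar -czf /var/tmp/journal_current.tar.gz -C /hadoop/hdfs/journal/<cluster_name> current
# scp /var/tmp/journal_current.tar.gz failed-jn-host:/var/tmp/

On the failed JN node:
# mv /hadoop/hdfs/journal/<cluster_name>/current /hadoop/hdfs/journal/<cluster_name>/current.bak
# tar -xzf /var/tmp/journal_current.tar.gz -C /hadoop/hdfs/journal/<cluster_name>
# chown -R hdfs:hadoop /hadoop/hdfs/journal/<cluster_name>/current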

4) Restart HDFS

In your case you can start only one namenode first; it will automatically be designated as the active namenode. Once it is up and running, NameNode failover should occur transparently and the alerts should gradually disappear.

 

Stop and restart the journal nodes

This will trigger the syncing of the journal nodes. If you wait a while, you should see your namenodes up and running, all "green".

# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh stop journalnode"

# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"

 

Start the standby namenode
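If you prefer the command line over Ambari, here is a sketch of the start command, assuming the same HDP layout as the journal node commands above:

# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"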

 

After a while, things should be in order

 

Please let me know


 

Hi,

let's say on the faulty namenode we see only the following under /hadoop/hdfs/journal/hdfsha/current, for example:


-rw-r--r-- 1 hdfs hadoop 155 Nov 13 00:50 VERSION
-rw-r--r-- 1 hdfs hadoop 3 Nov 14 21:24 last-promised-epoch
drwxr-xr-x 2 hdfs hadoop 4096 Nov 14 21:24 paxos
-rw-r--r-- 1 hdfs hadoop 3 Nov 14 21:24 last-writer-epoch
-rw-r--r-- 1 hdfs hadoop 8 Nov 14 23:16 committed-txid

 

As you know, usually we also have files like

edits_0000000000000066195-0000000000000066242

Does this scenario change the picture?

 

Michael-Bronson

Master Mentor

@mike_bronson7 

Sorry, I just saw your response while checking my backlog; you didn't tag me with @, which explains why I wasn't notified.

That's weird; in fact, the files you backed up should be copied to this faulty node, which is the source of the problems. Some thing/process MUST have deleted those files.

 

Then follow the steps I laid out.