Dear all, I recently enabled HA on my NameNode.
Since then I have been seeing an issue with the checkpoint process: no checkpoint has occurred for the past 5 hours.
Here are my observations. Have you seen this case before, or am I hitting a bug?
Kindly share your advice on how to resolve this issue.
As I understand the checkpoint process,
when the updated fsimage is downloaded to the NameNode from the standby NameNode,
"fsimage.ckpt_txid" should be renamed to "fsimage_txid", but that is not happening in my case.
I do not see any file named "fsimage_txid" on my NameNode; all of them look like "fsimage.ckpt_txid".
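To make the observation above easier to reproduce, here is a minimal sketch that reports which transaction IDs have only a "fsimage.ckpt_" file and no finalized "fsimage_" file. The two name patterns are assumptions taken from the file names shown in this post, and the function names are hypothetical helpers, not part of Hadoop.

```python
import re

# Name patterns assumed from the file names shown in this post.
CKPT_RE = re.compile(r"^fsimage\.ckpt_(\d+)$")   # image still under its temporary name
FINAL_RE = re.compile(r"^fsimage_(\d+)$")        # image renamed to its final name


def classify_fsimage_names(names):
    """Split file names into (pending, finalized) dicts keyed by integer txid."""
    pending, finalized = {}, {}
    for name in names:
        match = CKPT_RE.match(name)
        if match:
            pending[int(match.group(1))] = name
            continue
        match = FINAL_RE.match(name)
        if match:
            finalized[int(match.group(1))] = name
    return pending, finalized


def unrenamed_txids(names):
    """Return sorted txids that have a .ckpt file but no finalized file."""
    pending, finalized = classify_fsimage_names(names)
    return sorted(txid for txid in pending if txid not in finalized)


# Example using the single file name from the NameNode listing in this post.
print(unrenamed_txids(["fsimage.ckpt_0000000000604392126"]))  # prints [604392126]
```

To check a real directory, pass the names from os.listdir() on the "current" directory of the NameNode storage path.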
So I compared "fsimage.ckpt_txid" and "fsimage_txid", and both have the same checksum value.
fsimage.ckpt_txid is from the NameNode.
fsimage_txid is from the SecondaryNameNode.
namenode:
=========
root@namenode:/mnt/sdb/name/current# cksum fsimage.ckpt_0000000000604392126
3708522794 2148716968 fsimage.ckpt_0000000000604392126
secondary-namenode:
================
root@secondary-namenode:/mnt/sdd/name/current# cksum fsimage_0000000000604392126
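If both files are copied to one host, the same comparison can be sketched in Python. This is only an illustration: it uses SHA-256 from the standard library rather than the CRC that "cksum" prints, so the values will not match the numbers above, and the function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte fsimage is never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, same_content("fsimage.ckpt_0000000000604392126", "fsimage_0000000000604392126") would be expected to return True if the two images are byte-for-byte identical.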
NOTE: I did not see any network issue; I am able to download the fsimage using the "wget" command.
I am using CDH 4.1.3 and Cloudera Enterprise 4.6.3.
Best Regards,
BOMmuraj