Created 03-05-2016 06:14 PM
I am trying to determine and plan the best disk layout for my active/standby NNs for a new production rollout that will run with QJM NN HA. I plan to have three servers, each running an instance of ZK and JN. The plan, per recommendations on another question I asked on this forum, is to give the JNs a dedicated RAID-1 array on each server for edit logs. I expect that array to use 256-512 GB disks. Each server will also have a dedicated disk for the OS, logs, tmp, etc., also RAID-1, using two 1.2 TB drives. Each ZK instance will also have dedicated disks (spindles), per recommendations here.
I am having a hard time answering this question...
Where should I store the fsimage files? Could I, for example, store them on the same RAID-1 array that the JNs are using? I do plan to colocate two of the three JNs on the same two servers running the NNs, with the third JN on a third server, so the NN would have access to the same drives used by the JNs. This colocation seems to be a commonly recommended arrangement. Or should the fsimage files be pointed to a separate RAID-1 array just for that purpose? Another option would be to point the fsimage files to a separately sized partition on the OS RAID-1 array.
These questions do NOT come from a sizing perspective, but from a workload perspective. Fitting the files somewhere is easy to figure out. The real question is about the performance impact of mixing, for example, the background checkpointing operations done by the standby namenode with the work being done by the JNs to persist edits, and putting all of that onto the same spindle. I see clearly that ZK should be kept on its own spindle because it uses a write-ahead log, where disk latency directly impacts ZK performance. I just don't have a good feel for mixing the two workloads of checkpointing and edit log updates. Can someone please make a recommendation here?
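For reference, the layout options above map to a few standard hdfs-site.xml properties. A minimal sketch follows; the property names are the real HDFS ones, but the mount points and the hostnames in the quorum URI are illustrative assumptions, not recommendations:

```xml
<!-- hdfs-site.xml sketch; paths and hosts are illustrative assumptions -->

<!-- Where the NN keeps fsimage (the directory in question) -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/mnt/nn-meta/dfs/name</value>
</property>

<!-- Local directory each JN uses for its copy of the edit log -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/mnt/jn-edits/dfs/journal</value>
</property>

<!-- Quorum URI the NNs use to write edits to the three JNs -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
```

Pointing dfs.namenode.name.dir and dfs.journalnode.edits.dir at the same RAID-1 mount versus separate mounts is exactly the trade-off being asked about here.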
Created 03-08-2016 10:55 PM
First, spot-on for letting the ZK processes write to their own disks. As for letting the active/standby NNs write to the same physical disks as the JNs, I think you are OK with that approach. I say that because the edits are written continuously, while the fsimage files are only read/rewritten at key points such as checkpointing and startup.
I probably pitched a bit of overkill in a blog I did last year on this topic of filesystems, but feel free to check it out at https://martin.atlassian.net/wiki/x/EoC3Ag if you need some help going to sleep at night. 😉
If you do check it out, you'll notice my very clear advice is that you should still make backups of the fsimage/edits files (even with HA enabled) to avoid a potential "bunker scene" of your own. Having seen firsthand what happens when this information is lost (it was a configuration screw-up, not a hardware failure), I know I simply don't want to be there again.
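One simple way to take those backups is with `hdfs dfsadmin -fetchImage`, which downloads the most recent fsimage from the NameNode to a local directory. A sketch, suitable for a cron job; the backup path and retention period are assumptions you would adjust:

```shell
#!/bin/sh
# Sketch: periodic off-host backup of the latest fsimage.
# Run this from a host with HDFS client config pointing at the cluster.

BACKUP_ROOT=/backup/nn-fsimage            # assumed backup location
BACKUP_DIR="$BACKUP_ROOT/$(date +%Y%m%d)"

mkdir -p "$BACKUP_DIR"

# Pull the most recent fsimage from the (active) NameNode
hdfs dfsadmin -fetchImage "$BACKUP_DIR"

# Keep roughly 30 days of backups (assumed retention policy)
find "$BACKUP_ROOT" -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
```

Shipping these copies to a host outside the cluster is what protects you from the configuration-mistake scenario above, which HA alone does not.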