Currently I am wondering whether we can move the NameNode data to a MySQL DB running on the same machine.
I ask because, as I understand it, there is mainly one kind of HA solution: you run a minimum of 2 NameNodes, one ACTIVE and the other PASSIVE (standby), and the edits are written through the JOURNAL nodes.
But my friend's solution architect has prescribed a solution with 3 NameNodes. There was no mention of JOURNAL nodes or the JOURNAL process, and without the JOURNAL daemon I am not sure whether HA can be achieved.
They are also running a MySQL DB on the same server as the NameNode. From my understanding, that MySQL instance should only be storing information for Cloudera and other Hadoop ecosystem components.
The NameNode data, on the other hand, is held in memory and persisted to files (fsimage and edit log).
Could you please clarify this?
Inputs will be highly appreciated.
Never heard of that.
I would guess the architect didn't bother placing them explicitly or has just forgotten them.
It is always recommended to use an odd number of nodes whenever you configure a quorum-based high-availability service.
The prerequisites for high availability with automatic failover are ZooKeeper servers, failover controllers, and journal nodes.
Since there is no secondary NameNode to perform checkpointing, the standby NameNode performs it. If one of the NameNodes dies, the failover controller fences it using one of the configured fencing methods, and the standby becomes active and takes over operations. Both the active and standby NameNodes share a location for the edits, which in a journal-based setup is the set of journal nodes.
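To show how those pieces fit together, here is a rough Python sketch that assembles the main HA-related properties into a dictionary. Treat it as an illustration only: the helper names (`qjournal_uri`, `ha_properties`), the nameservice ID `mycluster`, the NameNode IDs and all host names are made up, and the ports (8485 for journal nodes, 8020 for NameNode RPC, 2181 for ZooKeeper) are common defaults that your cluster may override. The property names are the ones I know from the Apache HDFS HA (Quorum Journal Manager) setup, so check them against the documentation for your distribution.

```python
from typing import Dict, List


def qjournal_uri(journal_hosts: List[str], nameservice: str, port: int = 8485) -> str:
    """Build the shared edits URI that points both NameNodes at the journal nodes."""
    hosts = ";".join(f"{host}:{port}" for host in journal_hosts)
    return f"qjournal://{hosts}/{nameservice}"


def ha_properties(
    nameservice: str,
    namenodes: Dict[str, str],   # hypothetical mapping: NameNode ID -> host name
    journal_hosts: List[str],
    zk_hosts: List[str],
) -> Dict[str, str]:
    """Collect the HA properties (hdfs-site.xml, plus one for core-site.xml)."""
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        "dfs.namenode.shared.edits.dir": qjournal_uri(journal_hosts, nameservice),
        "dfs.ha.automatic-failover.enabled": "true",
        "dfs.ha.fencing.methods": "sshfence",
        # This one normally lives in core-site.xml.
        "ha.zookeeper.quorum": ",".join(f"{host}:2181" for host in zk_hosts),
    }
    for nn_id, host in namenodes.items():
        # Assumed RPC port 8020; adjust to whatever your NameNodes listen on.
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = f"{host}:8020"
    return props


if __name__ == "__main__":
    example = ha_properties(
        "mycluster",
        {"nn1": "master1.example.com", "nn2": "master2.example.com"},
        ["jn1.example.com", "jn2.example.com", "jn3.example.com"],
        ["zk1.example.com", "zk2.example.com", "zk3.example.com"],
    )
    for key, value in sorted(example.items()):
        print(f"{key} = {value}")
```

Note that nothing in here points at MySQL: the shared edits location is a `qjournal://` URI listing the journal nodes.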
Hope this helps
Refer - why odd numbers
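To make the odd-number point concrete, here is a small Python sketch of the majority arithmetic behind a quorum of journal nodes (the same reasoning applies to ZooKeeper servers). The function name `quorum_stats` is just something I made up for illustration; the rule it encodes is that a write needs acknowledgement from a majority, so N nodes can tolerate (N - 1) / 2 failures, rounded down.

```python
from typing import Tuple


def quorum_stats(nodes: int) -> Tuple[int, int]:
    """Return (majority_size, tolerated_failures) for a quorum of `nodes` members."""
    if nodes < 1:
        raise ValueError("need at least one node")
    majority = nodes // 2 + 1       # smallest group that is more than half
    tolerated = nodes - majority    # members that can be down while a majority survives
    return majority, tolerated


if __name__ == "__main__":
    for n in range(1, 8):
        majority, tolerated = quorum_stats(n)
        print(f"{n} nodes: majority={majority}, tolerated failures={tolerated}")
```

With 3 nodes you can lose one, and with 4 nodes you can still only lose one, so the extra even-numbered node buys no additional fault tolerance. That is why 3, 5 or 7 are the usual choices.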
Yes, exactly. I had never come across any solution where NameNode data is stored in a database. All the data is in memory, and the only way to recover the NameNode data (the metadata of the Hadoop cluster) is through the fsimage and edit logs (both are OS files). In the HA case the edits are written through the JOURNAL nodes, which are supposed to be deployed in an ODD number.
I was also wondering how that could be possible. So I thought either I had missed some new development in CLOUDERA or there is some confusion on the other side :)
It's not just the NameNode; you can also configure HA for the ResourceManager.
Hope your doubts are cleared mate.
Yes, but to implement HA in Hadoop we have to configure ZooKeeper, and the process is the same, meaning it follows the journal concept of writing to a majority of the journal nodes, correct?