Created 08-31-2016 10:38 AM
Hi All,
I'm interested in any papers or experience reports on Hadoop cluster failover design.
From my understanding, the two main components concerned are the NameNode and the ResourceManager.
Thanks
Regards
Farhad
Created 09-01-2016 02:51 AM
There are a number of components in Hadoop and its ecosystem. Each of them has its own high availability/failover strategy, with different implications in case of a failure. You already mentioned the NameNode and the YARN ResourceManager, but there are others, such as HiveServer2, the Hive Metastore, and HMaster if you are using HBase. Each has its own documentation available on the Hortonworks website.
1. NameNode. The NameNode, also known as the master node, is the linchpin of Hadoop. It holds the HDFS metadata, so if it fails and there is no standby, HDFS is unavailable until it is restored. To avoid this scenario, you must configure a standby NameNode. Instructions to set up NameNode HA can be found here.
2. YARN ResourceManager. YARN manages your cluster resources: it decides how much memory and CPU each job or application gets. So that's pretty important. YARN also has the concepts of ApplicationMaster, NodeManager, and container, but the real single point of failure is the ResourceManager. So you need HA for the ResourceManager. Check these two links: link 1 and link 2.
3. HiveServer2. What if you are using Hive (SQL) to query structured data in Hadoop? Assume you have multiple concurrent jobs running, or ad hoc users running their queries, and they connect to Hive using HiveServer2. What if HiveServer2 goes down? Well, you need redundancy for that. Here is how you do it.
4. HMaster. Are you using HBase? HBase has a component called HMaster. It is not quite as crucial as HiveServer2 or the ResourceManager, but if HMaster goes down you might see an impact, especially if a region server also goes down before you are able to bring HMaster back up. So you need to set up HA for HMaster. Check this link.
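To make the first three items a bit more concrete, here is a rough sketch of the kind of settings involved. It is only an illustration, not a complete or authoritative setup: the host names, the nameservice "mycluster", the IDs nn1/nn2 and rm1/rm2, and the helper function names are all made up for the example, and the ports are common defaults that may differ on your cluster. The property names are the ones used by HDFS NameNode HA with the Quorum Journal Manager and by ResourceManager HA in Hadoop 2.x; the last helper builds a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

```python
from xml.etree import ElementTree as ET


def namenode_ha_properties(nameservice, namenodes, journal_nodes):
    """hdfs-site.xml properties for an active/standby NameNode pair.

    namenodes maps a NameNode ID (hypothetical, e.g. "nn1") to its host.
    journal_nodes lists the JournalNode hosts holding the shared edit log.
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        # 8485 is the usual JournalNode port; adjust if yours differs.
        "dfs.namenode.shared.edits.dir": "qjournal://"
        + ";".join(f"{host}:8485" for host in journal_nodes)
        + f"/{nameservice}",
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
        # Automatic failover also needs ha.zookeeper.quorum in core-site.xml,
        # a ZKFC process next to each NameNode, and dfs.ha.fencing.methods.
        "dfs.ha.automatic-failover.enabled": "true",
    }
    for nn_id, host in namenodes.items():
        # 8020 is assumed here as the NameNode RPC port.
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = f"{host}:8020"
    return props


def resourcemanager_ha_properties(cluster_id, resource_managers, zk_quorum):
    """yarn-site.xml properties for ResourceManager HA (Hadoop 2.x names)."""
    props = {
        "yarn.resourcemanager.ha.enabled": "true",
        "yarn.resourcemanager.cluster-id": cluster_id,
        "yarn.resourcemanager.ha.rm-ids": ",".join(resource_managers),
        # Hadoop 3.x replaces this property with hadoop.zk.address.
        "yarn.resourcemanager.zk-address": ",".join(zk_quorum),
    }
    for rm_id, host in resource_managers.items():
        props[f"yarn.resourcemanager.hostname.{rm_id}"] = host
    return props


def hiveserver2_jdbc_url(zk_quorum, namespace="hiveserver2"):
    """JDBC URL that lets clients find a live HiveServer2 via ZooKeeper."""
    return (
        "jdbc:hive2://" + ",".join(zk_quorum)
        + f"/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


def to_hadoop_xml(props):
    """Render a property dict in the <configuration> format Hadoop reads."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    zk = ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"]
    hdfs = namenode_ha_properties(
        "mycluster",
        {"nn1": "master1.example.com", "nn2": "master2.example.com"},
        ["jn1.example.com", "jn2.example.com", "jn3.example.com"],
    )
    yarn = resourcemanager_ha_properties(
        "yarn-cluster",
        {"rm1": "master1.example.com", "rm2": "master2.example.com"},
        zk,
    )
    print(to_hadoop_xml(hdfs))
    print(to_hadoop_xml(yarn))
    print(hiveserver2_jdbc_url(zk))
```

If you manage the cluster with Ambari, its HA wizards set most of these properties for you, so treat the sketch as a way to see what changes rather than something to apply by hand. For HBase, the equivalent is simply to run one or more backup HMaster processes alongside the active one.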
I hope this helps. If you have any follow-up questions, please feel free to ask.
Created 01-05-2018 10:53 AM
Hi,
I have some questions about the Hadoop Cluster data node failover:
Another question is about the Hadoop cluster hardware configuration. Let's say we will use our Hadoop cluster to process 100 GB of log files each day. How many data nodes do we need to set up? And what hardware configuration (e.g. CPU, RAM, hard disk) does each data node need?
Thank You
Hari
Created 01-05-2018 01:46 PM