How to enable NameNode HA without cluster downtime?

Contributor

We have a process that pulls messages from MQ and puts them into HBase. Since the messages have a 10-second expiry, we cannot afford to have the cluster down. What do people do in such situations?

We need to enable NameNode HA on the Hortonworks cluster without taking the cluster offline.

1 ACCEPTED SOLUTION

Master Guru

Unfortunately, there is no way to enable HDFS HA without restarting the NameNode.

So unless you can change the process to use a buffer in between, I am not sure what you could do. (Kafka would be a very popular tool for this: it combines MQ-like use with almost unlimited scalability and easy buffering of dozens to hundreds of terabytes of data.)
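To get a feel for how much headroom such a buffer gives, here is a back-of-the-envelope sketch. The message rate and message size are made-up numbers, not figures from this thread, and `buffer_hours` is just an illustrative helper. Plug in your own values.

```python
def buffer_hours(usable_bytes, msgs_per_sec, avg_msg_bytes):
    """Hours of ingest that a buffer with usable_bytes of capacity can absorb.

    usable_bytes is capacity after replication, i.e. the amount of
    distinct data the buffer can hold.
    """
    bytes_per_hour = msgs_per_sec * avg_msg_bytes * 3600
    return usable_bytes / bytes_per_hour


TB = 10**12

# Hypothetical workload (assumption): 5,000 messages/s at 2 KB each,
# buffered in 10 TB of usable space.
hours = buffer_hours(10 * TB, msgs_per_sec=5_000, avg_msg_bytes=2_000)
print(f"{hours:.0f} hours of buffering")
```

With these assumed numbers the buffer covers roughly 278 hours, which is far longer than a NameNode restart should take.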

So if you absolutely cannot lose a tuple, or you want a safer architecture anyway:

A) Develop a process that reads the events and puts them into Kafka. You would also need a process that reads them back from Kafka and puts them into HBase.

B) Switch the ingest process over from writing to HBase to writing to Kafka.

C) Upgrade your cluster (that is, enable NameNode HA).

D) Switch on the Kafka-to-HBase process. That would not be time critical, since even a 3-node Kafka cluster can easily store 10-20 TB of data in a replicated fashion.
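The steps above can be sketched as a small simulation. This is not Kafka or HBase client code: `BufferLog` is a hypothetical in-memory stand-in for a single Kafka partition with one committed consumer offset, and a plain dict stands in for the HBase table. It only illustrates why the restart stops being time critical: ingest keeps appending to the buffer, and the drain process catches up from its last committed offset afterwards.

```python
class BufferLog:
    """Append-only in-memory log (hypothetical stand-in for one Kafka partition)."""

    def __init__(self):
        self._records = []
        self._committed = 0  # offset of the next record the consumer should read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the record

    def poll(self, max_records=100):
        # Return uncommitted records without advancing the offset.
        return self._records[self._committed:self._committed + max_records]

    def commit(self, count):
        self._committed += count

    def lag(self):
        return len(self._records) - self._committed


def bridge_from_mq(messages, log):
    """Steps A/B: copy each MQ message into the buffer before it can expire."""
    for message in messages:
        log.append(message)


def drain_to_store(log, store, store_available):
    """Steps A/D: write buffered records to the store; no-op while it is down."""
    if not store_available:
        return 0
    written = 0
    while True:
        batch = log.poll()
        if not batch:
            return written
        for row_key, value in batch:
            store[row_key] = value  # keyed write, so a replayed record is harmless
        log.commit(len(batch))      # commit only after the writes succeeded
        written += len(batch)


log, hbase_stand_in = BufferLog(), {}

# Normal operation: ingest and drain both run.
bridge_from_mq([("row1", "a"), ("row2", "b")], log)
drain_to_store(log, hbase_stand_in, store_available=True)

# Step C: the cluster is restarting. Ingest continues, the drain is paused.
bridge_from_mq([("row3", "c"), ("row4", "d")], log)
drain_to_store(log, hbase_stand_in, store_available=False)
lag_during_restart = log.lag()

# Step D: the cluster is back and the drain catches up.
drain_to_store(log, hbase_stand_in, store_available=True)
print(lag_during_restart, log.lag(), sorted(hbase_stand_in))
```

Committing the offset only after the write means a crash between the two replays some records rather than dropping them, which is the safer trade-off when the goal is not to lose a tuple.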


6 REPLIES

Master Mentor
@S Roy

In this case we need true DR. DR is different from HA.

You can set up an Active-Active site, bring down Active1, and enable HA there while Active2 takes all the load.

WANdisco is a good tool for true DR.

Contributor

We currently do not have a DR site or WANdisco. Are there any other alternatives?

Master Mentor

@S Roy

I do have a deployment where we set up HBase DR using Kafka, as suggested above. I was under the impression that you were more focused on cluster HA rather than HBase only.

Apache Falcon is one of my favorites, but it's more Active-Passive.

Master Mentor

I also thought of Kafka for this, @Benjamin Leonhardi.
