How to enable name node HA without cluster downtime?

New Contributor

We have a process that pulls messages from MQ and puts them into HBase. Since the messages have a 10-second expiry, we cannot afford to have the cluster down. What do people do in such situations?

We need to enable NameNode HA on the Hortonworks cluster without taking the cluster offline.

1 ACCEPTED SOLUTION

Re: How to enable name node HA without cluster downtime?

Unfortunately, there is no way to enable HDFS HA without restarting the NameNode.

So unless you can change the process to use a buffer in between, I am not sure what you could do. Kafka would be a very popular tool for this: it combines MQ-like usage with almost unlimited scalability and easy buffering of dozens to hundreds of terabytes of data.

So if you absolutely cannot lose a tuple, or you want a safer architecture anyway:

A) Develop a process that reads the events and puts them into Kafka. You will also need a process that reads them back from Kafka and puts them into HBase.

B) Switch the ingest process over from HBase to Kafka.

C) Upgrade your cluster (enable NameNode HA, including the restart).

D) Switch on the Kafka->HBase process. This step is not time critical, since even a 3-node Kafka cluster can easily store 10-20 TB of data in a replicated fashion.
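
To make steps A to D concrete, here is a minimal sketch of the buffering idea in plain Python. It does not use the real Kafka or HBase clients: `LogBuffer` is an in-memory stand-in for a Kafka topic, `FakeHBase` is a stand-in for the HBase table, and every name in it is hypothetical. It only shows why the restart in step C loses nothing: the drainer remembers the next offset to deliver and resumes from there once the sink is back.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class SinkUnavailable(Exception):
    """Raised by the sink while the cluster is restarting (assumption)."""


@dataclass
class LogBuffer:
    """In-memory stand-in for a Kafka topic: an append-only log read by offset."""
    records: List[bytes] = field(default_factory=list)

    def append(self, record: bytes) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just written

    def read_from(self, offset: int) -> List[bytes]:
        return self.records[offset:]


@dataclass
class Drainer:
    """Step D: copies buffered records to the sink and tracks its position."""
    buffer: LogBuffer
    write: Callable[[bytes], None]  # hypothetical "put into HBase" function
    committed: int = 0              # next offset that still has to be delivered

    def drain(self) -> int:
        delivered = 0
        for record in self.buffer.read_from(self.committed):
            try:
                self.write(record)
            except SinkUnavailable:
                break  # stop here; the next call retries from the same offset
            self.committed += 1
            delivered += 1
        return delivered


class FakeHBase:
    """Stand-in for the HBase table; can be switched off to mimic a restart."""

    def __init__(self) -> None:
        self.rows: List[bytes] = []
        self.available = True

    def put(self, record: bytes) -> None:
        if not self.available:
            raise SinkUnavailable()
        self.rows.append(record)


def demo() -> List[bytes]:
    buffer, hbase = LogBuffer(), FakeHBase()
    drainer = Drainer(buffer, hbase.put)

    buffer.append(b"m1")          # steps A/B: the MQ reader writes to the buffer only
    drainer.drain()               # normal operation: m1 reaches the sink

    hbase.available = False       # step C: cluster restart in progress
    buffer.append(b"m2")          # MQ messages keep arriving and are buffered
    buffer.append(b"m3")
    drainer.drain()               # delivers nothing, position is unchanged

    hbase.available = True        # step D: restart finished, catch up
    drainer.drain()
    return hbase.rows


if __name__ == "__main__":
    print(demo())
```

In a real deployment the buffer would be a Kafka topic and the committed position would be the consumer group's offset, so the MQ reader never depends on HBase being up.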

6 REPLIES

Re: How to enable name node HA without cluster downtime?

@S Roy

In this case you need true DR, which is different from HA.

You can set up an Active-Active site, bring down Active1, and enable HA on it while Active2 takes all the load.

WANdisco is a good tool for true DR.

Re: How to enable name node HA without cluster downtime?

New Contributor

We currently do not have a DR site or WANdisco. Are there any other alternatives?

Re: How to enable name node HA without cluster downtime?

@S Roy

I do have a deployment where we set up HBase DR using Kafka, as suggested above. I was under the impression that you were more focused on cluster-wide HA rather than HBase only.

Apache Falcon is one of my favorites, but it's more Active-Passive.

Re: How to enable name node HA without cluster downtime?

Mentor

I also thought of Kafka for this, @Benjamin Leonhardi.
