New node addition in existing cluster with HA

Hello there,

I am fairly new to Hadoop and Ambari, so please excuse me if some of these queries seem absurd or illogical. TIA 🙂


We have an HA-enabled Ambari cluster with 2 masters (each running ambari-server and ambari-agent) and 1 worker (running ambari-agent). The setup is created using the Ansible Git project, with a few modifications. We are able to get HA for ambari-server and a few components like ZooKeeper, etc.

The requirement I am seeking help with is this: if one of the masters in the HA cluster goes down, we need to add a new node to the existing cluster that joins as a standby master node.

IMO, with my limited exposure to Ambari, I feel that all the HA components will need to be reconfigured to add this new node. Is there a better way to do this, or any way at all to achieve it?

Can blueprints help in this case? If I set up a host_mapping.json file in which the host group is of 'master' type, could Ambari itself figure out which components to configure and where to set up HA?

Thanks for your patience in reading all this. Please feel free to comment; any pointers will be really helpful.



@Amit Bhardwaj

Firstly, there is nothing wrong with asking; we all started from the same point 🙂 Having said that, I would like some clarifications. To my understanding, you have 3 nodes:

- The HA masters [NameNode/RM]??? are also running the Ambari server/ambari-agent, right?
- One DataNode (worker node)
- For the other HDP components: how many ZooKeeper instances do you have?

If the above assumptions are correct, there are a few corrections to make to your configuration, depending on the intended use of the cluster [POC, DEV, TEST or PROD]. I would eliminate the first 2, as one really doesn't need HA in POC or DEV.

The minimum acceptable configuration for TEST, let alone PROD, would be:

- At least one edge node
- At least 2 masters for [ZooKeeper/NameNode and YARN HA]; you MUST have 3 ZooKeeper servers
- At least 3 DataNodes (aka worker nodes) with the default replication factor of 3

A blueprint is the best solution if you have limited exposure to Ambari, which is meant to be a very intuitive tool. Adding a DataNode is a piece of cake provided passwordless SSH is set up. Here is an HCC article on how to dynamically add a host to a cluster

If your cluster is already deployed and ready, then you can use Ambari to add a new host: How-to-add-new_host

Once the host is added then you can deploy the desired service components/clients to that host.

But if you want to add a new host using only the Ambari APIs, then you can refer to Deploy-components-using-APIs
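As a rough sketch of what the API-only route involves: the host first has to be attached to the cluster, and then each component is added to it and moved to the INSTALLED state. The script below only builds and prints the REST resource paths for those steps; it does not send anything. The server URL, cluster name, host name and the choice of ZOOKEEPER_SERVER are placeholder assumptions for illustration, so check them against the Ambari API documentation for your version.

```shell
#!/bin/sh
# Assumed values for illustration only; 8080 is Ambari's default HTTP port.
AMBARI_URL="http://ambari.example.com:8080"
CLUSTER="mycluster"
NEW_HOST="newnode.example.com"

# URL of a host resource inside a cluster.
cluster_host_url() {
    echo "$AMBARI_URL/api/v1/clusters/$1/hosts/$2"
}

# URL of one component on a host inside a cluster.
host_component_url() {
    echo "$(cluster_host_url "$1" "$2")/host_components/$3"
}

# Dry run: print the requests rather than sending them.
echo "1. attach host:   POST $(cluster_host_url "$CLUSTER" "$NEW_HOST")"
echo "2. add component: POST $(host_component_url "$CLUSTER" "$NEW_HOST" ZOOKEEPER_SERVER)"
echo "3. install it:    PUT  $(host_component_url "$CLUSTER" "$NEW_HOST" ZOOKEEPER_SERVER)"
echo '   body: {"HostRoles": {"state": "INSTALLED"}}'
```

Each request would be sent with curl using your admin credentials and the "X-Requested-By: ambari" header. A host that is registered with ambari-server but was never attached through step 1 would be known to the server without belonging to any cluster.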

Hope that helps


@Amit Bhardwaj

Any updates?

Hi @Geoffrey Shelton Okot,

Sorry for the delay in responding. I was counting on Gmail to send me a notification of any updates on my question.

Thanks for writing a detailed answer to my questions. You have some valid points which I am definitely going to recommend for any new prod/QA setups.

We have 2 masters in our HA setup, so that is fine. Both masters have a ZooKeeper Server, but as per your comments 3 are needed to form an HA ensemble (please correct me if my understanding is wrong).

We only have 1 worker node and, as per your recommendation, I will suggest using 3.

I want to explore node addition using the Ambari blueprint APIs, fired from Ansible. I had previously tried to add a new node to the HA setup in an existing cluster, with very little success.

The new node addition request that I submitted manually was accepted with 200 OK. But I couldn't see the node in the Ambari GUI. I then checked the list of nodes known to ambari-server and found that the new node is listed but isn't part of the cluster. Command used to check hosts:

curl -u admin:admin

I added the new node using this command:

curl -i -H "X-Requested-By: ambari" -u admin:admin -X POST -d @basic1.json<clustername>/hosts/<new-node>

and my basic1.json was:

    {
      "blueprint" : "<cluster-name>_blueprint",
      "host_group" : "ae-master2"
    }

Please suggest further. Thanks again.


@Amit Bhardwaj

If you have an HA setup, then you definitely need an ensemble of at least 3 ZooKeeper servers. Any distributed transactional system needs a quorum (majority).

Essentially, a transaction is committed once more than 50% of the nodes say that it is committed. Therefore you need an odd number of nodes: 3 nodes can survive 1 failure, 5 nodes can survive 2 failures, 7 nodes can survive 3 failures, and so on. An even count buys nothing extra: 4 nodes still survive only 1 failure.
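The arithmetic behind this can be sketched in a few lines of shell. This is only an illustration of the majority rule described above, not anything ZooKeeper-specific, and the function names are made up for the example.

```shell
#!/bin/sh
# Majority quorum for an ensemble of N voting servers: floor(N/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still reachable.
tolerated_failures() {
    echo $(( $1 - $(quorum_size "$1") ))
}

for n in 2 3 4 5 7; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 servers the quorum is 2, so losing either one stops the ensemble, which is why two ZooKeeper servers give no fault tolerance.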

Here is the explanation of why you need a ZK ensemble

Concerning the new node you added, I suspect that you didn't install the ambari-agent on that node. Can you check whether the file /etc/ambari-agent/conf/ambari-agent.ini exists on that host? If it exists, make sure the hostname entry points to your current ambari-server; if it does not, install the agent as described below.
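A quick way to run that check from the shell might look like the sketch below. It assumes the hostname entry sits in the [server] section of the ini file, as in a default agent install; the helper name is made up for the example.

```shell
#!/bin/sh
# Print the hostname value from the [server] section of an ambari-agent.ini file.
agent_server_hostname() {
    awk -F= '
        /^\[/ { in_server = ($0 ~ /^\[server\]/) }
        in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
            exit
        }
    ' "$1"
}

ini=/etc/ambari-agent/conf/ambari-agent.ini
if [ -r "$ini" ]; then
    echo "agent reports to: $(agent_server_hostname "$ini")"
else
    echo "$ini not found; the agent may not be installed"
fi
```

If the printed value is localhost or an old server name, the agent will not register with the Ambari server you expect.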


Install ambari-agent

First, ensure that /etc/hosts is updated on all the hosts to include every host in the cluster, including the new host.

Ensure that the ambari-agent is installed and correctly configured; if not, follow the steps below.

# yum install -y ambari-agent

Edit the ambari-server hostname in the ini file as described above, then start the agent

# ambari-agent start

After a few minutes, you should see the host as part of the cluster

Hey @Geoffrey Shelton Okot, I actually installed ambari-agent from the same repo as my active master. I also set the correct IP address in the ambari-agent.conf file. That is why the new node showed up when I checked using the Ambari API. But the cluster name was missing for this new node.

For all other existing nodes, it shows the 'Clustername' key, but not for the newly added node.

Anything further on this @Geoffrey Shelton Okot?
