Subsequent actions when an Amazon EC2 instance dies in a Hadoop cluster?

Expert Contributor

We are running an 18-node Hortonworks cluster on Amazon EC2 instances. We can't guarantee that the EC2 instances will stay available for long. Suppose two of the nodes in the cluster die. What subsequent actions should we take to add new instances? Has anyone tried HDP on Amazon instances, and do you have any procedure templates?

1 ACCEPTED SOLUTION

Master Mentor
@Ram D

If DataNodes go down, you can add new DataNodes. Ambari can add DataNodes without any outage once you launch the VM and meet the prerequisites.
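For anyone who wants to script this instead of clicking through the Ambari UI, here is a minimal sketch of the REST calls involved, assuming the new VM already runs an Ambari agent that has registered with the server. The server URL, cluster name, and host name are placeholders, and the function name is hypothetical. It only builds the requests; sending them (with authentication and the `X-Requested-By` header that Ambari expects) is left out.

```python
import json


def ambari_add_datanode_requests(base_url, cluster, host):
    """Return the ordered (method, url, body) calls to add a DataNode on `host`.

    Assumption: `host` is already registered with the Ambari server.
    """
    host_url = f"{base_url}/api/v1/clusters/{cluster}/hosts/{host}"
    component_url = f"{host_url}/host_components/DATANODE"

    def state_change(state, context):
        # Body shape used by Ambari to move a host component to a new state.
        return json.dumps({
            "RequestInfo": {"context": context},
            "Body": {"HostRoles": {"state": state}},
        })

    return [
        ("POST", host_url, None),        # attach the host to the cluster
        ("POST", component_url, None),   # declare the DATANODE component on it
        ("PUT", component_url, state_change("INSTALLED", "Install DataNode")),
        ("PUT", component_url, state_change("STARTED", "Start DataNode")),
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the original post.
    for method, url, body in ambari_add_datanode_requests(
        "http://ambari.example.com:8080", "mycluster", "worker19.example.com"
    ):
        print(method, url, body or "")
```

The install request is asynchronous, so a real script would wait for it to finish before sending the start request.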

If a master goes down, the plan of action depends on which services were allocated to that VM. You can make new nodes part of the cluster and add the master services on the new nodes.

That's why we need HA + DR. (HA and DR are different 😉)


5 REPLIES

Master Mentor

You need to put these nodes in the excludes list and add new nodes using Ambari. @Ram D
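As a rough illustration of the exclude-list step, the sketch below appends dead hosts to the text of an exclude file and lists the refresh commands that make HDFS and YARN re-read it. The helper name and host names are hypothetical. In an Ambari-managed cluster the decommission action normally maintains this file for you, so treat this as the manual equivalent rather than the recommended path.

```python
def add_to_exclude_list(exclude_text, dead_hosts):
    """Return exclude-file content with dead_hosts appended, without duplicates."""
    hosts = [line.strip() for line in exclude_text.splitlines() if line.strip()]
    for host in dead_hosts:
        if host not in hosts:
            hosts.append(host)
    return "\n".join(hosts) + "\n" if hosts else ""


# Commands to run after updating the files referenced by dfs.hosts.exclude
# (HDFS) and the YARN exclude-path setting, so the change takes effect.
REFRESH_COMMANDS = [
    ["hdfs", "dfsadmin", "-refreshNodes"],
    ["yarn", "rmadmin", "-refreshNodes"],
]

if __name__ == "__main__":
    current = "worker03.example.com\n"  # placeholder content
    print(add_to_exclude_list(current, ["worker07.example.com", "worker11.example.com"]))
```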

Master Mentor

@Ram D I have used CloudWatch and it's very helpful. See this.

You may want to look into Periscope too.

Expert Contributor

@ Artem Ervits I am asking from the perspective of Amazon EC2 instances. I know the procedure for decommissioning a node and adding it to the exclude host list. I would like to know whether there is a procedure template, something along the lines of "if a node goes down, perform these actions".

Expert Contributor

@Neeraj We implemented HA for both the NameNode and the ResourceManager, and they are working fine. Suppose one NameNode goes down: we can create another instance with the same configuration, add it to the cluster, and install the services. Do you have any procedure templates for the subsequent actions after a node dies?
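No ready-made template was posted in this thread, but the advice above can be collected into a simple checklist keyed by the role of the dead node. The sketch below is one possible shape for such a template, not an official HDP procedure. The step wording and the `recovery_steps` helper are assumptions, and the steps marked as assumptions are not taken from the replies.

```python
WORKER_STEPS = [
    "Put the dead host in the excludes list",
    "Launch a replacement EC2 instance and meet the Ambari prerequisites",
    "Add the new host and its DataNode through Ambari",
]

MASTER_STEPS = [
    # Assumption: with HA enabled, the standby should already be serving.
    "Confirm the standby NameNode or ResourceManager has taken over",
    "Launch a replacement EC2 instance and meet the Ambari prerequisites",
    "Make the new host part of the cluster through Ambari",
    "Add the master services that ran on the dead host to the new host",
    # Assumption: a final check that the HA pair is complete again.
    "Verify that an active and a standby instance are both running",
]


def recovery_steps(role):
    """Return the ordered checklist for a dead node of the given role."""
    if role == "worker":
        return list(WORKER_STEPS)
    if role == "master":
        return list(MASTER_STEPS)
    raise ValueError(f"unknown role: {role!r}")


if __name__ == "__main__":
    for number, step in enumerate(recovery_steps("master"), start=1):
        print(f"{number}. {step}")
```

Keeping the checklist as data makes it easy to review and to extend per service as the cluster layout changes.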