
How to use Chaos Monkey in Ambari cluster setup?


Is there a tool like "Chaos Monkey" that I can use in an Ambari cluster setup? I am trying to test HA. What is the best way to test it in my cluster?

Re: How to use Chaos Monkey in Ambari cluster setup?

Good Q. Not explicitly, AFAIK. We do have an integral chaos monkey in Slider (incubating): you just turn it on, give it a sleep time, and then schedule multiple actions (worker death, AM death).

If you are working with an EC2 cluster, you can just use Netflix's Chaos Monkey lib and have it do the killing.

Otherwise, the general best practise is to have something automated SSH in and find/kill processes. I don't have any up-to-date code for this; I used to have some somewhere, but it relates to older Linux versions and has probably aged by now. I'm afraid you'll have to look around online for that.
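For what it's worth, here is a minimal sketch of the "SSH in and find/kill" idea, not tested against any real cluster. The host names, the process patterns, and the function names are all made up for illustration, and it assumes passwordless SSH and that pkill is installed on the target hosts. It defaults to a dry run that only prints the command.

```python
import random
import shlex
import subprocess

# Hypothetical hosts and process patterns; substitute your own.
HOSTS = ["master1.example.com", "master2.example.com"]
PATTERNS = ["NameNode", "ResourceManager"]


def build_kill_command(host, pattern, sig="KILL"):
    """Return the ssh argv that signals processes matching `pattern` on `host`."""
    # pkill -f matches against the full command line, which is needed for
    # JVM services because the process name is just "java". It may also
    # match the remote shell that runs this command.
    remote = "pkill -{} -f {}".format(sig, shlex.quote(pattern))
    return ["ssh", host, remote]


def pick_victim(rng, hosts=HOSTS, patterns=PATTERNS):
    """Choose a random (host, pattern) pair."""
    return rng.choice(hosts), rng.choice(patterns)


def unleash(rng, dry_run=True):
    """Pick one victim and kill it, or just print the command if dry_run."""
    host, pattern = pick_victim(rng)
    cmd = build_kill_command(host, pattern)
    if dry_run:
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=False)
    return cmd
```

Calling unleash(random.Random()) from a cron job or a loop with a sleep gives you the "monkey"; pass dry_run=False only once you are happy with what it prints.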

What is really, really slick for testing HA failover is code to turn real/virtual network switches off. This is good as it lets you rigorously test what happens if there's a network partition and everything stays running, just unreachable.
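If you can't get at the switches, a rough stand-in is to drop traffic to and from a peer with the host firewall. This is a different technique from turning a switch off, and the sketch below only builds the iptables commands. The function names and the peer address are assumptions for illustration; running the commands needs root on the node.

```python
import subprocess


def partition_commands(peer_ip, undo=False):
    """Build iptables commands that drop (or stop dropping) traffic with peer_ip."""
    # -I inserts a rule at the top of the chain, -D deletes the matching rule.
    action = "-D" if undo else "-I"
    return [
        ["iptables", action, "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", action, "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


def apply_commands(commands, dry_run=True):
    """Run each command, or only print it when dry_run is set."""
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Apply partition_commands("10.0.0.12") on one node, watch what the HA pair does, then apply the same call with undo=True to heal the partition. The processes on both sides keep running throughout, which is the point.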

Pro tip: issuing a kill -SIGSTOP is a great way to simulate a hung (as opposed to a failed) process.
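A small sketch of that trick, assuming you already know the PID and are on a POSIX system (the function name is made up). It suspends the process, waits, and then sends SIGCONT so the process resumes where it left off.

```python
import os
import signal
import time


def hang_process(pid, seconds):
    """Suspend `pid` with SIGSTOP for `seconds`, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)
    try:
        time.sleep(seconds)
    finally:
        # Always resume, even if the sleep is interrupted.
        os.kill(pid, signal.SIGCONT)
```

While the process is stopped it still holds its ports and locks but answers nothing, so anything monitoring it has to time out rather than see a clean exit.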



Re: How to use Chaos Monkey in Ambari cluster setup?

I am using our own Unix instances on AWS, which are not exactly the EC2 type. I have installed Ambari. Can you please let me know the steps to enable it?


Re: How to use Chaos Monkey in Ambari cluster setup?

Like I said, if you are running on EC2, you should be able to play with Netflix's Chaos Monkey directly. I haven't used it for a while; https://github.com/Netflix/SimianArmy/wiki/Quick-Start-Guide covers starting it. I think it's got more complex than in the early days, when it was more of a CLI thing.
