How to use Chaos Monkey in Ambari cluster setup?
- Labels: Apache Ambari
Created on ‎01-05-2017 01:07 PM - edited ‎09-16-2022 03:53 AM
Is there a tool like "Chaos Monkey" that I can use in an Ambari cluster setup? I am trying to test HA. What is the best way to test it in my cluster?
Created ‎01-06-2017 08:20 PM
Good Q. Not explicitly, AFAIK. We do have an integral chaos monkey in Slider (incubating), which you just turn on, give a sleep time, and then schedule multiple actions (worker death, AM death).
If you are working with an EC2 cluster, you can just use Netflix's Chaos Monkey lib and have it do the killing.
Otherwise, the general best practice is to have something automated that SSHes in and finds/kills processes. I don't have any up-to-date code for this; I used to have some somewhere, but it relates to older Linux versions and has probably aged by now. I'm afraid you'll have to look around online for that.
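As a rough illustration of that SSH-and-kill approach, here is a minimal Python sketch. Everything specific in it is an assumption, not something from this thread: the host names, the process patterns (`NameNode`, `ResourceManager`), passwordless SSH access, and the presence of `pkill` on the target nodes. It defaults to a dry run so nothing is killed until you opt in.

```python
import random
import shlex
import subprocess

# Hypothetical inventory: replace with your own hosts and daemon patterns.
HOSTS = ["node1.example.com", "node2.example.com"]
# The [X] character class stops `pkill -f` from matching the remote shell
# that carries this very command line.
PATTERNS = ["[N]ameNode", "[R]esourceManager"]


def build_kill_command(host, pattern, sig="KILL"):
    """Return the ssh argv that kills processes matching `pattern` on `host`."""
    # pkill -f matches against the full command line of each process.
    remote = f"pkill -{sig} -f {shlex.quote(pattern)}"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, remote]


def kill_random_process(hosts, patterns, rng=random, dry_run=True):
    """Pick one host and one pattern at random; kill it, or just print the plan."""
    cmd = build_kill_command(rng.choice(hosts), rng.choice(patterns))
    if dry_run:
        print("would run:", shlex.join(cmd))
    else:
        # pkill exits non-zero when nothing matched, so do not raise on failure.
        subprocess.run(cmd, check=False, timeout=60)
    return cmd
```

Calling `kill_random_process(HOSTS, PATTERNS)` in a loop with a sleep between iterations gives a crude chaos monkey; pass `dry_run=False` only once the printed commands look right.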
What is really, really slick for testing HA failover is code to turn real/virtual network switches off. This is good as it lets you rigorously test what happens if there's a network partition and everything stays running, just unreachable.
Pro tip: issuing a kill -SIGSTOP is a great way to simulate a hung (as opposed to a failed) process; a kill -SIGCONT resumes it afterwards.
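To see the SIGSTOP trick in isolation, the following Python sketch suspends and then resumes a throwaway child process on the local machine. It is only a demonstration under assumptions: it needs a POSIX system, and the sleeping child stands in for a real daemon. Against a cluster you would send the same signals to the daemon's PID on the remote node.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Freeze the process: it stays in the process table but does no work."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a previously stopped process continue running."""
    os.kill(pid, signal.SIGCONT)


def demo():
    # Stand-in for a daemon: a child that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    suspend(child.pid)
    # WUNTRACED makes waitpid report stopped (not only exited) children.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)
    # The process has not exited, which is what makes it "hung" rather than "failed".
    still_exists = child.poll() is None
    resume(child.pid)
    child.terminate()
    child.wait()
    return stopped, still_exists
```

`demo()` returns a pair of booleans: whether the child was reported as stopped, and whether it still existed while stopped. This is the state an HA monitor has to detect through timeouts, since the process never exits on its own.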
Created ‎01-07-2017 10:44 AM
I am using our own Unix instances on AWS, which are not exactly the EC2 type. I have installed Ambari. Can you please let me know the steps to enable it?
Created ‎01-09-2017 09:45 AM
Like I said, if you are running on EC2, you should be able to play with Netflix's Chaos Monkey directly. I haven't used it for a while; https://github.com/Netflix/SimianArmy/wiki/Quick-Start-Guide covers starting it. I think it's got more complex than in the early days, when it was more of a CLI thing.
