Created on 07-10-2019 11:31 AM - edited 09-16-2022 08:34 AM
Assuming the latest version of Altus Director... are there any documented recommendations for the number of deployments/clusters that Director can reasonably be expected to manage?
Our clusters are less than 21 nodes each. Our Director server runs on an EC2 instance with 8 cores and 60GB of RAM, and I have a dedicated RDS MariaDB DB sized at 2 cores and 16GB of RAM with 200GB of storage. So far I'm managing around 14 clusters ranging in size from 9 to 20 nodes, with no issues. I'm just curious if there is general guidance around this subject.
Created 07-11-2019 08:49 AM
dturner,
We don't have any documented recommendations for the max deployments or clusters. Anecdotally, I've heard of 100+ clusters. Director does the most work when bootstrapping/updating/terminating clusters and less when just monitoring clusters.
You are also running on a much larger instance than we usually use, so I bet there's room for some tuning if you do run into issues (e.g., increasing memory, thread pools, etc.).
Created 07-11-2019 08:55 AM
Thank you for the response.
To your point about bootstrapping/terminating - I have noticed during testing, when I'm cycling through many bootstraps and terminations, that Director can get into a state where it seems to no longer respond to bootstrapping requests (from the command line). I'll typically let Director "cool down" for a few minutes and retry.
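For what it's worth, the cool-down-and-retry routine could be scripted rather than done by hand. This is only a rough sketch, not anything Director ships: `retry_with_backoff` and `run_command` are hypothetical helper names, the 60-second starting delay is a guess at "a few minutes", and the command in the usage comment is a placeholder for whatever bootstrap command line you normally run.

```python
import subprocess
import time


def retry_with_backoff(action, attempts=5, base_delay=60.0, factor=2.0,
                       sleep=time.sleep):
    """Call action() until it returns True or the attempts run out.

    Sleeps base_delay seconds after the first failure and multiplies the
    wait by factor after each further failure (the "cool down").
    Returns the 1-based number of the attempt that succeeded, or None.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:  # no point sleeping after the last try
            sleep(delay)
            delay *= factor
    return None


def run_command(argv, timeout=1800):
    """Run a command; treat a non-zero exit code or a timeout as failure."""
    try:
        return subprocess.run(argv, timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        return False


# Usage (placeholder command, substitute your own bootstrap invocation):
# retry_with_backoff(lambda: run_command(["your-bootstrap-command", "cluster.conf"]))
```

The timeout in `run_command` is there because the symptom is Director not responding at all; without it the script would hang on the first unresponsive request instead of backing off.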
Created 07-18-2019 09:03 AM
From experience, Director will really lag as you scale higher up. We are at 5 environments with about 40 deployments spanning thousands of nodes.
Make sure you go into application.properties and increase the number of threads for tasks. That really makes a difference from what I've been seeing.