Created 11-02-2021 06:45 AM
How many workers can a CDSW cluster contain?
Created 11-02-2021 08:28 AM
Hi, there is no hard limit on the number of CDSW worker nodes you can have, but there are practical limits: once you reach, say, thirty nodes, the overhead in network traffic and latency grows noticeably. For instance, each worker node requires about 3 CPUs and 5 GB of RAM just for the kubelet and internal CDSW pods, so with 30 worker nodes you will be losing about 90 CPUs and 150 GB of RAM, which might not pay off. On larger clusters there is a delicate balance between how big your worker nodes are and how many of them you choose to have. I can't really give much guidance here other than that it takes some trial and error to get right. If you have an account with Cloudera, you should reach out to your Cloudera account team for more detailed information. Some rough guidelines would be to have workers with between 32 and 64 vCPUs, and fewer than 20 of them... but your mileage may vary. Hope this helps.
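To make the overhead arithmetic above concrete, here is a minimal Python sketch. It assumes the approximate per-node figures quoted in this answer (about 3 CPUs and 5 GB of RAM per worker), which are rough estimates and not official Cloudera numbers. The function names and the two example cluster layouts are hypothetical and only there for illustration.

```python
# Approximate per-worker overhead quoted in the answer (assumption, not official figures).
OVERHEAD_CPU_PER_NODE = 3
OVERHEAD_RAM_GB_PER_NODE = 5


def cluster_overhead(worker_count):
    """Return (cpus, ram_gb) consumed by the kubelet and internal CDSW pods."""
    return (worker_count * OVERHEAD_CPU_PER_NODE,
            worker_count * OVERHEAD_RAM_GB_PER_NODE)


def usable_cpu(worker_count, vcpu_per_node):
    """Return the vCPUs left for user workloads after the per-node overhead."""
    return worker_count * (vcpu_per_node - OVERHEAD_CPU_PER_NODE)


# The same 960 vCPUs in total, split two ways (hypothetical layouts).
print(cluster_overhead(30), usable_cpu(30, 32))  # 30 workers with 32 vCPUs each
print(cluster_overhead(15), usable_cpu(15, 64))  # 15 workers with 64 vCPUs each
```

With these assumed figures, the layout with fewer, larger workers leaves more vCPUs for actual workloads out of the same total, which is the trade-off described above. Real numbers will depend on your CDSW version and workloads.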
Created 11-02-2021 08:46 AM
Thanks a lot for your answer
Created 11-02-2021 08:53 AM
Sure, you are welcome. It is definitely an interesting topic, but it's pretty hard to get actual data, since so much depends on the type of workloads you want to run, the size of your nodes, etc. Good luck!