
Computing Nodes Scalability:


Expert Contributor


I am looking for a guide to scaling compute nodes. Any detailed process or steps would help.

Any response is highly appreciated.




Re: Computing Nodes Scalability:

Super Guru

@sujitha sanku

Are you looking to scale your cluster as capacity needs grow, or do you mean "scaling compute nodes" for performance? It depends on a number of factors.

1. What compute engines are being used (Hive, Spark, HBase, SOLR etc)?

Scaling HBase is very different than scaling Hive. Hardware profiles tend to be different for these engines.

2. Are you able to meet SLAs in your current cluster?

If yes, then you should be able to scale linearly by adding nodes similar to what you have in your current cluster.

If no, then you first need a baseline: a cluster that meets your requirements (capacity and performance SLAs). This can mean a lot of things. Maybe you need more spindles per node. Do you have dense nodes (for example, 48 TB or more on each node) or a more industry-standard cluster with 12 x 2 TB disks per node? More disks mean more parallelism. I'll send you an email on the tradeoffs between dense and less dense nodes.

Before you scale, you first need to determine what cluster size works for you performance-wise. Sizing for capacity is the easy part, as it's simple maths. Sizing for performance, however, should be done by building a prototype, measuring it, and then assuming linear scalability.
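To make the two kinds of sizing concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the 12 x 2 TB node profile comes from the example above, a replication factor of 3 is the HDFS default, and the 25% headroom for temporary and intermediate data is a placeholder you should replace with your own figure. The function names are made up for this example, and the performance estimate simply applies the linear-scalability assumption to a measured prototype run.

```python
import math


def nodes_for_capacity(raw_data_tb, disks_per_node=12, disk_tb=2.0,
                       replication=3, headroom=0.25):
    """Estimate the node count needed to store raw_data_tb of data.

    Assumptions (adjust for your cluster):
    - replication=3 is the HDFS default replication factor
    - headroom is the fraction of each node's disk kept free for
      temporary/intermediate data (0.25 is a placeholder value)
    """
    usable_per_node_tb = disks_per_node * disk_tb * (1 - headroom)
    required_tb = raw_data_tb * replication
    return math.ceil(required_tb / usable_per_node_tb)


def nodes_for_performance(prototype_nodes, prototype_runtime_min,
                          target_runtime_min):
    """Extrapolate node count from a prototype, assuming linear scaling.

    If N nodes finish the workload in T minutes, then hitting a target of
    T' minutes needs roughly N * T / T' nodes. Real workloads rarely scale
    perfectly linearly, so treat the result as a starting point.
    """
    return math.ceil(prototype_nodes * prototype_runtime_min / target_runtime_min)


if __name__ == "__main__":
    # Hypothetical inputs: 100 TB of raw data; a 5-node prototype that
    # runs the workload in 60 minutes against a 20-minute SLA.
    capacity_nodes = nodes_for_capacity(100)
    performance_nodes = nodes_for_performance(5, 60, 20)
    print("nodes for capacity:", capacity_nodes)
    print("nodes for performance:", performance_nodes)
    print("plan for:", max(capacity_nodes, performance_nodes))
```

Whichever estimate is larger is the one to plan around, and the performance number is only as good as the prototype it was measured on.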

Hope this helps.

Re: Computing Nodes Scalability:

Super Guru

@sujitha sanku You can get this information by reviewing the cluster planning documentation here

I would also recommend reading how to configure and size your cluster here