
Sizing for Master/Edges servers


I'm helping a prospect plan an expansion from their current 6-node Hadoop cluster to more than 1 PB and a hundred nodes. I gave him some hints:

- Master and edge nodes running in a virtual environment (they do not require high I/O, and a virtual environment can increase availability)

- Knox as the security perimeter gateway

- Dedicated database nodes with high availability

I need help with recommended sizing and notes for the items below:

- Master nodes: what is the recommended RAM for a master? The prospect asked me to consider that their virtualization hosts usually have 512 GB of RAM and that they usually don't allocate more than 64 GB to a virtual host.

- Edge nodes

- Knox: do we have any sizing for Knox?

- Database servers: do we have any sizing for dedicated database servers (for metadata: Ambari, Hue, Hive Metastore, Oozie, etc.)?

Thanks.

Guilherme.

Accepted Solutions

Re: Sizing for Master/Edges servers

@Guilherme Braccialli

Please see my comments below.

- Master nodes: what is the recommended RAM for a master? The prospect asked me to consider that their virtualization hosts usually have 512 GB of RAM and that they usually don't allocate more than 64 GB to a virtual host.

Comment: 256 GB per master node is a good start.

If you are referring to 512 GB in each physical node, then that means 2 VMs per bare-metal host.

- Edge nodes

- Knox: do we have any sizing for Knox?

Comment: It's a lightweight instance, so 64 GB is a good number (it depends on how much traffic is coming to the Knox gateway).

- Database servers: do we have any sizing for dedicated database servers (for metadata: Ambari, Hue, Hive Metastore, Oozie, etc.)?

Comment: A dedicated instance for the DB is good practice.

Memory: 128 GB is a good start (MySQL, Postgres, Oracle). In production, it's very important to have HA for the DB.

CPU: dual 8-core or quad core if possible (assuming a large cluster).
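
To make the arithmetic above concrete, here is a minimal Python sketch that checks how many VMs of a given size fit on one physical host and which of the suggested sizes exceed a per-VM cap. The RAM figures come from the comments above; the function names and the dictionary are hypothetical, and the calculation ignores hypervisor overhead, so treat it as a rough starting point rather than a sizing tool.

```python
# Suggested starting RAM per role, in GB, taken from the comments above.
# The dictionary and function names are illustrative assumptions.
RECOMMENDED_RAM_GB = {"master": 256, "knox": 64, "database": 128}


def vms_per_host(host_ram_gb: int, vm_ram_gb: int) -> int:
    """Whole VMs of vm_ram_gb that fit in host_ram_gb (no hypervisor overhead)."""
    if host_ram_gb < 0 or vm_ram_gb <= 0:
        raise ValueError("host RAM must be >= 0 and VM RAM must be > 0")
    return host_ram_gb // vm_ram_gb


def roles_exceeding_cap(cap_gb: int) -> list[str]:
    """Roles whose suggested RAM is larger than the per-VM cap."""
    return sorted(role for role, ram in RECOMMENDED_RAM_GB.items() if ram > cap_gb)


if __name__ == "__main__":
    # 512 GB physical host, 256 GB master VMs
    print(vms_per_host(512, RECOMMENDED_RAM_GB["master"]))
    # Roles that do not fit under a 64 GB per-VM limit
    print(roles_exceeding_cap(64))
```

With a 512 GB host this reproduces the "2 VMs per bare-metal host" figure for masters. It also shows that the master and database suggestions do not fit under the prospect's usual 64 GB per-VM limit, so that limit would need to be raised for those roles.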



Re: Sizing for Master/Edges servers

Could you please provide some more details about the services that will be deployed and used? Hive? HBase? Spark?


Re: Sizing for Master/Edges servers

@Jonas Straub Initially only Hive, but in the future HBase, Solr, and Spark as well. The prospect does not have all the details yet, such as number of users, amount of data, etc. For now, overall guidelines and basic calculations that lead to a number of hosts will help a lot. Thanks.
