
Upgrading the cluster - OS, HDP, resources


Hi,

We have a development/test cluster with 4 nodes, each with 12 vCPU cores, a 2 TB SSD, and 32 GB RAM, previously used for Cassandra. All nodes run Ubuntu 14. It is configured as follows:

1 NN

1 SNN

4 DN

Jobs are taking longer due to limited resources, so we want to upgrade HDP from 2.5 to 2.6 and the OS to Ubuntu 16.04, and add more resources.

We are planning to upgrade as follows:

1. Increase RAM to 128 GB on all nodes

2. Double the storage by adding additional SAS disks

3. Add 2 more DataNodes with 10 cores and 128 GB RAM each

My questions are:

1. Is it good practice to use all of the nodes as DNs, or should we separate the NN?

2. Since the hardware was borrowed from Cassandra, is it OK to keep all nodes at the same specs, including the 2 new nodes?

3. I assume there is no harm in mixed storage, i.e. adding extra SAS storage alongside the existing SSDs?

4. With the proposed new cluster, upgrading to HDP 3.0 shouldn't be an issue, I guess?

Thanks in advance

MB

Accepted Solutions

Re: Upgrading the cluster - OS, HDP, resources

@MB My responses:

1. Yes, more DataNodes always help. Data is replicated across nodes according to the replication factor you choose; the default is 3. Running a DN and the NN on the same node is not a bad idea as long as the node has sufficient resources. Critical production clusters may use a dedicated NN so that I/O overhead from the DNs does not cause problems for the NN.

2. I'm not sure of the context behind Cassandra, but having nodes with different specs in a cluster should not be a problem. For better resource management, try creating config groups of nodes and allocating or isolating the better-spec nodes for components that are critical to your use case or that need more resources.

3. Research this further, but I don't think it should be a problem.

4. Should not be.
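
To put rough numbers on points 1 and 3, here is a minimal Python sketch. It is only an illustration under stated assumptions, not something the answer above prescribes. The first function estimates usable HDFS capacity as raw capacity divided by the replication factor, ignoring non-DFS reserved space and temporary data. The second builds a `dfs.datanode.data.dir` value using HDFS storage-type tags (`[SSD]`, `[DISK]`), which is one way HDFS can be told which directories sit on which kind of disk. The function names and the example paths are hypothetical; the node count and disk sizes come from the question, and the 2 new nodes are left out because their storage is not stated.

```python
def usable_capacity_tb(nodes, raw_tb_per_node, replication=3):
    """Rough usable HDFS capacity in TB: raw capacity / replication factor.

    Ignores non-DFS reserved space and temporary job data, so the real
    figure will be somewhat lower.
    """
    if replication < 1:
        raise ValueError("replication must be >= 1")
    return nodes * raw_tb_per_node / replication


def tagged_data_dirs(dirs_by_type):
    """Build a dfs.datanode.data.dir value with storage-type tags.

    dirs_by_type maps a storage type such as "SSD" or "DISK" to a list of
    local directories (hypothetical paths in the example below).
    """
    return ",".join(
        f"[{storage_type}]{path}"
        for storage_type, paths in dirs_by_type.items()
        for path in paths
    )


if __name__ == "__main__":
    # 4 existing nodes, 2 TB SSD doubled to 4 TB with SAS (from the question).
    print(round(usable_capacity_tb(4, 4.0), 2))
    # Hypothetical mount points for the SSD and the added SAS disk.
    print(tagged_data_dirs({"SSD": ["/data/ssd0/hdfs"], "DISK": ["/data/sas0/hdfs"]}))
```

With the default replication factor of 3, the 16 TB of raw storage on the 4 existing nodes works out to roughly 5.3 TB of usable space. If the directories are tagged by type, HDFS storage policies can then be applied per path, so it is worth checking the HDFS documentation for your HDP version before relying on this.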


Re: Upgrading the cluster - OS, HDP, resources


Thanks Gaurav, it was helpful.
