Posts: 26
Registered: 06-19-2014

Cluster creation for Cloudera CDH5 installation

I have a server machine with the following configuration.

1) Dual Quad Core Xeon

2) 24 GB Memory

3) 500 GB SATA

4) 256 * 2 RAID 1

The machine just arrived and we want to install CDH5 on it. We want to create a sandbox / dev cluster. I am looking for some expert advice on the following:

A) How many nodes can we create? We are targeting 4 nodes. Is that advisable?
B) I read that Cloudera Manager should reside on a node with a solid configuration. Based on our hardware, how many resources should be allocated to it?
C) We will install Ubuntu 12.04.
We are fairly new to this process. Any help would be greatly appreciated.

Thanks, Amit

Cloudera Employee
Posts: 62
Registered: 07-29-2013

Re: Cluster creation for Cloudera CDH5 installation

Hi Amit,
Welcome to Hadoop land! :-)

The answers really depend on how much data you are going to store and what
you are going to do with the data. In any case, my recommendation would be
to use Cloudera Manager to set up, install, manage and monitor CDH5.

This is a good starting point.

Also, Ubuntu 12.04 is well supported by Cloudera software so life is
looking good there as well.

I'd recommend running the CM server on one node, say node 1, and running
the NameNode on a different server, say node 2.

You can usually install a DataNode on node 1 as well; that should be OK.
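
The placement described above can be written down as a small Python sketch. The node names and role labels here are purely illustrative (they are not Cloudera Manager identifiers), and the check only confirms that the CM server and the NameNode land on different nodes.

```python
# Hypothetical role layout; node names and role labels are illustrative
# only and are not Cloudera Manager identifiers.
layout = {
    "node1": {"cm_server", "datanode"},
    "node2": {"namenode"},
}


def hosts_with(role, layout):
    """Return the sorted list of nodes that carry the given role."""
    return sorted(node for node, roles in layout.items() if role in roles)


def cm_and_namenode_separate(layout):
    """Return True when no node hosts both the CM server and the NameNode."""
    cm_hosts = set(hosts_with("cm_server", layout))
    nn_hosts = set(hosts_with("namenode", layout))
    return not (cm_hosts & nn_hosts)


print("CM server on:", hosts_with("cm_server", layout))
print("NameNode on:", hosts_with("namenode", layout))
print("CM and NameNode separate:", cm_and_namenode_separate(layout))
```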

Good luck and let us know how it goes!

Posts: 26
Registered: 06-19-2014

Re: Cluster creation for Cloudera CDH5 installation

Wow, great!

Thanks for your reply, mgrover.

We are building a POC model, so the data is not really huge. The data size is 400-500 MB.

We are building this as a dev environment. We have just one server, with the configuration mentioned above.

We have a couple of basic questions:

1) Ours is a dual-processor machine with a quad core on each processor, so 8 cores in total, and we are planning to create 4 nodes on this machine. As I understand it, we need to create the nodes separately as step one. Is this understanding correct, or can Cloudera Manager help create the nodes?

2) Also, I read that we should allocate more resources to the Cloudera Manager node. In our case we are planning for 6 GB of RAM. Is that OK?

3) The planned configuration is:

Node 1: CM ------------- 8 GB RAM

Node 2: NameNode ------- 6 GB RAM

Node 3: DataNode ------- 5 GB RAM

Node 4: DataNode ------- 5 GB RAM
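
A quick sanity check of this plan, as a minimal Python sketch. It assumes the four nodes would be virtual machines carved out of the one physical server; the thread does not say how the nodes will be created, so that is only an assumption. The figures come from the posts above, and the helper names are made up for illustration.

```python
# Figures taken from the thread; the assumption that all four nodes are
# virtual machines sharing this one physical host is ours, not the thread's.
HOST_RAM_GB = 24      # "24 GB Memory"
HOST_CORES = 2 * 4    # dual quad-core Xeon

planned_ram_gb = {
    "node1 (CM)": 8,
    "node2 (NameNode)": 6,
    "node3 (DataNode)": 5,
    "node4 (DataNode)": 5,
}


def ram_headroom_gb(host_ram_gb, plan):
    """RAM left on the host after every planned node gets its share."""
    return host_ram_gb - sum(plan.values())


def cores_per_node(host_cores, node_count):
    """Whole physical cores available per node with an even split."""
    return host_cores // node_count


print("Planned total (GB):", sum(planned_ram_gb.values()))
print("Headroom on host (GB):", ram_headroom_gb(HOST_RAM_GB, planned_ram_gb))
print("Cores per node:", cores_per_node(HOST_CORES, len(planned_ram_gb)))
```

The planned allocations add up to the full 24 GB, so under that assumption nothing would be left over for the host itself, and an even split gives 2 cores per node.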


Apart from the configuration mentioned above, are there any specific requirements for the hard drives? For example, should CM be on the RAID volume, and should the NameNode be on RAID or on SATA?


Thanks,