
Cloudera Headnode is too slow

New Contributor

Hi there,

I have a cluster of 8 machines:

head node: Intel core i5 with 8GB memory.

1 i5 machine with 2GB memory

6 Dual core machines with 2-4 GB memory.

I installed CDH5 with the Cloudera Management Services, plus HDFS, Hive, Oozie and YARN (MR2).

My problem is that the system is quite slow, and the memory of the head node is already full even though I haven't put any data into HDFS yet.

Any advice or suggestion on how to make it faster is very much appreciated!


Kind regards


Re: Cloudera Headnode is too slow

Those memory sizes are way too low. You can't run anything serious with such small memory sizes.


Anything under 4 GB is pretty useless, and even 4 GB might be barely usable for very simple tasks; anything serious will fail.


8 GB might be OK for the master host, since you aren't deploying all services. It's still on the low side though, and you'll probably see issues as your cluster activity grows.

Re: Cloudera Headnode is too slow

New Contributor
Thank you for the quick reply.
Do you think that working with plain Hadoop 2, without Cloudera, would be faster and wouldn't require this much memory?

Kind regards

Re: Cloudera Headnode is too slow

By "without cloudera" I assume you mean "without Cloudera Manager". There's no memory benefit to using upstream Hadoop versus Cloudera's Distribution of Hadoop. Running Cloudera Manager only really takes up resources on the hosts that run Cloudera Manager and the monitoring daemons, each of which takes up notable memory. On small clusters, this is by default the first (largest) host, which I would expect is your 8 GB host. For small clusters with low activity, I would expect that to function OK, but definitely not in a production cluster.


2 gigs on a host is just not enough to run your DataNode + TaskTracker (or NodeManager, under YARN) + MR job + operating system unless your MR job is tiny. It won't work for a production cluster.


So I don't expect you to see a notable speed benefit from running without Cloudera Manager. You should get larger machines if you want to run in production.

Re: Cloudera Headnode is too slow

New Contributor
Right, I'm really surprised to hear this, as I thought Hadoop was designed to run on small commodity machines where no huge amount of memory is required.

By the way, my knowledge of Hadoop is quite limited, and I'm doing an experiment to compare Couchbase's performance on my cluster with Hadoop's performance on the same cluster with the same specification. Do you think this is not feasible on the current hardware?
Or do you think I should compare Couchbase with the Hadoop product that serves the same purpose, such as HBase? I would still need to install HDFS for that, right?

Re: Cloudera Headnode is too slow

4 GB isn't really huge memory. The cheapest possible consumer desktop from Dell already has 4 GB or more of RAM. 1 TB is probably considered huge.


I would not compare Couchbase with MapReduce. MapReduce is more for batch processing, whereas Couchbase is a NoSQL database optimized for latency. MapReduce will never give you subsecond response times. HBase will be a more reasonable comparison with Couchbase. HBase requires HDFS.


I'm not a hardware or performance testing expert, so I can't really say what exactly you'd need to do your test, but I would strongly suspect that your test is not feasible on the current hardware. You have to at least run the HDFS + HBase daemons on your slave nodes, which take up 1 GB and 4 GB respectively by default (at least, those are the defaults Cloudera Manager uses). Leaving some for the OS, that's at least 6 gigs of RAM to run with default configurations. Performance tuning, depending on your workload and whatever the experts / books say, could change this further.
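
To make that arithmetic concrete, here is a rough sketch in Python that checks each class of machine from the original post against the 6 GB figure. The 1 GB and 4 GB daemon sizes are the defaults quoted above; the 1 GB OS reserve is an assumption picked to match the "at least 6 gigs" total, and the host names are hypothetical labels. It is a back-of-the-envelope check, not a sizing tool.

```python
# Rough per-host memory budget, using the figures quoted in the reply above.
HDFS_DAEMON_GB = 1   # HDFS daemon default quoted in the thread
HBASE_DAEMON_GB = 4  # HBase daemon default quoted in the thread
OS_RESERVE_GB = 1    # ASSUMPTION: headroom left for the operating system

REQUIRED_GB = HDFS_DAEMON_GB + HBASE_DAEMON_GB + OS_RESERVE_GB

# Hypothetical host labels; RAM sizes come from the original post.
# The six dual-core machines have 2-4 GB, so the best case (4 GB) is used.
hosts = {
    "headnode": 8,
    "i5-worker": 2,
    "dualcore-worker (best case)": 4,
}


def shortfall_gb(ram_gb, required_gb=REQUIRED_GB):
    """Return how many GB a host is short of the budget (0 if it fits)."""
    return max(0, required_gb - ram_gb)


for name, ram in hosts.items():
    short = shortfall_gb(ram)
    status = "fits" if short == 0 else f"short by {short} GB"
    print(f"{name}: {ram} GB RAM, needs {REQUIRED_GB} GB -> {status}")
```

Under these assumptions only the 8 GB head node clears the bar, and that host is also carrying Cloudera Manager and the monitoring daemons, so it has less free memory than the raw number suggests.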