New Contributor
Posts: 4
Registered: ‎07-21-2015
Accepted Solution

Minimum Recommended Configuration for Setting up a Sandbox Environment

Dear All,

 

I would like to know the minimum recommended configuration for setting up a sandbox / training environment for the Cloudera big data platform.

 

For example, assume I want to support about 500 active users whose experience ranges from novice to expert. What configuration would you recommend?

 

- How many NameNodes, and how should they be configured?

- How many DataNodes, and how should they be configured?

 

Thank You !

Cloudera Employee
Posts: 445
Registered: ‎08-11-2014

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

That's really a how-long-is-a-piece-of-string question, and is unanswerable like this. The number of users doesn't matter; it does matter how much data you are storing and what you're doing with it during training, and how many people are using it at once.

New Contributor
Posts: 4
Registered: ‎07-21-2015

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

Thanks for the quick reply. I understand that.

Let me rephrase my question. Assuming I want to allocate a quota of about 100 MB per user and support about 200 concurrent users, what would you suggest?

Maybe this is a very basic question; sorry for that. I just want to set things up with the recommended configuration.

Thank You !
Cloudera Employee
Posts: 445
Registered: ‎08-11-2014

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

100 MB is tiny; across 200 users that's a total of 20 GB of storage. A single drive these days holds 50 times as much.

 

Do you mean 100 GB? Then you need 20 TB of storage. That fits on a couple of machines.

 

It's still really the workload that drives your requirements though, so I don't know, but based on storage alone (assuming you meant GB) this sounds like a cluster of 2-3 big commodity machines.
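The storage arithmetic above can be sketched in a few lines. This is only a rough illustration, not a sizing tool: the function names are hypothetical, decimal units are used to match the figures quoted here, and the replication factor of 3 is an assumption (it is the HDFS default for `dfs.replication`, but your cluster may be configured differently). The 20 GB and 20 TB figures above are logical totals before replication.

```python
MB_PER_GB = 1000  # decimal units, matching 200 x 100 MB = 20 GB
MB_PER_TB = 1000 * MB_PER_GB


def logical_storage_mb(users: int, quota_mb: int) -> int:
    """Total user data in MB, before any HDFS replication."""
    return users * quota_mb


def raw_storage_mb(users: int, quota_mb: int, replication: int = 3) -> int:
    """Raw disk in MB once every block is stored `replication` times.

    replication=3 is an assumption (the HDFS default), not a figure
    taken from this thread.
    """
    return logical_storage_mb(users, quota_mb) * replication


if __name__ == "__main__":
    # Case 1: 200 users at 100 MB each.
    small = logical_storage_mb(200, 100)
    print(f"100 MB quota: {small / MB_PER_GB:g} GB logical")

    # Case 2: 200 users at 100 GB each.
    large = logical_storage_mb(200, 100 * MB_PER_GB)
    print(f"100 GB quota: {large / MB_PER_TB:g} TB logical")
    print(f"100 GB quota: {raw_storage_mb(200, 100 * MB_PER_GB) / MB_PER_TB:g} TB raw at 3x replication")
```

Storage is only one input. As noted above, the workload still drives the real requirements, so treat the result as a lower bound on disk rather than a cluster design.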

New Contributor
Posts: 4
Registered: ‎07-21-2015

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

Thanks Again.

 

I have one more question.

 

How much RAM will be consumed on a master (NameNode) host when all of the services below are running?

 

  1. HBase
  2. HDFS
  3. Hive
  4. Solr
  5. Spark
  6. Key-Value Store Indexer
  7. ZooKeeper
  8. Oozie
  9. Impala
  10. YARN
  11. Sqoop 2

Assume there are absolutely no operations running.

 

 

 

Cloudera Employee
Posts: 445
Registered: ‎08-11-2014

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

Again, it depends on what you are doing with them. The daemons might consume a few gigabytes as a baseline, each; of course, they can consume hundreds of gigabytes if you're running workloads that need a bunch of memory.
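To put a rough number on "a few gigabytes each", here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the per-daemon figures of 1, 2 and 4 GB are placeholders rather than measured values, the function name is hypothetical, and it pretends each of the eleven services listed in the question runs exactly one daemon on the master, which a real deployment will not match exactly.

```python
# Services from the question; one daemon per service is a simplifying assumption.
SERVICES = [
    "HBase", "HDFS", "Hive", "Solr", "Spark", "Key-Value Store Indexer",
    "ZooKeeper", "Oozie", "Impala", "YARN", "Sqoop 2",
]


def idle_baseline_gb(services: list[str], per_daemon_gb: float) -> float:
    """Idle memory estimate: number of daemons times an assumed baseline each."""
    return len(services) * per_daemon_gb


if __name__ == "__main__":
    # Placeholder baselines, not measurements.
    for per_daemon in (1, 2, 4):
        total = idle_baseline_gb(SERVICES, per_daemon)
        print(f"{per_daemon} GB per daemon -> about {total:g} GB idle on the master")
```

Under these assumptions an idle master lands somewhere in the tens of gigabytes. Real workloads can push that far higher, so measure on your own cluster before sizing hardware.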

New Contributor
Posts: 4
Registered: ‎07-21-2015

Re: Minimum Recommended Configuration for Setting up a Sandbox Environment

Thank You !
