
Single cluster Cloudera Manager for dev, test and prod environments

Explorer

Hi,

 

This might seem like a silly question, but I want to make sure we are not missing any important points before making a decision about our cluster setup.

 

This was proposed in our organization: a single Cloudera Manager for dev, test and prod environments on a single cluster.

I know it's not the correct way of doing it, and we think we should at least have the prod environment on a separate cluster.

 

Other than the disadvantages listed below, what others are we missing?

 

Resource sharing:

  • All 3 environments (dev, test & prod) will share the same hardware resources
  • Production jobs may not get the resources they require because of high resource utilization by the dev and/or test environments.
  • We cannot control resource utilization per environment, or can we?

Services sharing:

  • All 3 environments will be connected through the single Cloudera Manager (not sure if we can deploy 3 Cloudera Managers on a single cluster, 1 for each environment)
  • Any service failure caused by dev/test issues will cascade to production, and vice versa

Maintenance:

  • Maintenance scheduling will be a pain, as we will always need to coordinate between 3 groups (dev, test & prod)
  • Backup and DR will be complex, as they need to be set up very precisely on selected folders (though Cloudera is doing the setup, future maintenance will be complex)

Security:

  • We are not sure how Kerberos will behave in an environment co-location approach; we need advice!

Re: Single cluster Cloudera Manager for dev, test and prod environments

Master Guru
> Other than the disadvantages listed below, what others are we missing?

My own recommendation: do not do this; it never ends well in the long term. Separate the hardware into two or three clusters instead, so that you get an actually reliable production environment. Many of your points are already well thought out, so I'll just focus on answering some specific questions.

> We cannot control resource utilization per environment, or can we?

YARN's Dynamic Resource Management and Impala's Admission Control allow you to define queues and specify SLA-styled resource constraints on them, so this would give you some control over prioritising one set of queues (prod queues) over other in-use queues. FairScheduler supports preemption, so it can kill containers of a hogging application in order to give a prod queue its full guaranteed minimum resources.
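As a rough illustration of the queue setup described above, here is a minimal Python sketch that generates a FairScheduler allocation file with one queue per environment. The queue names, sizes, weights and timeout are hypothetical values picked only for the example, not recommendations. On a Cloudera Manager cluster you would normally define these pools through the Dynamic Resource Pools configuration rather than by hand, and preemption also has to be enabled on the scheduler (`yarn.scheduler.fair.preemption`) for the timeout to have any effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical queues and values, for illustration only.
# "prod" gets a guaranteed minimum and may preempt after 60 seconds below it;
# "dev" is capped so it can never take more than a fixed slice of the cluster.
QUEUES = {
    "prod": {
        "minResources": "65536 mb,32 vcores",
        "weight": "3.0",
        "minSharePreemptionTimeout": "60",
    },
    "test": {"weight": "1.0"},
    "dev": {"weight": "1.0", "maxResources": "32768 mb,16 vcores"},
}


def build_allocations(queues):
    """Return a FairScheduler allocation file as an XML string."""
    root = ET.Element("allocations")
    for name, settings in queues.items():
        queue = ET.SubElement(root, "queue", name=name)
        for key, value in settings.items():
            # Each setting becomes a child element, e.g. <weight>3.0</weight>
            ET.SubElement(queue, key).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_allocations(QUEUES))
```

Note that these are still scheduler-level limits: they decide which containers get launched, not how much CPU a launched container actually burns.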

However, unless cgroups are also carefully planned for and tuned, most of the resource management is applied only at the logical (cluster-scheduling) level. This means that if someone were to run a rogue task under Dev that abuses CPU resources, nothing would act against it, because no hard limits are applied at that level. Such mistakenly run applications can impact your other, critical applications.

Read more at: Multi-tenancy in CDH clusters: https://www.cloudera.com/documentation/enterprise/latest/topics/admin_howto_multitenancy.html

> We are not sure how Kerberos will behave in an environment co-location approach; we need advice!

A single CM instance can currently only manage one Kerberos realm for all clusters and the services running under it. All of your clients would therefore authenticate via the same realm, which makes it hard to distinguish between environments and to control access to each one separately.

Re: Single cluster Cloudera Manager for dev, test and prod environments

Explorer
Thanks Harsh, this helps. Sorry, I was on vacation so couldn't reply earlier.