Best Practice for Dev, QA and Production for a Hadoop Cluster

Since Hadoop is not typical enterprise software, we are having trouble getting the QA team to understand how it fits into our application landscape. They would like us to have three separate environments for Dev, QA and Production. Do you typically see this, or do you have any best-practice documentation that we could provide to them?

1 ACCEPTED SOLUTION

Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

@Ancil McBarnett I see this pattern in all my customers. Dev tends to be small. Sometimes dev consists only of sandbox instances, and it is almost always a virtual environment. Test mimics prod in all configuration aspects but tends to have about 30%-50% of prod capacity.

Upgrades, configuration changes, patching, and tech previews all occur in the test environment prior to any production rollout. In the end, Hadoop isn't much different from other platforms as far as this is concerned.
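
To make the 30%-50% guideline concrete, here is a minimal sizing sketch. It assumes "capacity" is measured in worker-node count, which the answer does not specify (it could equally mean storage or cores), and the function name `size_test_cluster` is hypothetical.

```python
def size_test_cluster(prod_nodes, low_pct=30, high_pct=50):
    """Return (min_nodes, max_nodes) for a test cluster sized at
    low_pct..high_pct percent of production.

    Assumption: capacity is approximated by node count.
    Integer arithmetic is used so results round up without float error.
    """
    if prod_nodes < 1:
        raise ValueError("prod_nodes must be at least 1")
    if not 0 < low_pct <= high_pct <= 100:
        raise ValueError("expected 0 < low_pct <= high_pct <= 100")
    # Ceiling division: round partial nodes up to a whole node.
    min_nodes = (prod_nodes * low_pct + 99) // 100
    max_nodes = (prod_nodes * high_pct + 99) // 100
    return min_nodes, max_nodes


if __name__ == "__main__":
    # A hypothetical 40-node production cluster.
    print(size_test_cluster(40))  # (12, 20)
```

Treat the result as a starting range only; the test cluster still has to match prod in configuration, as noted above.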


6 REPLIES
Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

@Ancil McBarnett

Yes to three environments.

Dev and QA do not need to be as big as prod.

DR is required too, and the DR cluster can also be used for reporting.

Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

You have to decide how many clusters you need for the tasks below, which apply to Hadoop applications the same way as they apply to typical enterprise software:
  1. Test upgrade procedures for new versions of existing components
  2. Execute performance tests of custom-built applications
  3. Allow end-users to perform user acceptance testing
  4. Execute integration tests where custom-built applications communicate with third-party software
  5. Experiment with new software that is beta quality and may not be ready for usage at all
  6. Execute security penetration tests (typically done by an external company)
  7. Let application developers modify configuration parameters and restart services on short notice
  8. Maintain a mirror image of production environment to be activated in case of natural disaster or unforeseen events
  9. Execute regression tests that compare the outputs of new application code with existing code running in production

I believe DEV -> QA -> PROD is a minimum, and I have seen larger organizations deploy LAB -> DEV -> QA -> PROD -> DR as separate clusters.
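
One way to work through that decision is to write down which cluster each task would run on and see which clusters end up with work. The sketch below is only an illustration: the task-to-cluster assignment is a hypothetical example, not something stated in this thread, and each organization would fill it in differently.

```python
from collections import defaultdict

# Cluster names in the order given in the post above.
PIPELINE = ["LAB", "DEV", "QA", "PROD", "DR"]

# Hypothetical assignment of the nine tasks to clusters (an assumption).
TASK_TO_CLUSTER = {
    "upgrade testing": "QA",
    "performance testing": "QA",
    "user acceptance testing": "QA",
    "integration testing": "QA",
    "beta software experiments": "LAB",
    "penetration testing": "QA",
    "developer config changes and restarts": "DEV",
    "disaster recovery mirror": "DR",
    "regression testing against production output": "QA",
}


def tasks_by_cluster(mapping):
    """Group tasks by cluster, returning clusters in pipeline order."""
    grouped = defaultdict(list)
    for task, cluster in mapping.items():
        if cluster not in PIPELINE:
            raise ValueError(f"unknown cluster: {cluster}")
        grouped[cluster].append(task)
    # Only clusters that actually received a task are returned.
    return {c: grouped[c] for c in PIPELINE if c in grouped}


if __name__ == "__main__":
    for cluster, tasks in tasks_by_cluster(TASK_TO_CLUSTER).items():
        print(f"{cluster}: {len(tasks)} task(s)")
```

PROD does not appear in the output because none of the listed tasks is assigned to it here; it runs the production workload itself. Reassigning the LAB task to DEV and dropping the DR mirror would collapse this example to the DEV -> QA -> PROD minimum.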

Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

@Ancil McBarnett, please accept the best answer.

Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

@Neeraj Sabharwal

Could you elaborate further on how the DR cluster can be used for reporting?

Many thanks

Re: Best Practice for Dev, QA and Production for a Hadoop Cluster

Having four environments (development, testing, pre-production/staging, and production) is good practice in a large company, because staging lets us confirm that everything works properly before it reaches production. The dev, testing, and staging environments are, of course, smaller than the planned production environment. For instance, with 2 nodes each in dev, testing, and staging, production might have about 8 nodes, though the right number always depends on replication, traffic, and other relevant factors. Thanks!
