What is the production environment in Hadoop? Is it just a deployed cluster? Also, what tasks do developers perform in the production and development environments?

Contributor

Hi all,

I'm a little confused about the development and production environments of the Hadoop ecosystem. Can anyone briefly describe the tasks and workflows involved in each environment?

Thanks,

1 ACCEPTED SOLUTION

Master Mentor

@Rakesh AN

A development environment is basically where developers test their application code, implement new features, and validate that everything works well.

As with any other infrastructure project, users are always advised to build multiple environments. This is a general best practice, and it is especially important given the nature of Hadoop: each project within the Apache ecosystem is constantly changing, so having non-production environments (LAB --> DEV --> Test --> Staging --> Prod) in which to test upgrades and new functionality is vital.

Production environments require more monitoring and notification capabilities, more capable hardware, and more resources than development environments, because production deals with real-time data and should ideally have zero downtime. Planning therefore matters much more for production, including security features such as Kerberos, LDAP, and SSL setup.
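As a rough illustration of the two points above, here is a minimal Python sketch of the promotion order and of a production readiness check. The names `STAGES`, `PROD_REQUIREMENTS`, `next_stage`, and `production_gaps` are hypothetical and not part of any Hadoop tool; the checklist entries are simply the items named in this answer, so adapt them to your own cluster.

```python
# Promotion order taken from the answer above: LAB -> DEV -> Test -> Staging -> Prod.
STAGES = ["LAB", "DEV", "Test", "Staging", "Prod"]

# Hypothetical checklist built from the production needs named in the answer.
PROD_REQUIREMENTS = ("monitoring", "alerting", "kerberos", "ldap", "ssl")


def next_stage(stage):
    """Return the stage a change is promoted to next, or None after Prod."""
    i = STAGES.index(stage)
    return STAGES[i + 1] if i + 1 < len(STAGES) else None


def production_gaps(enabled):
    """List the production requirements missing from a cluster's enabled features."""
    return [req for req in PROD_REQUIREMENTS if req not in enabled]


if __name__ == "__main__":
    dev_cluster = {"monitoring"}  # illustrative feature set for a dev cluster
    print(next_stage("DEV"))             # Test
    print(production_gaps(dev_cluster))  # ['alerting', 'kerberos', 'ldap', 'ssl']
```

A dev cluster can reasonably run with most of these gaps open; the point of the check is that nothing should reach Prod until the list is empty.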

Please refer to the following link to learn more about the infrastructure setup: https://hortonworks.com/blog/apache-hadoop-infrastructure-considerations-and-best-practices/

HCC threads: https://community.hortonworks.com/questions/10664/best-practice-for-dev-qa-and-production-for-a-hado...

Cluster Planning: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_cluster-planning/content/ch_hardware-rec...

2 REPLIES

Contributor

@Jay Kumar SenSharma

What should the cluster size be in the development, test, and production environments if I want to process 10 TB of data daily? Also, how can this be managed with Oozie on a daily basis?