Created 12-06-2017 06:30 AM
Hi all,
I'm a little confused about the development and production environments in the Hadoop ecosystem. Can anyone briefly describe the tasks/workflows involved in each environment?
Thanks,
Created 12-06-2017 06:46 AM
A development environment is basically where developers test their application code, implement new features, and validate that everything works well.
As with any other infrastructure project, users are always advised to build multiple environments. This is not only a general best practice; it is also important because of the nature of Hadoop. Each project within the Apache ecosystem is constantly changing, so having non-production environments (Lab --> Dev --> Test --> Staging --> Prod) in which to test upgrades and new functionality is vital.
Production environments require more monitoring/notification capabilities, more capable hardware, and more resources than development environments, because production deals with real-time data and ideally has zero downtime. Planning is therefore much more important for production, including security features such as Kerberos, LDAP, and SSL setup.
Please refer to the following link to learn more about infrastructure setup: https://hortonworks.com/blog/apache-hadoop-infrastructure-considerations-and-best-practices/
HCC threads: https://community.hortonworks.com/questions/10664/best-practice-for-dev-qa-and-production-for-a-hado...
Cluster Planning: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_cluster-planning/content/ch_hardware-rec...
Created 12-11-2017 10:13 AM
What should the cluster size be in the Development, Test, and Production environments if I want to process 10 TB of data daily? Also, how can this be managed with Oozie on a daily basis?