Created 10-23-2016 08:04 AM
Hi all,
I am a newbie to HDP and Cloudbreak. I want to move some of our on-site Hadoop clusters/jobs to AWS. The two solutions I have come across are Cloudbreak and EMR, but I am not sure which one to use.
I wanted to know which technology to use for launching Hadoop jobs on AWS. Pros and cons of either approach would be really helpful (in terms of cost, ease of use, monitoring, metrics, latency, etc.). One cost optimization feature that I am particularly interested in is launching the cluster whenever one or more jobs need to run, and killing the cluster/nodes when there are no more jobs to execute.
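On the EMR side, the launch-on-demand pattern described above could be sketched roughly as follows. This is only a minimal sketch: it assumes boto3 is installed and AWS credentials are configured, and the release label, instance types, region, and the job arguments are placeholder values, not recommendations. The helper names `build_transient_cluster_request` and `launch` are hypothetical. The key setting is `KeepJobFlowAliveWhenNoSteps`, which, when false, asks EMR to terminate the cluster once all submitted steps have finished.

```python
def build_transient_cluster_request(name, step_args, instance_count=3):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All concrete values below (release label, instance types, roles)
    are illustrative assumptions; adjust them for a real account.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.0.0",  # assumed release, pick your own
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "MasterInstanceType": "m4.large",  # assumed instance type
            "SlaveInstanceType": "m4.large",   # assumed instance type
            "InstanceCount": instance_count,
            # False: the cluster shuts down when no steps are left to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "example-step",
                # Tear the cluster down if the step fails.
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": list(step_args),
                },
            }
        ],
        # Default EMR role names; they must already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request, region="us-east-1"):
    """Submit the request to EMR and return the new cluster (job flow) id."""
    import boto3  # imported lazily so the request can be built without boto3

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]
```

For example, `launch(build_transient_cluster_request("nightly-job", ["hadoop", "jar", "s3://example-bucket/job.jar"]))` would start a cluster, run the single step, and let EMR terminate the cluster afterwards; the S3 path here is hypothetical.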
Thanks
Obaid
Created 10-24-2016 08:19 PM
Hi @Obaid Salikeen, you may also consider using Hortonworks Data Cloud (currently in technical preview). See http://hortonworks.github.io/hdp-aws/.
Created 10-24-2016 11:22 PM
Thanks @Dominika B,
Thanks for sharing the link, seems interesting.
So I have a very basic question: Amazon EMR lets you launch and manage Hadoop and Spark clusters, so what would be the benefit of using Hortonworks Data Cloud vs. just using EMR?
Thanks
Obaid
Created 10-25-2016 09:23 PM
Hi @Obaid Salikeen,
Pros:
Cons:
Disclaimer: I am an engineer working on Cloudbreak
Attila
Created 10-29-2016 06:20 PM
Thanks a lot @Attila Kanto for a detailed response,
Let me ask another cost-related question, since cost is an important factor in deciding which technology to use: how would you compare EMR vs. Cloudbreak (or Hortonworks Data Cloud) in terms of cost?
Obaid
Created 10-29-2016 07:55 PM
Sorry, but I do not have such comparison.
Attila
Created 11-03-2016 01:35 PM
Sure, no problem.