Created on 03-03-2016 04:48 AM - edited 09-16-2022 03:06 AM
I want to set up an elastic cluster on AWS EC2 and install HDP on it. How can I do it? What options are available?
I don't want to use AWS EMR. Is it possible to automatically bring DataNodes with the HDP stack installed on them up and down?
Any suggestions would be great.
Created 03-03-2016 04:52 AM
On the Hortonworks platform we have Cloudbreak. It is open source:
http://sequenceiq.com/cloudbreak-docs/latest/
You can use it to launch clusters on Amazon, Azure, and Google Cloud.
It needs a host to install the Cloudbreak software and then it will spin up the nodes for you.
One thing you have to understand: if you have data in HDFS, it is not easy to bring down nodes. HDFS has to move the blocks off the node being removed (re-replication), which takes time.
An elastic cluster works well when you put detached storage, such as blob storage, behind it.
Note that scaling up is not an issue; it is scaling down where you will have a rough time :).
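To put a rough number on why scaling down hurts: the blocks stored on a DataNode have to be copied to other nodes before that node can go away. Here is a back-of-the-envelope sketch in Python. The function name and the figures (2 TB on the node, 4 remaining nodes, 50 MB/s of copy throughput each) are my own assumptions for illustration, not measurements from a real cluster, and actual throughput depends on your HDFS settings and network.

```python
def estimate_drain_hours(data_on_node_tb: float,
                         remaining_nodes: int,
                         per_node_mb_per_s: float) -> float:
    """Rough hours needed to copy one DataNode's blocks to the remaining nodes.

    Hypothetical helper: assumes the copy work is spread evenly over the
    remaining nodes and that each sustains per_node_mb_per_s.
    """
    if remaining_nodes < 1 or per_node_mb_per_s <= 0:
        raise ValueError("need at least one remaining node and a positive rate")
    total_mb = data_on_node_tb * 1024 * 1024        # TB -> MB
    aggregate_mb_per_s = remaining_nodes * per_node_mb_per_s
    return total_mb / aggregate_mb_per_s / 3600     # seconds -> hours


if __name__ == "__main__":
    # Assumed figures: 2 TB on the node, 4 nodes left, 50 MB/s each.
    hours = estimate_drain_hours(2.0, 4, 50.0)
    print(f"about {hours:.1f} hours to drain the node")  # about 2.9 hours
```

If the elastic nodes hold little or no HDFS data (for example, because the data lives in detached storage), the first argument is close to zero and removing a node becomes cheap.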
Created 03-03-2016 05:12 AM
Thanks @Shivaji. If I use S3 for storage, then it should be fine, right? In my use case the data won't stay on HDFS for long; it will be there just for processing. We also plan to have HBase in the same cluster, on a more static set of nodes. So, say, we would have an HBase cluster of 5 nodes running HDFS/YARN/HBase, and only HDFS and YARN on the elastic nodes. Would that be possible?
Also, is there a doc, tutorial, or URL I can refer to for setting up an elastic cluster using Cloudbreak and HDP?
Created 03-03-2016 06:42 AM
http://hortonworks.com/hadoop/cloudbreak/ - Check this video out.
If you use S3 you should be fine, but you will not get stellar performance: it will be slower than HDFS on local storage.
If you like the answer, you should hit "Accept" and give a vote :).