Scalable HDP cluster on AWS
Labels: Hortonworks Data Platform (HDP)
Created on ‎03-03-2016 04:48 AM - edited ‎09-16-2022 03:06 AM
I want to set up an elastic cluster on AWS EC2 and install HDP on it. How can I do this, and what options are available?
I don't want to use AWS EMR. Is it possible to automatically bring DataNodes with the HDP stack installed up and down?
Any suggestions would be great.
Created ‎03-03-2016 04:52 AM
The Hortonworks platform includes Cloudbreak, which is open source:
http://sequenceiq.com/cloudbreak-docs/latest/
You can use it to launch clusters on Amazon, Azure, and Google Cloud.
It needs a host on which to install the Cloudbreak software; it will then spin up the nodes for you.
One thing you have to understand: if you have data in HDFS, it is not easy to bring nodes down. HDFS has to re-replicate the blocks stored on the removed nodes, which takes time.
An elastic cluster works well when you put detached storage, such as Blob storage, behind it.
Note that scaling up is not an issue; it is scaling down that will give you a rough time :).
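To get a feel for why scaling down is the slow direction, here is a rough back-of-the-envelope sketch in Python. Before a DataNode can be removed safely, the blocks it holds have to be copied to other nodes. The function name, the input figures, and the assumption that the copy work spreads evenly across the remaining nodes are all illustrative; they are not measured HDP values.

```python
def estimate_decommission_hours(data_on_node_gb, remaining_nodes, per_node_mb_per_s):
    """Rough time to copy one DataNode's blocks elsewhere before removal.

    All inputs are illustrative assumptions, not measured HDP values.
    """
    if remaining_nodes < 1 or per_node_mb_per_s <= 0:
        raise ValueError("need at least one remaining node and positive bandwidth")
    # Simplifying assumption: the copy work spreads evenly across the
    # remaining DataNodes, so the aggregate rate scales with their count.
    aggregate_mb_per_s = remaining_nodes * per_node_mb_per_s
    seconds = (data_on_node_gb * 1024) / aggregate_mb_per_s
    return seconds / 3600


# Hypothetical example: 2 TB on the node being removed, 5 nodes remain,
# each sparing about 10 MB/s for the copy traffic.
hours = estimate_decommission_hours(2048, 5, 10)
print(f"approx. {hours:.1f} hours")  # approx. 11.7 hours
```

With storage detached from the compute nodes there is nothing to copy off a node before it is shut down, which is why that setup suits an elastic cluster better.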
Created ‎03-03-2016 05:12 AM
Thanks @Shivaji. If I use S3 for storage, then it should be fine, right? In my use case the data won't stay on HDFS for long; it will be there only for processing. We also plan to run HBase in the same cluster, on a more static set of nodes. So, say, we would have a base cluster of 5 nodes running HDFS/YARN/HBase, and only HDFS and YARN on the elastic nodes. Would that be possible?
Also, is there a doc, tutorial, or URL I can refer to for setting up an elastic cluster using Cloudbreak and HDP?
Created ‎03-03-2016 06:42 AM
Check out the video at http://hortonworks.com/hadoop/cloudbreak/
If you use S3 you should be fine, but you will not get stellar performance: it will be slower than HDFS on local storage.
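For the layout asked about above (a static group running HDFS, YARN and HBase, plus an elastic group with only HDFS and YARN), the worker host groups could be sketched roughly as follows. This assumes the cluster is described with an Ambari-style blueprint JSON document; the blueprint name, group names, node counts, and stack version are hypothetical, and the master services (NameNode, ResourceManager, HBase Master, ZooKeeper) are left out for brevity, so this is not a complete or validated blueprint.

```python
import json


def host_group(name, components, cardinality):
    """Build one blueprint-style host group entry."""
    return {
        "name": name,
        "components": [{"name": c} for c in components],
        "cardinality": str(cardinality),
    }


# Hypothetical names and sizes; master components omitted for brevity.
blueprint = {
    "Blueprints": {
        "blueprint_name": "hdp-elastic-sketch",
        "stack_name": "HDP",
        "stack_version": "2.4",
    },
    "host_groups": [
        # Static nodes: storage, compute, and HBase region servers.
        host_group("static", ["DATANODE", "NODEMANAGER", "HBASE_REGIONSERVER"], 5),
        # Elastic nodes: HDFS and YARN only, as described in the question.
        host_group("elastic", ["DATANODE", "NODEMANAGER"], 1),
    ],
}

print(json.dumps(blueprint, indent=2))
```

Keeping DATANODE in the elastic group follows the question as asked. Dropping it would leave compute-only nodes that hold no HDFS blocks, so nothing has to be copied off them before they are removed.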
If you like the answer, you should hit "Accept" and give a vote :).
