Install CDH5 on EC2 without human interaction
Created on ‎09-01-2014 02:40 AM - edited ‎09-16-2022 02:06 AM
Hi all!
At the company where I work, we're currently using a 4-node Amazon EMR cluster together with S3 for all our data warehousing and analysis needs. To save costs, the cluster gets spun up each morning and torn down each evening automatically by a cron job running on another server.
We're using Impala extensively. Our data is copied each morning from S3 to HDFS after the cluster has been spun up.
I was looking at installing Hue to provide a nice interface for querying Impala. Then it occurred to me that it would probably be easier to move from EMR to EC2 and install CDH5 on there. Ideally we would use Cloudera Manager for monitoring the cluster while it's running.
The problem: is there a way to install CDH5, including Cloudera Manager, automatically on an EC2 cluster, without human interaction?
Created ‎09-02-2014 03:13 AM
I have to check on that. Meanwhile another option is to create the cluster
manually and save the master and worker node images as custom AMIs. Use
those AMIs every morning to create a new cluster, then tear it down. When
you want to update CDH, just do it once manually and save new AMIs
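
For readers who want to script the AMI approach, here is a minimal sketch of what the morning and evening cron jobs might call. It is not from the thread: the `cluster` tag, the `CLUSTER_TAG` value, and the function names are hypothetical, and the `ec2` argument is assumed to be a boto3 EC2 client (`boto3.client("ec2")`). Details such as security groups, key pairs, changed hostnames, and starting the CDH services after boot are left out.

```python
from typing import Any, List

# Hypothetical tag value used to find the cluster's instances again at teardown.
CLUSTER_TAG = "cdh5-nightly"


def launch_cluster(ec2: Any, master_ami: str, worker_ami: str,
                   worker_count: int, instance_type: str) -> List[str]:
    """Start one master and `worker_count` workers from pre-built AMIs.

    `ec2` is assumed to be a boto3 EC2 client; the AMI IDs are the custom
    images saved after the one-time manual CDH install.
    """
    tags = [{"ResourceType": "instance",
             "Tags": [{"Key": "cluster", "Value": CLUSTER_TAG}]}]
    instance_ids: List[str] = []
    for ami, count in ((master_ami, 1), (worker_ami, worker_count)):
        if count < 1:
            continue  # run_instances requires at least one instance
        resp = ec2.run_instances(ImageId=ami, InstanceType=instance_type,
                                 MinCount=count, MaxCount=count,
                                 TagSpecifications=tags)
        instance_ids += [inst["InstanceId"] for inst in resp["Instances"]]
    return instance_ids


def teardown_cluster(ec2: Any) -> List[str]:
    """Terminate every pending or running instance carrying the cluster tag."""
    filters = [{"Name": "tag:cluster", "Values": [CLUSTER_TAG]},
               {"Name": "instance-state-name", "Values": ["pending", "running"]}]
    # A single describe_instances call is enough for a small cluster;
    # larger fleets would need pagination.
    resp = ec2.describe_instances(Filters=filters)
    ids = [inst["InstanceId"]
           for reservation in resp["Reservations"]
           for inst in reservation["Instances"]]
    if ids:
        ec2.terminate_instances(InstanceIds=ids)
    return ids
```

The morning job would call `launch_cluster(boto3.client("ec2"), ...)` with the saved AMI IDs, and the evening job would call `teardown_cluster(boto3.client("ec2"))`. Passing the client in as an argument keeps the functions easy to try out against a stub before pointing them at a real account.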
Gautam Gopalakrishnan
Created ‎09-02-2014 03:22 AM
@GautamG wrote:
I have to check on that. Meanwhile another option is to create the cluster
manually and save the master and worker node images as custom AMIs. Use
those AMIs every morning to create a new cluster, then tear it down. When
you want to update CDH, just do it once manually and save new AMIs
Hmmm, that is actually a great idea! It certainly is the least-effort solution so far 🙂 I will look into that this afternoon.
Meanwhile, with regard to Whirr, it seems the documentation I pointed to is outdated. If you look at the sample Whirr config in the whirr-cm repo, it supports YARN roles: https://raw.githubusercontent.com/cloudera/whirr-cm/master/cm-ec2.properties
Created ‎09-02-2014 03:24 AM
Gautam Gopalakrishnan
Created ‎09-02-2014 03:25 AM
Thank you for your help!
Created ‎10-14-2014 06:55 PM
You might be pleased to know that a new product called Cloudera Director has been released that lets you quickly spin up clusters on cloud platforms like AWS. Amazon also supports the creation of EDH clusters through AWS Quick Start, which uses Cloudera Director in the background. You can read more at the following links:
http://aws.amazon.com/quickstart/
http://www.cloudera.com/content/cloudera/en/products-and-services/director.html
Gautam Gopalakrishnan
Created ‎10-15-2014 03:13 AM
I noticed! Very cool development!
