Created on 11-16-2016 12:46 AM - edited 09-16-2022 01:36 AM
Hortonworks Data Cloud for AWS (HDCloud for AWS) allows you to create on-demand ephemeral Hadoop clusters on AWS.
In this tutorial, we will set up Hortonworks Data Cloud on AWS 1.16 (released in June 2017), including:
This tutorial assumes no prior experience with AWS. Still, if you run into any issues, refer to the Troubleshooting documentation.
1. Set up an AWS account: In order to launch HDCloud on AWS, you need to have an AWS account. You can set one up at https://aws.amazon.com/. Creating an AWS account is free, but you need to add a credit card that will be charged once you start running AWS services. Alternatively, you may want to contact your IT department to find out if your company has an account to which you can be added.
2. Select an AWS region: Next, decide in which region you would like to launch the cloud controller and clusters. The following regions are supported:
You may want to pick the region that is nearest to your location - unless you have other constraints. For example, if you have data in Amazon S3 that you will later want to access from your cluster, you will want your clusters to be in the same region as your Amazon S3 data.
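If you are unsure which region an existing S3 bucket lives in, the check can be scripted. The following is a minimal sketch, assuming you pass in a boto3 S3 client (or any object with the same `get_bucket_location` method); the helper names `bucket_region` and `same_region` are made up for this illustration.

```python
def bucket_region(s3_client, bucket_name):
    """Return the region name that hosts an S3 bucket.

    GetBucketLocation reports us-east-1 buckets with a LocationConstraint
    of None, and some older eu-west-1 buckets as "EU", so both cases are
    normalized to a regular region name here.
    """
    response = s3_client.get_bucket_location(Bucket=bucket_name)
    constraint = response.get("LocationConstraint")
    if constraint is None:
        return "us-east-1"
    if constraint == "EU":
        return "eu-west-1"
    return constraint


def same_region(s3_client, bucket_name, cluster_region):
    """Return True if the bucket is in the region chosen for the cluster."""
    return bucket_region(s3_client, bucket_name) == cluster_region
```

With boto3 installed and AWS credentials configured, a call could look like `same_region(boto3.client("s3"), "my-data-bucket", "us-west-2")`, where the bucket name is a placeholder.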
3. Create an SSH key pair: Once you’ve decided which region to use, you need to create an SSH key pair in that region. To do that:
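If you would rather script this step than use the EC2 console, the sketch below shows one possible approach. It assumes a boto3 EC2 client created for the region you selected is passed in; the function name `create_key_pair_file` and the `.pem` naming are illustrative choices, not part of HDCloud.

```python
import os


def create_key_pair_file(ec2_client, key_name, directory):
    """Create an EC2 key pair and save the private key as <key_name>.pem."""
    path = os.path.join(directory, key_name + ".pem")
    # Check first so that no key pair is created when the file cannot be saved.
    if os.path.exists(path):
        raise FileExistsError(path)
    response = ec2_client.create_key_pair(KeyName=key_name)
    # Owner-only permissions: ssh refuses private keys readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(response["KeyMaterial"])
    return path
```

AWS returns the private key only once, at creation time, which is why the sketch writes it to disk immediately. A call could look like `create_key_pair_file(boto3.client("ec2", region_name="us-west-2"), "hdcloud-key", os.path.expanduser("~/.ssh"))`, with the key name and region as placeholders.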
Now that you've met all the prerequisites, you can subscribe to HDCloud services on AWS Marketplace. Let's get started!
In order to use the product, you need to subscribe to two AWS Marketplace services. You can access them by searching https://aws.amazon.com/marketplace/ or by clicking on these links:
For each of the services, you need to:
1. Open the listing.
2. Click CONTINUE.
3. Click ACCEPT SOFTWARE TERMS.
This will add these two services to Your Software. You are all set to launch the cloud controller!
1. Navigate to the Hortonworks Data Cloud - Controller Service listing page:
2. The only setting that you need to review and change is the Region, which needs to be the same as the region that you chose in the prerequisites.
3. Click on Launch with CloudFormation Console and you will be redirected to the Create stack form in the CloudFormation console.
4. On the Select Template page, your template link is already provided, so just click Next.
5. On the Specify Details page, provide the details required:
General Configuration
Security Configuration
The parameters under SmartSense Configuration are optional. Enter your SmartSense ID and opt in to SmartSense telemetry if you would like to use flex support.
6. After you've entered all required values, click Next.
7. (Optional step) On the Options page, under Advanced, you can change the Rollback on Failure setting. By default, it is set to Yes, which means that all of the AWS resources will be deleted if launching the stack fails, so you avoid being charged for them. Change the setting to No if you want to keep the resources for troubleshooting after a failure.
8. Click Next.
9. Finally, on the Review page, review the information provided and check I acknowledge that AWS CloudFormation might create IAM resources, and then click CREATE.
10. Refresh the CloudFormation console. You will see the status of your stack as CREATE_IN_PROGRESS. If everything goes well, after about 15 minutes the status will change to CREATE_COMPLETE, at which point you will be able to proceed to the next step. Meanwhile, let's explore AWS dashboards.
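For readers who script their deployments, the console choices in the steps above map onto arguments of CloudFormation's CreateStack call. The sketch below only assembles those arguments. It assumes you already have the template URL from the listing, and the parameter keys you pass in must match the template's own parameter names, which are not listed in this tutorial; `build_create_stack_args` is a hypothetical helper.

```python
def build_create_stack_args(stack_name, template_url, parameters,
                            rollback_on_failure=True):
    """Assemble keyword arguments for CloudFormation's CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # CloudFormation expects parameters as a list of key/value records.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
        # Counterpart of acknowledging that IAM resources might be created.
        "Capabilities": ["CAPABILITY_IAM"],
        # "Rollback on Failure: No" in the console means DisableRollback=True.
        "DisableRollback": not rollback_on_failure,
    }
```

The resulting dictionary could then be passed to boto3 as `boto3.client("cloudformation").create_stack(**args)`.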
1. While your cloud controller is being launched, you can click on the Events and Resources tabs to see what AWS resources are being launched on your behalf:
2. Once the stack status has changed to CREATE_COMPLETE, you can proceed to the next step. If the stack failed for some reason, refer to the Troubleshooting documentation.
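Instead of refreshing the console, you can poll the stack status from a script. This is a minimal sketch, assuming a boto3 CloudFormation client is passed in; the function name, the 30-second interval, and the 30-minute timeout are arbitrary choices for illustration.

```python
import time


def wait_for_stack(cfn_client, stack_name, poll_seconds=30,
                   timeout_seconds=1800, sleep=time.sleep):
    """Poll until the stack leaves CREATE_IN_PROGRESS; return the new status."""
    waited = 0
    while True:
        stacks = cfn_client.describe_stacks(StackName=stack_name)["Stacks"]
        status = stacks[0]["StackStatus"]
        if status != "CREATE_IN_PROGRESS":
            # CREATE_COMPLETE on success; a ROLLBACK_* or CREATE_FAILED
            # status means the launch did not succeed.
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"{stack_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

boto3 also ships a built-in waiter, `client.get_waiter("stack_create_complete")`, which may be simpler if you do not need to inspect intermediate statuses.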
1. To access the cloud controller UI, select the stack that you launched earlier, click on Outputs, and click on the CloudURL:
2. Even though your browser will tell you that the connection is unsafe, proceed to the UI and log in with the credentials that you provided in the CloudFormation template.
3. After logging in, you will see the dashboard:
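The CloudURL shown on the Outputs tab can also be read programmatically. The sketch below assumes a boto3 CloudFormation client is passed in and that the output key is named `CloudURL`, as it appears in the console; `stack_output` is a hypothetical helper name.

```python
def stack_output(cfn_client, stack_name, output_key="CloudURL"):
    """Return the value of one stack output, or None if it is absent."""
    stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
    # The "Outputs" key can be missing while the stack has no outputs yet.
    for output in stack.get("Outputs", []):
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    return None
```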
Now that your cloud controller is up and running, you can create your first cluster.
1. On the dashboard, click on CREATE CLUSTER to display the form:
2. The only parameters that you are required to enter are Cluster Name, password, and confirm password. All other fields are pre-populated and you can keep the defaults. Here is a brief explanation for each of the parameters:
3. Optionally, in each of the sections you can click on SHOW ADVANCED OPTIONS to display additional options. For example:
    [
      {
        "configuration-type" : {
          "property-name" : "property-value",
          "property-name2" : "property-value"
        }
      },
      {
        "configuration-type2" : {
          "property-name" : "property-value"
        }
      }
    ]
If you are interested in learning about these options, refer to the Create Cluster documentation.
4. Click on CREATE CLUSTER. You have an option to:
5. Click on YES, CREATE A CLUSTER.
6. Now you will see a cluster tile appear on the dashboard:
7. Click on the tile to see the cluster details.
In the EVENT HISTORY log, you can see that a new stack is being launched in the CloudFormation console, then EC2 instances are started to run your cluster nodes, and an Ambari cluster is built. As you can see in the screenshot below, it took 15 minutes to build my 4-node cluster.
8. Once your cluster is ready, its status will change to RUNNING:
Congratulations! You've just created your first cluster!
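If you plan to paste custom properties into the advanced options mentioned in step 3 above, it can help to check the JSON before submitting the form. The sketch below is based only on the template shown in that step: it assumes the expected shape is an array of objects, each mapping a configuration type to an object of property names and values. The function name is hypothetical, and the cloud controller may apply further validation of its own.

```python
import json


def parse_custom_properties(text):
    """Parse custom-properties JSON and check it has the expected shape.

    Returns a dict of {configuration type: {property name: value}}.
    """
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("top level must be a JSON array")
    merged = {}
    for entry in data:
        if not isinstance(entry, dict):
            raise ValueError("each array item must be a JSON object")
        for config_type, properties in entry.items():
            if not isinstance(properties, dict):
                raise ValueError(f"{config_type!r} must map to an object")
            # Entries that repeat a configuration type are merged.
            merged.setdefault(config_type, {}).update(properties)
    return merged
```

For instance, `parse_custom_properties('[{"core-site": {"fs.trash.interval": "4320"}}]')` returns `{"core-site": {"fs.trash.interval": "4320"}}`; the property chosen here is only an illustration.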
Let’s explore a few shortcuts that you should be aware of when working with HDCloud.
1. Click on the icon to copy complete SSH information for a specific node:
If you are using a Mac, you can paste it into your terminal and - assuming that your private key is available on your computer - you should be able to access your cluster.
If you are using Windows and need to set up your SSH, refer to http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html.
2. Next, click on the Ambari Web link to open the Ambari Web UI in a browser:
3. Log in to the Ambari Web UI using the credentials that you specified when creating your cluster. The default user is `admin`, so unless you changed the default, this user should log you in.
4. Click on CLUSTER ACTIONS > Resize and resize your cluster by adding one node:
5. You can also explore the tabs where your cluster settings are available:
Click on the menu icon to see other capabilities available in the cloud controller UI:
CLUSTERS: This is where you are right now.
(Adding this on Ali Bajwa's request) See this post: https://community.hortonworks.com/articles/77290/how-to-open-additional-ports-on-ec2-security-group....
1. Once you no longer need your cluster, you can terminate it by clicking on CLUSTER ACTIONS > TERMINATE:
This will delete all the EC2 instances that were used to run cluster nodes.
2. After deleting the cluster, you can delete the cloud controller. From the CloudFormation console, delete the stack corresponding to the cloud controller:
If you try deleting the cloud controller before terminating all the clusters associated with it, you will run into errors.
To avoid unnecessary charges to your AWS account, always make sure that the stacks corresponding to the cluster and the cloud controller were successfully deleted in the CloudFormation console and that the EC2 instances running the cloud controller and cluster nodes were deleted in the EC2 console.
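The CloudFormation part of this check can be scripted. The sketch below assumes a boto3 CloudFormation client is passed in and that you know the names of the stacks for your clusters and cloud controller; `stacks_not_deleted` is a hypothetical helper. It relies on the ListStacks call, which keeps reporting deleted stacks for a limited period after deletion.

```python
def stacks_not_deleted(cfn_client, stack_names):
    """Return {name: status} for named stacks that are not DELETE_COMPLETE."""
    wanted = set(stack_names)
    remaining = {}
    kwargs = {}
    while True:
        page = cfn_client.list_stacks(**kwargs)
        for summary in page["StackSummaries"]:
            name, status = summary["StackName"], summary["StackStatus"]
            if name in wanted and status != "DELETE_COMPLETE":
                remaining[name] = status
        # Results are paginated; follow NextToken until it runs out.
        token = page.get("NextToken")
        if not token:
            return remaining
        kwargs = {"NextToken": token}
```

An empty result means none of the named stacks is still live or stuck in a state such as DELETE_FAILED. The EC2 instances still need to be checked separately in the EC2 console.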
If you run into any issues, refer to the Troubleshooting documentation.
To learn more, refer to the HDCloud for AWS product documentation.
Related tutorials:
Let us know if this was useful and how we can help you with HDCloud for AWS in the future. Feel free to leave a comment below with a suggestion for an HDCloud for AWS tutorial that you would like to see. Thanks!
Created on 04-03-2017 05:53 PM
Updated for the latest HDCloud version 1.14.1. Check it out!
Created on 05-31-2017 09:19 PM
Updated for the latest HDCloud version 1.14.4. No major changes, just updated screenshots and links. Check it out!
Created on 06-15-2017 07:20 PM
Updated for the latest HDCloud version 1.16. Check it out!