Created on 03-04-2017 06:05 PM - edited 09-16-2022 01:38 AM
This tutorial will walk you through the process of using Ansible, an agent-less automation tool, to create instances on AWS. The Ansible playbook we will use is relatively simple; you can use it as a base to experiment with more advanced features. You can read more about Ansible here: Ansible.
Ansible is written in Python and is installed as a Python module on the control host. The only requirement for the hosts managed by Ansible is the ability to log in with SSH. There is no requirement to install any agent software on the hosts managed by Ansible.
If you have never used Ansible, you can become more familiar with it by going through some basic tutorials. The following two tutorials are a good starting point:
This tutorial is part 1 of a 2-part series. Part 2 in the series will show you how to use Ansible to deploy Hortonworks Data Platform (HDP) on Amazon Web Services (AWS).
This tutorial was created as a companion to the Ansible + Hadoop talk I gave at the Ansible NOVA Meetup in February 2017. You can find the slides to that talk here: SlideShare
You can get a copy of the playbook from this tutorial here: Github
This tutorial was tested using the following environment and components:
You need to create a directory for your Ansible playbook. I prefer to create my project directories in ~/Development.
mkdir ~/Development/ansible-aws
cd ~/Development/ansible-aws
If you use the Anaconda version of Python, you already have access to Ansible. If you are not using Anaconda, then you can usually install Ansible using the following command:
pip install ansible
To read more about how to install Ansible: Ansible Installation
Our playbook is relatively simple. It consists of a single inventory file, a single group_vars file and a single playbook file. Here is the layout of the file and directory structure:
+- ansible-aws/
|  +- group_vars/
|     +- all
|  +- inventory/
|     +- hosts
|  +- playbooks/
|     +- ansible-aws.yml
You can use variables in your playbooks using the {{variable name}} syntax. These variables are populated based on values stored in your variable files. You can explicitly load variable files in your playbooks.
However, all playbooks will automatically load the variables in the group_vars/all variable file. The all variable file is loaded for all hosts regardless of the groups the host may be in. In our playbook, we are placing our AWS configuration values in the all file.
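As a minimal sketch of explicit loading, the play below reads an additional variable file with vars_files and prints one value with the debug module. The file name vars/extra.yml is hypothetical (it is not part of this tutorial's layout); it is assumed to sit next to the playbook and to define aws_region.

```yaml
---
# Sketch only: vars/extra.yml is a hypothetical file, assumed to define aws_region.
- name: Show how a variable is resolved
  hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - vars/extra.yml            # path is relative to the playbook file
  tasks:
    - name: Print the region that later tasks would use
      debug:
        msg: "Instances would be created in {{ aws_region }}"
```

Variables in group_vars/all need no such entry, which is why the playbook in this tutorial does not use vars_files.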
Edit the group_vars/all file. Copy and paste the following text into the file:
aws_access_key: <enter AWS access key>
aws_secret_key: <enter AWS secret key>
key_name: <enter private key file alias name>
aws_region: <enter AWS region>
vpc_id: <enter VPC ID>
ami_id: ami-6d1c2007
instance_type: m4.2xlarge
my_local_cidr_ip: <enter cidr_ip>
aws_access_key: You need to enter your AWS access key.
aws_secret_key: You need to enter your AWS secret key.
key_name: The alias name you gave to the AWS private key which you will use to SSH into the instances. In my case I created a key called ansible.
aws_region: The AWS region where you want to deploy your instances. In my case I am using us-east-1.
vpc_id: The specific VPC in which you want to place your instances.
ami_id: The specific AMI you want to deploy for your instances. The ami-6d1c2007 AMI is a CentOS 7 image.
instance_type: The type of AWS instance. For deploying Hadoop, I recommend at least m4.2xlarge. A faster alternative is c4.4xlarge.
my_local_cidr_ip: Your local computer's IP address in CIDR notation. This is used for creating the security rules that allow your local computer to access the instances. An example CIDR format is 192.168.1.1/32. Make sure this is set to your computer's public IP address.

After you have entered your appropriate settings, save the file.
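If you would rather not store credentials in the file, one possible variation is to read them from environment variables with Ansible's env lookup. This is a sketch and not part of the original playbook; it assumes you have exported AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the shell that runs ansible-playbook.

```yaml
# group_vars/all (variation): only the two credential lines change.
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in your shell.
aws_access_key: "{{ lookup('env', 'AWS_ACCESS_KEY_ID') }}"
aws_secret_key: "{{ lookup('env', 'AWS_SECRET_ACCESS_KEY') }}"
```

The remaining settings stay as literal values, and the playbook tasks do not need to change because they still refer to aws_access_key and aws_secret_key.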
Ansible requires a list of known hosts against which playbooks and tasks are run. We will tell Ansible to use a specific host file with the -i inventory/hosts parameter.
Edit the inventory/hosts file. Copy and paste the following text into the file:
[local]
localhost ansible_python_interpreter=/Users/myoung/anaconda/bin/python
[local]: Defines the group the host belongs to. You have the option for a playbook to run against all hosts, a specific group of hosts, or an individual host. This AWS playbook only runs on your local computer. That is because it uses the AWS APIs to communicate with AWS.
localhost: This is the hostname. You can list multiple hosts, 1 per line under each group heading. A host can belong to multiple groups.
ansible_python_interpreter: Optional entry that tells Ansible which specific version of Python to run. Because I am using Anaconda Python, I've included that setting here.

After you have entered your appropriate settings, save the file.
The playbook is where we define the list of tasks we want to perform. Our playbook will consist of 2 tasks. The first task is to create a specific AWS Security Group. The second task is to create a specific configuration of 6 instances on AWS.
Edit the file playbooks/ansible-aws.yml. Copy and paste the following text into the file:
---
# Basic provisioning example
- name: Create AWS resources
  hosts: localhost
  connection: local
  gather_facts: False
  tasks:
    - name: Create a security group
      ec2_group:
        name: ansible
        description: "Ansible Security Group"
        region: "{{aws_region}}"
        vpc_id: "{{vpc_id}}"
        aws_access_key: "{{aws_access_key}}"
        aws_secret_key: "{{aws_secret_key}}"
        rules:
          - proto: all
            cidr_ip: "{{my_local_cidr_ip}}"
          - proto: all
            group_name: ansible
        rules_egress:
          - proto: all
            cidr_ip: 0.0.0.0/0
      register: firewall
    - name: Create an EC2 instance
      ec2:
        aws_access_key: "{{aws_access_key}}"
        aws_secret_key: "{{aws_secret_key}}"
        key_name: "{{key_name}}"
        region: "{{aws_region}}"
        group_id: "{{firewall.group_id}}"
        instance_type: "{{instance_type}}"
        image: "{{ami_id}}"
        wait: yes
        volumes:
          - device_name: /dev/sda1
            volume_type: gp2
            volume_size: 100
            delete_on_termination: true
        exact_count: 6
        count_tag:
          Name: aws-demo
        instance_tags:
          Name: aws-demo
      register: ec2
This playbook uses the Ansible ec2 and ec2_group modules. You can read more about the options available to those modules here:
The task to create the EC2 security group creates a group named ansible. It defines 2 ingress rules and 1 egress rule for that security group. The first ingress rule allows all inbound traffic from your local computer's IP address. The second ingress rule allows all inbound traffic from any host in the security group ansible. The egress rule allows all traffic out from all of the hosts.
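If allowing all protocols from your computer is broader than you want, the rules section can be narrowed. The fragment below is a sketch of one possible variation, not the configuration used in this tutorial: it limits access from your computer to SSH (TCP port 22) and keeps traffic between group members unrestricted.

```yaml
# Variation of the rules section of the ec2_group task (sketch).
rules:
  - proto: tcp                  # SSH only, from your computer
    from_port: 22
    to_port: 22
    cidr_ip: "{{my_local_cidr_ip}}"
  - proto: all                  # unrestricted traffic between group members
    group_name: ansible
```

With this variation you would need to add further rules for any other ports you want to reach from your computer, such as web interfaces on the instances.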
The task to create the EC2 instances creates 6 hosts because of the exact_count setting. It sets the Name tag to aws-demo on each of the instances and uses that tag to determine how many hosts exist. You can choose to use a smaller number of hosts.
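One way to make the number of hosts easy to change is to move it into the variable file. The sketch below assumes a new variable named instance_count, which is a hypothetical name and not part of the original group_vars/all file.

```yaml
# In group_vars/all (instance_count is a hypothetical name, not in the original file):
instance_count: 3

# In the ec2 task of playbooks/ansible-aws.yml, replace "exact_count: 6" with:
exact_count: "{{instance_count}}"
```

Because the module compares exact_count with the number of instances matching count_tag, lowering the value on a later run should terminate the surplus instances. As I understand the module's documented behaviour, a value of 0 should terminate all instances tagged aws-demo, so check the value before you run the playbook again.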
You can specify volumes to mount on each of the instances. The default volume size is 8 GB and is too small for deploying Hadoop later. I recommend setting the size to at least 100 GB as above. I also recommend you set delete_on_termination to true. This will tell AWS to delete the storage after you have deleted the instances. If you do not do this, then storage will be kept and you will be charged for it.
After you have entered your appropriate settings, save the file.
Now that our 3 files have been created and saved with the appropriate settings, we can run the playbook. To run the playbook, you use the ansible-playbook -i inventory/hosts playbooks/ansible-aws.yml command. You should see something similar to the following:
$ ansible-playbook -i inventory/hosts playbooks/ansible-aws.yml

PLAY [Create AWS resources] ****************************************************

TASK [Create a security group] *************************************************
changed: [localhost]

TASK [Create an EC2 instance] **************************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost : ok=2 changed=2 unreachable=0 failed=0
The changed lines indicate that Ansible found a configuration that needed to be modified to be consistent with our requested state. For the security group task, you would see this if your security group didn't exist or if you had a different set of ingress or egress rules. For the instance task, you would see this if there were fewer than or more than 6 hosts tagged as aws-demo.
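Since the playbook registers the result of the instance task as ec2, you can inspect it in a follow-up task. The sketch below assumes the registered result exposes a tagged_instances list whose entries have a public_ip field, which is my understanding of what the ec2 module returns when exact_count is used; verify this against the module documentation for your Ansible version.

```yaml
# Sketch: append after the "Create an EC2 instance" task, at the same indentation.
# Assumes the registered result exposes tagged_instances with a public_ip field.
    - name: Show the public IP addresses of the tagged instances
      debug:
        msg: "{{ ec2.tagged_instances | map(attribute='public_ip') | list }}"
```

A debug task only prints values, so it reports ok rather than changed in the play recap.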
If you check your AWS console, you should be able to confirm the instances are created. You should see something similar to the following:
If you successfully followed along with this tutorial, you have created a simple Ansible playbook with 2 tasks using the ec2 and ec2_group Ansible modules. The playbook creates an AWS security group and instances which can be used later for deploying HDP on AWS.