Created on 06-15-2016 01:21 AM - edited 08-17-2019 12:03 PM
Keeping a Hadoop cluster running in the cloud for long periods is expensive. Engineers also sometimes lack immediate access to a cloud environment where they could quickly spin up their own cluster and play around. As an easy alternative, an HDP cluster can be set up on your own laptop using Vagrant with VirtualBox as the provider.
Create a working directory in which to generate the Vagrantfile and start the deployment via Vagrant:
$ mkdir hdp22
$ cd hdp22
The following command generates the Vagrantfile in the current directory. This file defines the VMs that will make up the cluster.
$ vagrant init
$ vi Vagrantfile
Let's configure Vagrant to use CentOS 6.6 or CentOS 6.7 as the base box by setting one of the following in the Vagrantfile:
config.vm.box = "bento/centos-6.7"
or
config.vm.box = "chef/centos-6.6"
Include the script below in the Vagrantfile to apply some basic provisioning to the VMs, such as:
1. Install the NTP service
2. Disable the firewall and SELinux
3. (Optional) Install wget
$script = <<SCRIPT
# Install NTP and enable it at boot
sudo yum -y install ntp
sudo chkconfig ntpd on
# Disable the firewall now and at boot
sudo chkconfig iptables off
sudo /etc/init.d/iptables stop
# Disable SELinux now and in its config file
sudo setenforce 0
sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# Raise the open file limits
sudo sh -c 'echo "* soft nofile 10000" >> /etc/security/limits.conf'
sudo sh -c 'echo "* hard nofile 10000" >> /etc/security/limits.conf'
# Turn off transparent huge pages
sudo sh -c 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag'
sudo sh -c 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled'
SCRIPT

config.vm.provision "shell", inline: $script
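Once a VM has been provisioned with this script, it can be worth checking that the settings actually landed. The following is only a rough sketch: the function names are made up for illustration, and each function accepts an optional file path so it can be tried against a scratch copy first.

```shell
#!/bin/sh
# Print the SELinux mode set in an SELinux config file.
# Defaults to the file that the provisioning script edits.
configured_selinux_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# Count the soft/hard nofile lines present in a limits file.
nofile_entries() {
    grep -cE '^\*[[:space:]]+(soft|hard)[[:space:]]+nofile' \
        "${1:-/etc/security/limits.conf}"
}
```

Inside a VM, configured_selinux_mode should print disabled and nofile_entries should print 2 after a single provisioning run. Because the script appends with >>, each additional provisioning run adds two more nofile lines, so a higher count usually just means the VM was provisioned more than once.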
The following configuration defines 4 virtual machines to be used in the HDP cluster:
1 Ambari server
1 Hadoop master
2 slaves
The machines have the following hostnames:
1. ambari1.mycluster
2. master1.mycluster
3. slave1.mycluster
4. slave2.mycluster
# Ambari1
config.vm.define :ambari1 do |a1|
  a1.vm.hostname = "ambari1.mycluster"
  a1.vm.network :private_network, ip: "192.168.0.11"
  a1.vm.provider :virtualbox do |vb|
    vb.memory = "2048"
  end
  a1.vm.network "forwarded_port", guest: 8080, host: 8080
  a1.vm.network "forwarded_port", guest: 80, host: 80
end

# Master1
config.vm.define :master1 do |m1|
  m1.vm.hostname = "master1.mycluster"
  m1.vm.network :private_network, ip: "192.168.0.12"
  m1.vm.provider :virtualbox do |vb|
    vb.memory = "4096"
  end
end

# Slave1
config.vm.define :slave1 do |s1|
  s1.vm.hostname = "slave1.mycluster"
  s1.vm.network :private_network, ip: "192.168.0.21"
  s1.vm.provider :virtualbox do |vb|
    vb.memory = "2048"
  end
end

# Slave2
config.vm.define :slave2 do |s2|
  s2.vm.hostname = "slave2.mycluster"
  s2.vm.network :private_network, ip: "192.168.0.22"
  s2.vm.provider :virtualbox do |vb|
    vb.memory = "2048"
  end
end
Start the Ambari server machine from Vagrant. Vagrant automatically runs the shell provisioner defined in the Vagrantfile when it brings the machine up. Then SSH into the Ambari server:
$ vagrant up ambari1
$ vagrant ssh ambari1
As the root user, run the commands below:
# Install
wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
yum -y install ambari-server
sudo service ntpd start

# Setup. There are several options to configure during setup.
ambari-server setup

# Start Ambari Server
ambari-server start
Bring up the remaining VMs (vagrant up master1 slave1 slave2), then add the following FQDN entries to the /etc/hosts file on each VM.
192.168.0.11 ambari1.mycluster ambari1
192.168.0.12 master1.mycluster master1
192.168.0.21 slave1.mycluster slave1
192.168.0.22 slave2.mycluster slave2
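Editing four files by hand is easy to get wrong, so a small helper can append the entries instead. This is a minimal sketch, not part of the original setup: add_cluster_hosts is a hypothetical name, and the optional path argument exists only so the function can be tried on a scratch file before it touches /etc/hosts.

```shell
#!/bin/sh
# Append the cluster's host entries to a hosts file.
# An entry is skipped if the same "IP FQDN" pair is already present,
# so running the function twice does not create duplicates.
add_cluster_hosts() {
    hosts_file="${1:-/etc/hosts}"
    while read -r ip fqdn short; do
        if ! grep -qF -- "$ip $fqdn" "$hosts_file" 2>/dev/null; then
            printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
        fi
    done <<'EOF'
192.168.0.11 ambari1.mycluster ambari1
192.168.0.12 master1.mycluster master1
192.168.0.21 slave1.mycluster slave1
192.168.0.22 slave2.mycluster slave2
EOF
}
```

Run as root on each VM, calling add_cluster_hosts with no argument writes to /etc/hosts. The check matches on the IP and FQDN together, so an existing loopback line for the VM's own hostname does not stop the 192.168.0.x entry from being added.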
Set up passwordless SSH from the Ambari node to all the other nodes (VMs):
$ ssh-keygen
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Copy the Ambari server's public key into the authorized_keys file on each of the other nodes so that the Ambari server can reach them over SSH later.
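One possible way to do the copy is a loop around ssh-copy-id, sketched below. The assumptions here are mine rather than the article's: the key is installed for root on each node, password login for root works on the target VMs, and COPY_CMD is a hypothetical hook that lets the loop be dry-run with echo first.

```shell
#!/bin/sh
# Nodes that should accept the Ambari server's key (hostnames from this article).
NODES="master1.mycluster slave1.mycluster slave2.mycluster"

# Copy a public key to every node in NODES, stopping at the first failure.
# COPY_CMD is a hypothetical override; it defaults to ssh-copy-id.
distribute_key() {
    pubkey="${1:-$HOME/.ssh/id_rsa.pub}"
    copy_cmd="${COPY_CMD:-ssh-copy-id}"
    for node in $NODES; do
        # Assumption: root can log in with a password on the target node.
        $copy_cmd -i "$pubkey" "root@$node" || return 1
    done
}
```

Running COPY_CMD=echo distribute_key prints the three commands without contacting any machine. Calling distribute_key on its own from the Ambari node performs the copy and prompts for each node's password.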
We are now ready to deploy an HDP cluster from the Ambari Web UI, which is reachable from the host on port 8080 through the forwarded port. Because the UI is really simple, I will omit the screenshots here.
Now we are all set, with a 4-node HDP cluster ready on your local machine!
Created on 06-17-2016 03:15 PM
Step 5 installs Ambari 1.7, which is an older version. You should use this command instead to get the latest version (Ambari 2.2.2.0):
wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
Created on 06-17-2016 03:29 PM
In step 3, the script in the Vagrantfile could include:
sudo service ntpd start
The chkconfig command ensures that ntpd starts at boot. However, I found that ntpd did not start automatically the first time the instance was brought up with Vagrant. Subsequent boots of the VM work properly.
Created on 06-20-2016 08:52 PM
Thanks for letting me know this.