In a previous post we saw how to automate HDP installation with Kerberos authentication on a single-node cluster using Ambari Blueprints.

In this post, we will see how to deploy a multi-node HDP cluster with Kerberos authentication via Ambari Blueprints.

.

Note - From Ambari 2.6.X onwards, you will have to register a VDF to use an internal repository, or else Ambari will pick up the latest version of HDP and use the public repos. Please see the document below for more information. For Ambari versions older than 2.6.X, this guide will work without any modifications.

Document - https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-release-notes/content/ambari_reln...

.

Below are simple steps to install an HDP multi-node cluster with Kerberos authentication (MIT KDC) using an internal repository via Ambari Blueprints.

.

Step 1: Install the Ambari server using the steps mentioned under the link below

http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-installation/content/ch_Installing...

.

Step 2: Register the ambari-agent manually

Install the ambari-agent package on all the nodes in the cluster and set the hostname property to the Ambari server host (FQDN) in /etc/ambari-agent/conf/ambari-agent.ini.
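The hostname edit can be scripted on each node. A minimal sketch follows; ambari.example.com is a placeholder for your Ambari server's FQDN, and the file is written to the current directory here so the sketch is runnable anywhere. On a real node the target is /etc/ambari-agent/conf/ambari-agent.ini, and you would install the package first (yum install -y ambari-agent) and run ambari-agent start afterwards.

```shell
# Sketch only: replace ambari.example.com with your Ambari server FQDN
AMBARI_SERVER_FQDN=ambari.example.com
INI=./ambari-agent.ini   # real path: /etc/ambari-agent/conf/ambari-agent.ini

# minimal stand-in for the stock [server] section of ambari-agent.ini
printf '[server]\nhostname=localhost\nurl_port=8440\n' > "$INI"

# point the agent at the Ambari server
sed -i "s/^hostname=.*/hostname=${AMBARI_SERVER_FQDN}/" "$INI"
grep '^hostname=' "$INI"
```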

.

Step 3: Install and configure MIT KDC

Detailed steps (demo on HDP Sandbox 2.4):

3.1 Clone our GitHub repository on the Ambari server in your HDP cluster

Note - This script will install and configure the KDC on your Ambari server.

git clone https://github.com/crazyadmins/useful-scripts.git

Sample Output:

[root@sandbox ~]# git clone https://github.com/crazyadmins/useful-scripts.git
Initialized empty Git repository in /root/useful-scripts/.git/
remote: Counting objects: 29, done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 29 (delta 4), reused 25 (delta 3), pack-reused 0
Unpacking objects: 100% (29/29), done.

3.2 Go to the useful-scripts/ambari directory

[root@sandbox ~]# cd useful-scripts/ambari/
[root@sandbox ambari]# ls -lrt
total 16
-rw-r--r-- 1 root root 5701 2016-04-23 20:33 setup_kerberos.sh
-rw-r--r-- 1 root root 748 2016-04-23 20:33 README
-rw-r--r-- 1 root root 366 2016-04-23 20:33 ambari.props
[root@sandbox ambari]#

3.3 Copy setup_only_kdc.sh and ambari.props to the host where you want to set up the KDC server

3.4 Edit ambari.props according to your cluster environment

Note - For a multi-node cluster, please don't forget to set KERBEROS_CLIENTS to a comma-separated list of hosts (not applicable for this post).

Sample output from my Sandbox:

[root@sandbox ambari]# cat ambari.props
CLUSTER_NAME=Sandbox --> You can skip this variable
AMBARI_ADMIN_USER=admin  --> Not required
AMBARI_ADMIN_PASSWORD=admin --> Not required
AMBARI_HOST=sandbox.hortonworks.com --> Required
KDC_HOST=sandbox.hortonworks.com --> Required
REALM=HWX.COM --> Required
KERBEROS_CLIENTS=sandbox.hortonworks.com --> Not required
##### Notes #####
#1. KERBEROS_CLIENTS - Comma separated list of Kerberos clients in case of multinode cluster
#2. Admin principal is admin/admin and password is hadoop
[root@sandbox ambari]#
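Since ambari.props uses plain KEY=VALUE shell syntax, you can sanity-check the required values before running the script. A minimal sketch, assuming the variable names shown in the sample above; the required subset of the file is recreated here so the check runs anywhere, but you would point it at your edited ambari.props.

```shell
# Sketch only: recreate the required subset of the sample ambari.props
cat > ambari.props <<'EOF'
AMBARI_HOST=sandbox.hortonworks.com
KDC_HOST=sandbox.hortonworks.com
REALM=HWX.COM
EOF

# source it and verify the required variables are non-empty
. ./ambari.props
for v in AMBARI_HOST KDC_HOST REALM; do
  eval "val=\$$v"
  [ -n "$val" ] || { echo "missing: $v" >&2; exit 1; }
done
echo "props ok: REALM=$REALM"
```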

3.5 Start the installation by simply executing setup_only_kdc.sh

Note - Please run setup_only_kdc.sh from the KDC_HOST only; you don't need to set up or configure the KDC yourself, the script will do everything for you.

.

Step 4: Configure blueprints

Please follow the steps below to create the blueprint files.

.

4.1 Create the hostmap.json (cluster creation template) file as shown below:

Note – This file holds the information about all the hosts that are part of your HDP cluster.

{
 "blueprint": "hdptest",
 "default_password": "hadoop",
 "host_groups": [{
  "name": "kerbnode1",
  "hosts": [{
   "fqdn": "kerbnode1.openstacklocal"
  }]
 }, {
  "name": "kerbnode2",
  "hosts": [{
   "fqdn": "kerbnode2.openstacklocal"
  }]
 }, {
  "name": "kerbnode3",
  "hosts": [{
   "fqdn": "kerbnode3.openstacklocal"
  }]
 }],
 "credentials": [{
  "alias": "kdc.admin.credential",
  "principal": "admin/admin",
  "key": "hadoop",
  "type": "TEMPORARY"
 }],
 "security": {
  "type": "KERBEROS"
 },
 "Clusters": {
  "cluster_name": "kerberosCluster"
 }
}
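Before submitting the template, it is worth checking that it parses and that every host group lists at least one FQDN. A minimal sketch, assuming python3 is available; a one-host copy of the template is written here so the check runs as-is, but you would run it against your real hostmap.json.

```shell
# Sketch only: a trimmed copy of the cluster creation template above
cat > hostmap.json <<'EOF'
{
 "blueprint": "hdptest",
 "default_password": "hadoop",
 "host_groups": [
  {"name": "kerbnode1", "hosts": [{"fqdn": "kerbnode1.openstacklocal"}]}
 ],
 "security": {"type": "KERBEROS"},
 "Clusters": {"cluster_name": "kerberosCluster"}
}
EOF

python3 - <<'EOF'
import json
t = json.load(open("hostmap.json"))          # fails loudly on malformed JSON
assert t.get("blueprint"), "blueprint name missing"
for hg in t["host_groups"]:
    assert hg.get("hosts"), "host group %s has no hosts" % hg["name"]
print("template ok:", [hg["name"] for hg in t["host_groups"]])
EOF
```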

4.2 Create the cluster_config.json (blueprint) file; it contains the mapping of host groups to HDP components.

{
 "configurations": [{
  "kerberos-env": {
   "properties_attributes": {},
   "properties": {
    "realm": "HWX.COM",
    "kdc_type": "mit-kdc",
    "kdc_host": "kerbnode1.openstacklocal",
    "admin_server_host": "kerbnode1.openstacklocal"
   }
  }
 }, {
  "krb5-conf": {
   "properties_attributes": {},
   "properties": {
    "domains": "HWX.COM",
    "manage_krb5_conf": "true"
   }
  }
 }],
 "host_groups": [{
  "name": "kerbnode1",
  "components": [{
   "name": "NAMENODE"
  }, {
   "name": "NODEMANAGER"
  }, {
   "name": "DATANODE"
  }, {
   "name": "ZOOKEEPER_CLIENT"
  }, {
   "name": "HDFS_CLIENT"
  }, {
   "name": "YARN_CLIENT"
  }, {
   "name": "MAPREDUCE2_CLIENT"
  }, {
   "name": "ZOOKEEPER_SERVER"
  }],
  "cardinality": 1
 }, {
  "name": "kerbnode2",
  "components": [{
   "name": "SECONDARY_NAMENODE"
  }, {
   "name": "NODEMANAGER"
  }, {
   "name": "DATANODE"
  }, {
   "name": "ZOOKEEPER_CLIENT"
  }, {
   "name": "ZOOKEEPER_SERVER"
  }, {
   "name": "HDFS_CLIENT"
  }, {
   "name": "YARN_CLIENT"
  }, {
   "name": "MAPREDUCE2_CLIENT"
  }],
  "cardinality": 1
 }, {
  "name": "kerbnode3",
  "components": [{
   "name": "RESOURCEMANAGER"
  }, {
   "name": "APP_TIMELINE_SERVER"
  }, {
   "name": "HISTORYSERVER"
  }, {
   "name": "NODEMANAGER"
  }, {
   "name": "DATANODE"
  }, {
   "name": "ZOOKEEPER_CLIENT"
  }, {
   "name": "ZOOKEEPER_SERVER"
  }, {
   "name": "HDFS_CLIENT"
  }, {
   "name": "YARN_CLIENT"
  }, {
   "name": "MAPREDUCE2_CLIENT"
  }],
  "cardinality": 1
 }],
 "Blueprints": {
  "blueprint_name": "hdptest",
  "stack_name": "HDP",
  "stack_version": "2.5",
  "security": {
   "type": "KERBEROS"
  }
 }
}
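A common failure at step 8 is a host group named in the cluster creation template that does not exist in the blueprint. A minimal cross-check sketch, assuming python3; trimmed copies of both files are written here so it runs as-is, but you would point it at your real cluster_config.json and hostmap.json.

```shell
# Sketch only: trimmed copies of the two documents above
cat > cluster_config.json <<'EOF'
{"Blueprints": {"blueprint_name": "hdptest"},
 "host_groups": [{"name": "kerbnode1"}, {"name": "kerbnode2"}, {"name": "kerbnode3"}]}
EOF
cat > hostmap.json <<'EOF'
{"blueprint": "hdptest",
 "host_groups": [{"name": "kerbnode1"}, {"name": "kerbnode2"}, {"name": "kerbnode3"}]}
EOF

python3 - <<'EOF'
import json
bp  = {hg["name"] for hg in json.load(open("cluster_config.json"))["host_groups"]}
tpl = {hg["name"] for hg in json.load(open("hostmap.json"))["host_groups"]}
missing = tpl - bp
assert not missing, "host groups not in blueprint: %s" % sorted(missing)
print("host groups match:", sorted(tpl))
EOF
```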

.

Step 5: Create an internal repository map

.

5.1: HDP repository – copy the contents below, change base_url to the hostname/IP address of your internal repository server, and save it as repo.json.

{
"Repositories" : {
   "base_url" : "http://172.26.64.249/hdp/centos6/HDP-2.5.3.0/",
   "verify_base_url" : true
}
}

.

5.2: HDP-UTILS repository – copy the contents below, change base_url to the hostname/IP address of your internal repository server, and save it as hdputils-repo.json.

{
"Repositories" : {
   "base_url" : "http://172.26.64.249/hdp/centos6/HDP-UTILS-1.1.0.20/",
   "verify_base_url" : true
}
}

.

Step 6: Register the blueprint with the Ambari server by executing the command below

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server>:8080/api/v1/blueprints/hdptest -d @cluster_config.json

Step 7: Set up the internal repositories via the REST API.

Execute the curl calls below to register the internal repositories.

curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.5/operating_systems/redhat6/reposi... -d @repo.json
curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.5/operating_systems/redhat6/reposi... -d @hdputils-repo.json

.

Step 8: Pull the trigger! The command below will start the cluster installation.

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/clusters/hdptest -d @hostmap.json
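The POST above returns a Requests resource whose href can be polled for installation progress. A sketch of parsing that status follows, shown against a canned response; the field names request_status and progress_percent are from Ambari's Requests API, and in practice you would replace the echo with something like curl -s -u admin:admin "$REQUEST_HREF".

```shell
# Sketch only: canned stand-in for the JSON Ambari returns for a request
RESPONSE='{"Requests":{"id":1,"request_status":"IN_PROGRESS","progress_percent":42.0}}'

# extract and print the request state and completion percentage
echo "$RESPONSE" | python3 -c '
import json, sys
r = json.load(sys.stdin)["Requests"]
print("request %d: %s %.0f%%" % (r["id"], r["request_status"], r["progress_percent"]))'
```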

.

You should see that Ambari has already marked Kerberos as enabled and has started installing the required services.


.

Please feel free to comment if you need any further help on this. Happy Hadooping!! :)

Comments

@Kuldeep Kulkarni

You might want to fix the file names referenced throughout the article.

From https://cwiki.apache.org/confluence/display/AMBARI/Blueprints, the hostmapping.json is known as the "cluster creation template" and cluster_configuration.json is known as the "blueprint". This may be confusing to anyone who is familiar with that Wiki article.

That said, your file names are ok but they are inconsistent. To register the Blueprint, you reference "cluster_config.json" but the example document you declared was named "cluster_configuration.json". Also, to start the cluster creation process, you reference "hostmap.json" where the example document was named "hostmapping.json". This may lead to some confusion.

On a different topic, there is no mention of how to update the stack-default Kerberos descriptor. This can be done within the cluster creation template (or host mapping document, hostmap.json/hostmapping.json) within the "security" section. The child to "security" should be named "kerberos_descriptor" and may be a sparse descriptor declaring only the changes to apply to the stack default. However, the entire Kerberos descriptor may be set there as well.

Finally, when setting the KDC administrator credentials, the persisted credential store may be specified as well. The example only shows the temporary storage facility; however, if Ambari's credential store is configured, you can specify "persisted" as the credential "type" to have the specified credential stored there. The difference is that the temporary store holds the KDC administrator credential for 90 minutes or until Ambari is restarted, while the persisted store holds the credential until it is manually removed. If 90 minutes is not an acceptable retention time for the temporary store, the user can set a more desirable one via the "security.temporary.keystore.retention.minutes" property in the ambari.properties file.

Other than that, nice article! We need something like this to show that creating a Kerberized cluster via Blueprints is a viable option.

avatar
Master Guru

@Robert Levas - Thank you so much for the valuable feedback! 🙂 I will make the necessary changes as soon as possible.

Thanks again.


Is it possible to export a blueprint of the current cluster configuration?

avatar
Master Guru

@Georg Heiler - Yes, please refer to the curl command below:

curl -H "X-Requested-By: ambari" -X GET -u <admin-user>:<admin-password> http://<ambari-server>:8080/api/v1/clusters/<cluster-name>?format=blueprint