Created on 11-30-2015 07:26 AM - edited 09-16-2022 01:33 AM
Automated deployment of a fresh HDP cluster that includes Zeppelin (via blueprints)
Background:
The Zeppelin Ambari service has been updated and now supports installing the latest Zeppelin version (0.5.5) on HDP using blueprints, which automates the creation of a 'data science' cluster. The SequenceIQ team has a datascientist blueprint that installs Zeppelin, but based on my conversations with @lpapp, it is based on an older version of the Ambari service, so it does not install the latest version or support as many options.
Below is a writeup of how to deploy Zeppelin via blueprints. Note that if you already have a cluster running, you should just use the Add Service wizard in Ambari to deploy Zeppelin, using the steps on the GitHub page.
- Sample steps below for installing a 4-node HDP cluster that includes Zeppelin, using Ambari blueprints and Ambari bootstrap scripts (by @Sean Roberts)
Pre-reqs:
Bring up 4 VMs imaged with RHEL/CentOS 6.5 or later (e.g. called node1-4 in this case).
Note that the VMs should not already have HDP related software installed on them at this point.
Steps:
- On the non-Ambari nodes (e.g. node2-4), use the Ambari bootstrap script to run the pre-reqs, install ambari-agent and point it to the Ambari node (e.g. node1 in this case)
export ambari_server=node1
curl -sSL https://raw.githubusercontent.com/seanorama/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
- On the Ambari node (e.g. node1), use the bootstrap script to run the pre-reqs and install ambari-server, then clone the Zeppelin service into the HDP 2.3 stack
export install_ambari_server=true
curl -sSL https://raw.githubusercontent.com/seanorama/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
yum install -y git
git clone https://github.com/hortonworks-gallery/ambari-zeppelin-service.git /var/lib/ambari-server/resources/stacks/HDP/2.3/services/ZEPPELIN
- Edit the /var/lib/ambari-server/resources/stacks/HDP/2.3/role_command_order.json file to include the line below:
"ZEPPELIN_MASTER-START": ["NAMENODE-START", "DATANODE-START"],
- Note the comma at the end. If you insert the above as the last entry in its section, you need to remove the comma
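A stray or missing comma leaves the file unparseable, so it can help to confirm it is still valid JSON before restarting Ambari. The sketch below is one way to do that, assuming a Python interpreter is on the PATH (its standard json.tool module does the parsing). The function name json_is_valid is hypothetical and not part of Ambari.

```shell
# Hypothetical helper: succeeds only if the given file parses as JSON.
json_is_valid() {
  # Use whichever Python interpreter is available (assumption: one is installed).
  py=$(command -v python3 || command -v python) || return 2
  "$py" -m json.tool "$1" > /dev/null 2>&1
}

# Example usage on the Ambari node (path taken from the step above):
# json_is_valid /var/lib/ambari-server/resources/stacks/HDP/2.3/role_command_order.json \
#   && echo "role_command_order.json parses cleanly" \
#   || echo "syntax error: check for a missing or trailing comma"
```

The check only covers syntax. It does not verify that the ZEPPELIN_MASTER-START entry sits in the right section of the file.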
- Restart Ambari
service ambari-server restart
service ambari-agent restart
- Confirm that all 4 agents registered and that the local agent is still running
curl -u admin:admin -H X-Requested-By:ambari http://localhost:8080/api/v1/hosts
service ambari-agent status
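Rather than reading the hosts response by eye, the number of registered hosts can be counted. This is a rough sketch that assumes each registered host appears exactly once under a "host_name" key in the response. The function name count_registered_hosts is hypothetical.

```shell
# Hypothetical helper: counts "host_name" entries in a hosts API response read from stdin.
count_registered_hosts() {
  grep -o '"host_name"' | wc -l | tr -d ' '
}

# Example usage on the Ambari node (this walkthrough expects the count to be 4):
# curl -s -u admin:admin -H X-Requested-By:ambari http://localhost:8080/api/v1/hosts | count_registered_hosts
```

If the count is lower than expected, re-check that the bootstrap script on the missing node was pointed at the right Ambari server.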
- (Optional) In general, you can generate a blueprint and cluster file for your cluster via the Ambari recommendations API using the steps below. In this example we provide sample blueprints that you can edit, so this step is not needed and is included for reference only. For more details on the bootstrap scripts, see the bootstrap script git repo
yum install -y python-argparse
git clone https://github.com/seanorama/ambari-bootstrap.git
#optional - limit the services for faster deployment
#for minimal services
export ambari_services="HDFS MAPREDUCE2 YARN ZOOKEEPER HIVE ZEPPELIN"
#for most services
#export ambari_services="ACCUMULO FALCON FLUME HBASE HDFS HIVE KAFKA KNOX MAHOUT OOZIE PIG SLIDER SPARK SQOOP MAPREDUCE2 STORM TEZ YARN ZOOKEEPER ZEPPELIN"
export deploy=false
cd ambari-bootstrap/deploy
bash ./deploy-recommended-cluster.bash
cd tmpdir*
#edit the blueprint to customize as needed. You can use sample blueprints provided below to see how to add the custom services.
vi blueprint.json
#edit cluster file if needed
vi cluster.json
- Download either minimal or full blueprint for 4 node setup
#Pick one of the below blueprints
#for minimal services download this one
wget https://raw.githubusercontent.com/hortonworks-gallery/ambari-zeppelin-service/master/blueprint-4node... -O blueprint-zeppelin.json
#for most services download this one
wget https://raw.githubusercontent.com/hortonworks-gallery/ambari-zeppelin-service/master/blueprint-4node... -O blueprint-zeppelin.json
- (optional) If running on single node, download minimal blueprint for 1 node setup
#for minimal services download this one
wget https://raw.githubusercontent.com/hortonworks-gallery/ambari-zeppelin-service/master/blueprint-1node... -O blueprint-zeppelin.json
- (optional) If needed, change the Zeppelin configs based on your setup by modifying these lines
vi blueprint-zeppelin.json
- if deploying on public cloud, you will want to add
"zeppelin.host.publicname":"<public IP or hostname of zeppelin node>"
so that the Zeppelin Ambari view points to the external hostname (instead of the internal name, which is the default)
- Upload the selected blueprint, then download a sample cluster.json that lists the host FQDNs. Replace the host FQDNs in cluster.json with those of your own environment. Finally, deploy the cluster and call it zeppelinCluster
#upload the blueprint to Ambari
curl -u admin:admin -H X-Requested-By:ambari http://localhost:8080/api/v1/blueprints/zeppelinBP -d @blueprint-zeppelin.json
- download sample cluster.json
#for 4 node setup
wget https://raw.githubusercontent.com/hortonworks-gallery/ambari-zeppelin-service/master/cluster-4node.j... -O cluster.json
#for single node setup
wget https://raw.githubusercontent.com/hortonworks-gallery/ambari-zeppelin-service/master/cluster-1node.j... -O cluster.json
- modify the host FQDNs in the cluster json file with your own. Also change the default_password to set the password for hive
vi cluster.json
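If you prefer not to edit the file by hand, the hostname substitution can be scripted. The sketch below is a minimal example, assuming hostnames contain only letters, digits, hyphens and dots, and that each FQDN appears in cluster.json as a double-quoted string. The function name replace_fqdn and the hostnames in the usage comment are hypothetical.

```shell
# Hypothetical helper: replace_fqdn OLD NEW FILE
# Replaces every double-quoted occurrence of OLD with NEW in FILE,
# keeping the original as FILE.bak.
replace_fqdn() {
  # Escape dots so they match literally rather than as regex wildcards.
  old_pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
  # Matching the surrounding quotes keeps "node1" from also matching "node10".
  sed -i.bak "s/\"$old_pattern\"/\"$2\"/g" "$3"
}

# Example usage (hostnames are placeholders for your own environment):
# replace_fqdn node1.example.com myhost1.mydomain.com cluster.json
```

Each call overwrites cluster.json.bak, so copy the original elsewhere first if you want to keep it across several substitutions.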
- deploy the cluster
curl -u admin:admin -H X-Requested-By:ambari http://localhost:8080/api/v1/clusters/zeppelinCluster -d @cluster.json
- You can monitor the progress of the deployment via Ambari (e.g. http://node1:8080).
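Progress can also be followed from the command line by polling the request that the deploy call creates. This is a rough sketch, assuming the request resource reports a "request_status" field with upper-case values such as IN_PROGRESS or COMPLETED, and that the deployment is request id 1 on a fresh cluster. The function name extract_request_status is hypothetical.

```shell
# Hypothetical helper: prints the first "request_status" value found in JSON read from stdin.
extract_request_status() {
  sed -n 's/.*"request_status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Example polling loop (commented out; needs a running Ambari server).
# Assumes request id 1; use the id from the "href" in the deploy response if it differs.
# while :; do
#   status=$(curl -s -u admin:admin -H X-Requested-By:ambari \
#     http://localhost:8080/api/v1/clusters/zeppelinCluster/requests/1 | extract_request_status)
#   echo "request status: $status"
#   case "$status" in COMPLETED|FAILED|ABORTED|TIMEDOUT) break ;; esac
#   sleep 30
# done
```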
- Once install completes, you will have a 4 node HDP cluster including Zeppelin, along with some starter demo Zeppelin notebooks from the gallery github
- More details available on the github README here
- Similar steps are available here to deploy a 'security ready' cluster including demo KDC, OpenLDAP, NSLCD services.
Created on 04-21-2016 12:43 PM
@Ali Bajwa A simplified approach: On the Ambari Server:
yum -y install git
git clone https://github.com/seanorama/ambari-bootstrap
cd ambari-bootstrap
export ambari_server_custom_script=${ambari_server_custom_script:-~/ambari-bootstrap/ambari-extras.sh}
export install_ambari_server=true
./ambari-bootstrap.sh
Then deploy the cluster. The "extras" script above takes care of all the tedious stuff automatically (cloning Zeppelin, the blueprint defaults, the role command order, ...).
yum -y install python-argparse
cd deploy
export ambari_services="HDFS MAPREDUCE2 YARN ZOOKEEPER HIVE SPARK ZEPPELIN"
bash ./deploy-recommended-cluster.bash