Created on 05-14-2019 12:05 PM - edited 09-16-2022 08:39 AM
My blood recently turned from green to blue (after the Hortonworks-Cloudera merger) and I couldn't be more excited to play with new toys. What I am particularly excited about is Cloudera Data Science Workbench. But, like in everything I do, I am very lazy. So here is a quick tutorial to install Altus Director, and use it to deploy a CDH 5.15 + CDSW cluster.
There are several ways to install Director; the one I chose was the AWS install, detailed here: https://www.cloudera.com/documentation/director/latest/topics/director_aws_setup_client.html
The installation documentation is very well done, so follow it. Here are the important excerpts.
A few important points:
You can either search the community AMIs, or use this one: ami-6871a115
Connect to your ec2 instance:
ssh -i your_file.pem ec2-user@your_instance_ip
Install JDK and wget
sudo yum install java-1.8.0-openjdk
sudo yum install wget
Install and start the Altus Director server and client:
cd /etc/yum.repos.d/
sudo wget "http://archive.cloudera.com/director6/6.1/redhat7/cloudera-director.repo"
sudo yum install cloudera-director-server cloudera-director-client
sudo service cloudera-director-server start
sudo systemctl disable firewalld
sudo systemctl stop firewalld
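Before opening the browser, it can help to confirm that the Director server is actually answering. The following is a minimal sketch, not part of the official install steps: the helper names `director_url` and `wait_for_director`, the retry count, and the 10-second pause are all assumptions made for illustration. Port 7189 is the one used in this tutorial.

```shell
# Hypothetical helpers; names and retry policy are assumptions for this sketch.

# Build the Director UI URL for a host (port defaults to 7189, as used in this tutorial).
director_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-7189}"
}

# Poll the Director UI until it answers or the attempts run out.
# Usage: wait_for_director HOST [PORT] [ATTEMPTS]
wait_for_director() {
  url=$(director_url "$1" "${2:-7189}")
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s hides progress output, -f turns HTTP errors (>= 400) into a failure,
    # -o /dev/null discards the response body.
    if curl -sf -o /dev/null "$url"; then
      echo "Director is answering at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 10
  done
  echo "Director did not answer at $url after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_director your_instance_ip` would poll for up to about five minutes with the defaults above.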
Go to http://your_instance_ip:7189/ and log in with admin/admin.
The CDSW cluster configuration can be found here: https://github.com/cloudera/director-scripts/blob/master/configs/aws.cdsw.conf
Modify the configuration file to use:
Go to the EC2 instance where Director is installed, and upload your modified configuration file as well as the appropriate key.
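One way to get the files onto the instance is `scp`, reusing the key and the `ec2-user` login from the ssh step. This is only a sketch: the helper name `upload_to_director` and the `SCP_BIN` override (handy for a dry run) are assumptions, and the files land in the remote home directory.

```shell
# Hypothetical helper: copy one or more local files to the Director instance.
# Usage: upload_to_director KEY HOST FILE...
upload_to_director() {
  if [ "$#" -lt 3 ]; then
    echo "usage: upload_to_director KEY HOST FILE..." >&2
    return 1
  fi
  key="$1"
  host="$2"
  shift 2
  # SCP_BIN is an assumption of this sketch; set it to "echo" to preview the arguments.
  # ec2-user is the login used in the ssh step of this tutorial.
  "${SCP_BIN:-scp}" -i "$key" "$@" "ec2-user@${host}:"
}
```

For example: `upload_to_director your_file.pem your_instance_ip your_configuration_file.conf your_cluster_key.pem`, where `your_cluster_key.pem` stands in for whichever key your configuration references.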
Finally, run the following:
cloudera-director bootstrap-remote your_configuration_file.conf \
--lp.remote.username=admin \
--lp.remote.password=admin
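Since I am lazy, I would rather not retype that. The command above can be wrapped in a small guard that refuses to run when the configuration file is missing. This is a sketch under stated assumptions: the function name `run_bootstrap` and the `DIRECTOR_BIN` override are hypothetical, while the `bootstrap-remote` subcommand and the two `--lp.remote.*` flags are exactly the ones shown above.

```shell
# Hypothetical wrapper around the bootstrap command shown above.
# Usage: run_bootstrap CONF_FILE [USERNAME] [PASSWORD]
run_bootstrap() {
  conf_file="$1"
  user="${2:-admin}"      # default Director credentials used in this tutorial
  pass="${3:-admin}"
  if [ ! -r "$conf_file" ]; then
    echo "configuration file not found or not readable: $conf_file" >&2
    return 1
  fi
  # DIRECTOR_BIN is an assumption of this sketch; set it to "echo" for a dry run.
  "${DIRECTOR_BIN:-cloudera-director}" bootstrap-remote "$conf_file" \
    --lp.remote.username="$user" \
    --lp.remote.password="$pass"
}
```

Running `DIRECTOR_BIN=echo run_bootstrap your_configuration_file.conf` prints the arguments without contacting Director, which is a cheap way to check for typos first.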
You can follow the bootstrapping of the cluster either on the command line or in the Director interface; once it is done, you can connect to Cloudera Manager using: http://your_manager_instance_ip:7180/
Cloudera Data Science Workbench relies on DNS. The correct approach is to set up a wildcard DNS record, as described here.
However, for testing purposes I used nip.io. The only parameter to change is the Cloudera Data Science Workbench Domain, from cdsw.my-domain.com (the value the conf file sets) to cdsw.[YOUR_AWS_PUBLIC_IP].nip.io.
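nip.io works here because any hostname ending in `<ip>.nip.io` resolves to that IP, including names with extra labels in front, so it behaves like the wildcard record CDSW expects. Building the domain value can be sketched as below; the function name `cdsw_domain` is hypothetical, and the check is only a rough guard against typos, not a full IPv4 validation.

```shell
# Hypothetical helper: build the CDSW domain for a given public IPv4 address.
# Usage: cdsw_domain PUBLIC_IP
cdsw_domain() {
  public_ip="$1"
  case "$public_ip" in
    # Reject an empty value or anything containing characters other than digits and dots.
    ''|*[!0-9.]*)
      echo "expected an IPv4 address, got: '$public_ip'" >&2
      return 1
      ;;
  esac
  printf 'cdsw.%s.nip.io\n' "$public_ip"
}
```

The result is the value to paste into the Cloudera Data Science Workbench Domain field. To check resolution from your own machine, something like `getent hosts "$(cdsw_domain YOUR_AWS_PUBLIC_IP)"` should return the same IP.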
Restart the CDSW service, then you should be able to access CDSW by clicking on the CDSW Web UI link. Register for a new account and you will have access to CDSW:
Created on 05-19-2019 04:55 PM
The above was originally posted in the Community Help track. On Sun May 19 16:49 UTC 2019, the HCC moderation staff moved it to the Cloud & Operations Track. The Community Help Track is intended for questions about using the HCC site itself.