Created on 10-08-2017 07:11 AM
This article describes the steps to add the Spectrum Scale service to an HDP cluster, using an existing ESS cluster.
Pre-Requisites:
1) Download the Ambari Integration module and HDFS transparency connector. You can get it here.
2) Collect the details of the existing ESS cluster (IP addresses, hostnames, and public key).
3) Install required packages
yum -y install kernel-devel cpp gcc gcc-c++ binutils ksh libstdc++ libstdc++-devel compat-libstdc++ imake make nc
Note: Make sure that the kernel and kernel-devel versions are the same.
4) Set up password-less ssh from the Ambari server to the ESS cluster nodes, and from the ESS node to all the nodes in the cluster.
5) On the Ambari server node, create a file named shared_gpfs_node.cfg under the "/var/lib/ambari-server/resources/" directory and add the FQDN of any one node in the ESS cluster. Add only one FQDN, and make sure password-less ssh is set up to that node from the Ambari server node.
Note: Add a mapping for this FQDN in /etc/hosts.
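The checks in steps 3 to 5 can be scripted before moving on. The following is a minimal sketch, not part of the Spectrum Scale tooling: the function names, the SSH_BIN override, and the host name ess1.example.com used below are all made up for illustration.

```shell
#!/usr/bin/env bash
# Preflight helpers for the prerequisites above (hypothetical names).

# Step 3 note: kernel-devel must match the running kernel.
# Takes the running kernel release and the installed kernel-devel
# version string and succeeds only if they are identical.
kernel_versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Step 4: succeed only if key-based (password-less) ssh works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_BIN is an assumption of this sketch; it lets you swap in a stub.
passwordless_ssh_ok() {
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Step 5: write exactly one FQDN into the shared node file.
# Usage: write_shared_gpfs_node <fqdn> [target-file]
write_shared_gpfs_node() {
    local fqdn="$1"
    local target="${2:-/var/lib/ambari-server/resources/shared_gpfs_node.cfg}"
    # Reject empty input, whitespace (more than one entry), or a short name.
    case "$fqdn" in
        ""|*[[:space:]]*) echo "expected exactly one FQDN" >&2; return 1 ;;
        *.*) ;;
        *) echo "not fully qualified: $fqdn" >&2; return 1 ;;
    esac
    printf '%s\n' "$fqdn" > "$target"
}
```

On the Ambari server node, the kernel check could be called as kernel_versions_match "$(uname -r)" "$(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}' kernel-devel)", assuming a single kernel-devel package is installed. The other two would be called as passwordless_ssh_ok ess1.example.com and write_shared_gpfs_node ess1.example.com, with the real ESS node FQDN in place of the placeholder.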
Installing the Ambari Integration Module:
1) Download and untar the Ambari Integration module into a directory on the Ambari server node. The directory consists of the following files
2) Stop all the services from Ambari. Log in to Ambari -> Actions -> Stop All
3) Run the installer bin script and accept the license. It prompts for a few inputs, which you have to enter.
cd <dir where you have extracted the tar>
./SpectrumScaleIntegrationPackageInstaller-2.4.2.0.bin
Once you have completed installing the Ambari Integration Module, you can proceed to adding the Spectrum Scale service.
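Step 2 (Stop All) can also be driven from the command line through the Ambari REST API instead of the web UI. This is a rough sketch under assumptions: the function names are made up, and the Ambari URL, cluster name, and user are placeholders you would supply. In the Ambari API, the INSTALLED state corresponds to a stopped service.

```shell
#!/usr/bin/env bash
# Build the request body that asks Ambari to move every service to a state.
# INSTALLED means "stopped"; STARTED means "running".
ambari_state_payload() {
    local state="$1" context="$2"
    printf '{"RequestInfo":{"context":"%s"},"Body":{"ServiceInfo":{"state":"%s"}}}' \
        "$context" "$state"
}

# Stop all services in a cluster (hypothetical helper).
# Usage: ambari_stop_all <ambari-url> <cluster-name> <user>
# curl prompts for the password because only the user name is passed to -u.
ambari_stop_all() {
    local url="$1" cluster="$2" user="$3"
    curl -sS -u "$user" -H 'X-Requested-By: ambari' -X PUT \
        -d "$(ambari_state_payload INSTALLED 'Stop All Services')" \
        "$url/api/v1/clusters/$cluster/services"
}
```

For example, ambari_stop_all http://ambari.example.com:8080 mycluster admin would submit the stop request; progress can then be followed in the Ambari UI as usual.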
Adding IBM Spectrum Scale Service:
1) Log in to Ambari. Click Actions -> Add Service
2) On the Choose Services page, select "Spectrum Scale" and click Next.
3) On the Assign Masters page, select where the GPFS Master has to be installed and click Next.
NOTE: The GPFS Master has to be installed on the same node as the Ambari server.
4) On the Assign Slaves and Clients page, select the nodes where GPFS nodes have to be installed. At a minimum, it is recommended to install GPFS nodes on the nodes where the Namenode(s) and Datanode(s) are running. Click Next when you are done selecting.
5) On Customize services page,
6) If it is a Kerberized environment, configure identities and click Next.
7) On the Review page, check the URLs and click Deploy.
8) Complete the further installation process by clicking Next.
9) Restart the Ambari server by running the command below on the Ambari server node.
ambari-server restart
Note: Do not restart the services before restarting the Ambari server.
Post Installation Steps:
1) Log in to Ambari and set the HDFS replication factor to 1.
HDFS -> Configs -> Advanced -> General -> Block Replication
2) Restart all the services. Actions -> Start All
3) Once all the services are up, you may see a "Namenode Last Checkpoint" alert on HDFS. This is because HDFS Transparency does not do checkpointing, since IBM Spectrum Scale is stateless. You can therefore disable the alert.
Click on the alert -> Disable.
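After the restart, the replication setting from step 1 can be confirmed from any node with the HDFS client installed. The sketch below relies on the standard hdfs getconf -confKey command; the function name and the HDFS_BIN override are assumptions of this example, not something provided by HDP or Spectrum Scale.

```shell
#!/usr/bin/env bash
# Succeeds only if the HDFS client configuration reports dfs.replication = 1.
# HDFS_BIN is an assumption of this sketch; it lets you swap in a stub.
replication_is_one() {
    local value
    value="$("${HDFS_BIN:-hdfs}" getconf -confKey dfs.replication)" || return 1
    [ "$value" = "1" ]
}
```

Running replication_is_one && echo "replication factor is 1" on a cluster node gives a quick confirmation that the new configuration has been pushed out.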
Additional References:
https://www.redbooks.ibm.com/redpapers/pdfs/redp5448.pdf
Hope this helps 🙂