This article describes the steps to add the IBM Spectrum Scale service to an HDP cluster, using an existing ESS cluster.


Pre-Requisites:

1) Download the Ambari Integration module and HDFS transparency connector. You can get it here.

2) Collect the details of the existing ESS cluster (IPs, hostnames, public key).

3) Install the required packages:

yum -y install kernel-devel cpp gcc gcc-c++ binutils ksh libstdc++ libstdc++-devel compat-libstdc++ imake make nc

Note: Make sure that the kernel and kernel-devel versions are the same.
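The version check in the note above can be sketched roughly as follows. This is only an illustration: same_kernel_version is a hypothetical helper, and the sketch assumes an RPM-based system where `uname -r` and the kernel-devel version-release.arch string use the same format.

```shell
# same_kernel_version is a hypothetical helper, not part of any IBM tooling.
# $1: running kernel release (as printed by `uname -r`)
# $2: installed kernel-devel versions, one per line
same_kernel_version() {
    printf '%s\n' "$2" | grep -Fxq "$1"
}

# Only run the live check where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    running=$(uname -r)
    # Empty if kernel-devel is not installed.
    devel=$(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-devel 2>/dev/null) || devel=""
    if same_kernel_version "$running" "$devel"; then
        echo "OK: kernel-devel matches running kernel $running"
    else
        echo "MISMATCH: running kernel is $running; kernel-devel installed: ${devel:-none}"
    fi
fi
```

If the check reports a mismatch, installing the kernel-devel package that matches the running kernel (or rebooting into the newer kernel) is the usual way to bring the two in line.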

4) Ensure that password-less ssh is set up from the Ambari server to the ESS cluster nodes, and from the ESS node to all the nodes in the HDP cluster.
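One way to confirm that password-less ssh really works is sketched below. The helper names and the host names in the usage comment are placeholders made up for this example; the sketch relies only on the standard ssh option BatchMode, which makes ssh fail instead of prompting for a password.

```shell
# check_passwordless_ssh and report_ssh_access are hypothetical helpers.
# Keys are commonly created and distributed interactively beforehand,
# for example with ssh-keygen and ssh-copy-id.

# $1: target in user@host form
check_passwordless_ssh() {
    # BatchMode=yes disables password prompts, so a missing key is a failure.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# Prints one status line per target passed as an argument.
report_ssh_access() {
    for target in "$@"; do
        if check_passwordless_ssh "$target"; then
            echo "OK   $target"
        else
            echo "FAIL $target"
        fi
    done
}

# Example (placeholder host names):
# report_ssh_access root@ess-node1.example.com root@hdp-node1.example.com
```

Run it once from the Ambari server against the ESS nodes, and once from the ESS node against the HDP cluster nodes, since both directions are required.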

5) On the Ambari server node, create a file called shared_gpfs_node.cfg under the "/var/lib/ambari-server/resources/" directory and add the FQDN of any one node in the ESS cluster. Make sure that the file contains only one FQDN and that password-less ssh is set up to this node from the Ambari server node.

Note: Add the mapping in /etc/hosts for the FQDN above
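As a rough sketch of step 5, the file can be written and sanity-checked like this. The two helper functions and the FQDN in the usage comment are hypothetical; only the file name and directory come from the step above.

```shell
# Hypothetical helpers for illustration only.

# $1: target directory, $2: FQDN of one ESS node
write_shared_gpfs_node_cfg() {
    # Overwrites the file so that it holds exactly one entry.
    printf '%s\n' "$2" > "$1/shared_gpfs_node.cfg"
}

# Succeeds only if the file contains exactly one non-empty line.
has_single_entry() {
    [ "$(grep -c . "$1")" -eq 1 ]
}

# Example (placeholder FQDN):
# write_shared_gpfs_node_cfg /var/lib/ambari-server/resources ess-node1.example.com
# has_single_entry /var/lib/ambari-server/resources/shared_gpfs_node.cfg && echo "one entry"
# getent hosts ess-node1.example.com   # checks that the name resolves, e.g. via /etc/hosts
```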

Installing the Ambari Integration Module:

1) Download and untar the Ambari Integration module into a directory on the Ambari server node. The directory contains the following files:

  • SpectrumScaleIntegrationPackageInstaller-2.4.2.0.bin
  • SpectrumScaleMPackInstaller.py
  • SpectrumScaleMPackUninstaller.py
  • SpectrumScale_UpgradeIntegrationPackage-BI425 (Required for IOP to HDP migration)

2) Stop all the services from Ambari: log in to Ambari -> Actions -> Stop All

3) Run the installer bin script and accept the license. It prompts for a few inputs, which you have to enter.

cd <dir where you have extracted the tar>
./SpectrumScaleIntegrationPackageInstaller-2.4.2.0.bin

Once the Ambari Integration Module is installed, you can proceed to adding the Spectrum Scale service.

Adding IBM Spectrum Scale Service:

1) Log in to Ambari. Click Actions -> Add Service

2) On the Choose Services page, select "Spectrum Scale" and click Next.

3) On the Assign Masters page, select where the GPFS Master has to be installed and click Next.

NOTE: The GPFS Master has to be installed on the same node as the Ambari server.

4) On the Assign Slaves and Clients page, select the nodes where GPFS nodes have to be installed. At a minimum, it is recommended to install GPFS nodes on the nodes where the Namenode(s) and Datanode(s) are running. Click Next when you are done selecting.

5) On the Customize Services page:

  1. You will be prompted to enter AMBARI_USER_PASSWORD and GPFS_REPO_URL, which are the Ambari password and the repo directory where the IBM Spectrum Scale RPMs are located, respectively.
  2. If you are using a local repository, copy the HDFS transparency package downloaded in step 1 of the pre-requisites into the directory that holds the other RPMs and run 'createrepo .' in that directory.
  3. Check that GPFS Cluster Name, GPFS quorum nodes, and GPFS File system name are populated with the existing ESS cluster details.
  4. Change the value of "gpfs.storage.type" to "shared".
  5. Ensure that gpfs.supergroup is set to "hadoop,root".
  6. Click Next after you are done.
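For the local repository case in item 2 above, the copy-and-reindex step might look like the following sketch. add_package_to_local_repo is a hypothetical helper, and the paths in the usage comment are placeholders; the createrepo invocation is the one quoted in the step.

```shell
# add_package_to_local_repo is a hypothetical helper for illustration.
# $1: path to the downloaded HDFS transparency package
# $2: local repository directory that already holds the other RPMs
add_package_to_local_repo() {
    cp "$1" "$2/" || return 1
    # Regenerate the repository metadata so that yum can see the new package.
    (cd "$2" && createrepo .)
}

# Example (placeholder paths):
# add_package_to_local_repo /tmp/<hdfs-transparency-package>.rpm /var/www/html/repos/gpfs
```

GPFS_REPO_URL should then point at the URL under which that repository directory is served.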

6) If it is a Kerberized environment, you have to Configure Identities and click Next.

7) On the Review page, check the URLs and click Deploy.

8) Complete the rest of the installation process by clicking Next.

9) Restart the Ambari server by running the command below on the Ambari server node.

ambari-server restart

Note: Do not restart the services before restarting the Ambari server.

Post Installation Steps:

1) Log in to Ambari and set the HDFS replication factor to 1.

HDFS -> Configs -> Advanced -> General -> Block Replication

2) Start all the services again: Actions -> Start All
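Once the services are running with the new configuration, the replication setting can be double-checked from a cluster node. The sketch below assumes that the Block Replication field in Ambari corresponds to the dfs.replication property and that the hdfs client is on the PATH; the helper names are hypothetical.

```shell
# Hypothetical helpers for illustration only.

# Prints the value of dfs.replication as seen by the local HDFS client config.
replication_factor() {
    hdfs getconf -confKey dfs.replication 2>/dev/null
}

# Succeeds only when the reported replication factor is exactly 1.
check_replication_is_one() {
    value=$(replication_factor)
    if [ "$value" = "1" ]; then
        echo "OK: dfs.replication is 1"
    else
        echo "WARN: dfs.replication is '${value}', expected 1"
        return 1
    fi
}

# Example:
# check_replication_is_one
```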

3) Once all the services are up, you may see a "Namenode Last Checkpoint" alert on HDFS. This happens because HDFS Transparency does not perform checkpointing, since IBM Spectrum Scale is stateless. You can therefore disable the alert.

Click on the Alert -> Disable.

Additional References:

https://developer.ibm.com/storage/2017/06/16/top-five-benefits-ibm-spectrum-scale-hortonworks-data-p...

https://www.redbooks.ibm.com/redpapers/pdfs/redp5448.pdf

https://community.hortonworks.com/content/kbentry/108565/ibm-spectrum-scale-423-certified-with-hdp-2...

Hope this helps 🙂
