Introduction

This KB article describes how to reconfigure the DataHub cluster to use a larger VM for the master node (vertical scaling) in AWS.

Considerations and Limitations

  • Ensure that you choose an instance type that has been tested by Cloudera. For a list of appropriate instance types, please consult the Cloudera-supported instance types and their pricing.
  • Vertical scaling isn't fully supported in CDP.
  • The changed instance type will not be reflected on the Hardware page for the Data Hub; the page continues to show the original setting.

Approach

Reconfiguration of the DataHub can be done in place for existing DataHubs. For new DataHubs, we recommend that you use the instance type selector from the Advanced menu to pick the desired instance type.

The DataHub will be stopped during the reconfiguration. Expect an outage that lasts for about 4 hours.

  • Stop the DataHub cluster
  • Change Data Hub instance type
  • Change the Data Hub root volume size (optional)
  • Start the DataHub
  • Validate the DataHub

Stop the DataHub Cluster

  1. Locate the DataHub on the Management Console.
  2. Invoke the Stop DataHub action.
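
The stop step can also be scripted. The following is a minimal sketch, assuming the CDP CLI (`cdp`) is installed and configured for your environment; the cluster name `my-datahub` is a hypothetical placeholder, and the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the CDP CLI ("cdp") is installed and configured.

# Ask CDP to stop the Data Hub. The call returns before the stop completes.
stop_datahub() {
  cdp datahub stop-cluster --cluster-name "$1"
}

# Print the cluster description; inspect its status field to confirm the stop.
describe_datahub() {
  cdp datahub describe-cluster --cluster-name "$1"
}

# Usage ("my-datahub" is a hypothetical cluster name):
#   stop_datahub my-datahub
#   describe_datahub my-datahub
```

Wait until the cluster reports that it is fully stopped before changing anything in AWS.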

Change Instance Type

Note: Ensure that you choose an instance type that has been tested by Cloudera. For a list of appropriate instance types, please consult the Data Hub - AWS Instances list.

Change the instance type for the master node of the Data Hub as follows:

  1. Using the Management Console for the Data Hub, select the Hardware tab and locate the instance ID for the master node.
  2. Using AWS Console→EC2→Instances, locate the instance that matches the instance ID for the master node.
  3. Select the instance, then select Actions→Instance settings→Change instance type.
  4. Choose an instance type supported by Cloudera. This example doubles the instance size from m5.2xlarge to m5.4xlarge; you could also choose m5.8xlarge, depending on your needs. Stay within the same instance family during the procedure.
  5. Click Apply. Check that the instance type has been changed to your desired choice.
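
The same change can be sketched with the AWS CLI instead of the console. This is an illustration under assumptions, not a Cloudera-documented procedure: it assumes the AWS CLI is configured for the right account and region, and the helper names (`instance_family`, `instance_field`, `change_instance_type`) and the instance ID in the usage comment are hypothetical. The sketch refuses to proceed unless the instance is stopped and the new type is in the same family.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the AWS CLI ("aws") is configured for the right account and region.

# Extract the family from an instance type, e.g. m5.4xlarge -> m5.
instance_family() {
  echo "${1%%.*}"
}

# Read one attribute of an instance with a JMESPath query.
instance_field() {
  aws ec2 describe-instances --instance-ids "$1" \
    --query "Reservations[0].Instances[0].$2" --output text
}

# Change the type of a stopped instance, staying within the same family.
change_instance_type() {
  local id="$1" new_type="$2" state current
  state=$(instance_field "$id" "State.Name")
  current=$(instance_field "$id" "InstanceType")
  if [ "$state" != "stopped" ]; then
    echo "Instance $id is '$state'; it must be stopped first." >&2
    return 1
  fi
  if [ "$(instance_family "$current")" != "$(instance_family "$new_type")" ]; then
    echo "Refusing to switch family: $current -> $new_type" >&2
    return 1
  fi
  aws ec2 modify-instance-attribute --instance-id "$id" \
    --instance-type "{\"Value\": \"$new_type\"}"
}

# Usage (hypothetical instance ID):
#   change_instance_type i-0123456789abcdef0 m5.4xlarge
#   instance_field i-0123456789abcdef0 InstanceType
```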

Change the Size of the Root Volume (optional)

The root volume size may be too small for production usage. You can resize it using the following procedure.

  1. Using the AWS Console, go to the instance ID you want to change. Make sure you're selecting the instance for the Master node.
  2. Click on the Storage tab. Click on the root volume (usually the first volume attached to the VM).
  3. Select Actions→Modify Volume to open the Modify Volume dialog.
  4. In the Modify Volume dialog box, select a new size for the root volume. Click the Modify button.
  5. Verify that the volume has been resized by clicking on the reload button at the top of the page.
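
For reference, the root volume resize might look like this with the AWS CLI. This is a sketch under assumptions: the AWS CLI is configured, the helper names are hypothetical, and the instance ID and the 200 GiB target in the usage comment are placeholders. Because EBS volumes can only grow, the sketch rejects a size that is not larger than the current one.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the AWS CLI ("aws") is configured for the right account and region.

# Look up the EBS volume attached as the root device of an instance.
root_volume_id() {
  local id="$1" root_dev
  root_dev=$(aws ec2 describe-instances --instance-ids "$id" \
    --query 'Reservations[0].Instances[0].RootDeviceName' --output text)
  aws ec2 describe-volumes \
    --filters "Name=attachment.instance-id,Values=$id" \
              "Name=attachment.device,Values=$root_dev" \
    --query 'Volumes[0].VolumeId' --output text
}

# Grow the root volume to a new size in GiB. EBS volumes cannot be shrunk.
resize_root_volume() {
  local id="$1" new_size="$2" vol current
  vol=$(root_volume_id "$id")
  current=$(aws ec2 describe-volumes --volume-ids "$vol" \
    --query 'Volumes[0].Size' --output text)
  if [ "$new_size" -le "$current" ]; then
    echo "New size ${new_size} GiB must exceed current ${current} GiB." >&2
    return 1
  fi
  aws ec2 modify-volume --volume-id "$vol" --size "$new_size"
}

# Report the state of the latest modification of a volume.
modification_state() {
  aws ec2 describe-volumes-modifications --volume-ids "$1" \
    --query 'VolumesModifications[0].ModificationState' --output text
}

# Usage (hypothetical instance ID and target size):
#   resize_root_volume i-0123456789abcdef0 200
#   modification_state "$(root_volume_id i-0123456789abcdef0)"
```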

Start DataHub Cluster

Use the Management Console to start the DataHub cluster.

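If you prefer the command line, starting the cluster and waiting for it to come back could be sketched as follows. Several things here are assumptions rather than facts from this article: that the CDP CLI is installed, that the first `"status"` key in the `describe-cluster` JSON output is the cluster-level status, and that a running cluster reports `AVAILABLE`. Verify these against your own CLI output before relying on the loop.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the CDP CLI ("cdp") is installed and configured.

# Ask CDP to start the Data Hub. The call returns before the start completes.
start_datahub() {
  cdp datahub start-cluster --cluster-name "$1"
}

# Extract a status value from the describe-cluster JSON.
# Assumption: the first "status" key in the output is the cluster-level status.
cluster_status() {
  cdp datahub describe-cluster --cluster-name "$1" \
    | grep -o '"status": *"[A-Z_]*"' | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll until the cluster reports AVAILABLE (assumed value) or the tries run out.
# Arguments: cluster name, number of tries (default 60), delay in seconds (default 30).
wait_until_available() {
  local name="$1" tries="${2:-60}" delay="${3:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$(cluster_status "$name")" = "AVAILABLE" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage ("my-datahub" is a hypothetical cluster name):
#   start_datahub my-datahub && wait_until_available my-datahub
```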

Validate the DataHub

The simplest way to validate the changes is to SSH into the VM(s) that you changed and check the memory, CPU, and disk space with the appropriate Linux commands:

> free -h
              total        used        free      shared
Mem:            61G        7.3G         53G         59M
Swap:            0B          0B          0B

> lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8
...

> df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         31G     0   31G   0% /dev
tmpfs            31G  336K   31G   1% /dev/shm
tmpfs            31G   17M   31G   1% /run
tmpfs            31G     0   31G   0% /sys/fs/cgroup
/dev/nvme0n1p1  100G   20G   81G  20% /
/dev/nvme1n1     99G  279M   94G   1% /dbfs
/dev/nvme2n1    246G  384M  234G   1% /hadoopfs/fs1
cm_processes     31G   14M   31G   1% /run/cloudera-scm-agent/process
tmpfs           6.2G     0  6.2G   0% /run/user/1001
tmpfs           6.2G     0  6.2G   0% /run/user/0
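
If you validate several nodes, the manual checks can be wrapped in a small script. The sketch below is illustrative only: the function names are hypothetical, it relies on Linux tools (`nproc`, `/proc/meminfo`, GNU `df`), and the thresholds in the usage comment are loosely based on the sample output above rather than on any official sizing.

```shell
#!/usr/bin/env bash
# Sketch only: relies on Linux tools (nproc, /proc/meminfo, GNU df).

# Compare an actual value against a minimum and report the result.
check_at_least() {
  local label="$1" actual="$2" minimum="$3"
  if [ "$actual" -ge "$minimum" ]; then
    echo "OK   $label: $actual (minimum $minimum)"
  else
    echo "FAIL $label: $actual (minimum $minimum)"
    return 1
  fi
}

# Check vCPU count, memory (GiB), and root filesystem size (GiB) on this node.
validate_node() {
  local min_cpus="$1" min_mem_gib="$2" min_root_gib="$3" failed=0
  local cpus mem_gib root_gib
  cpus=$(nproc)
  mem_gib=$(awk '/^MemTotal:/ {printf "%d", $2 / 1024 / 1024}' /proc/meminfo)
  root_gib=$(df -BG --output=size / | tail -n 1 | tr -dc '0-9')
  check_at_least "vCPUs" "$cpus" "$min_cpus" || failed=1
  check_at_least "memory GiB" "$mem_gib" "$min_mem_gib" || failed=1
  check_at_least "root GiB" "$root_gib" "$min_root_gib" || failed=1
  return "$failed"
}

# Usage, with thresholds loosely based on the sample output above:
#   validate_node 16 60 100
```

The memory threshold is set below the nominal instance memory because the kernel reports slightly less than the advertised size.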

 
