Cloudera Data Analytics (CDA) Articles


Summary

Cloudera SMEs agree that managing and balancing both HDFS and Kudu storage services on the same cluster can be complex. If you are experiencing capacity issues across your worker node layer that stem from a wide range of hardware heterogeneity, or from a Kudu rebalancer that pushes worker nodes past 100% capacity, here are some best practices to consider.

Investigation

Rebalancing Characteristics - HDFS & Kudu

First, let’s recognize how the two storage services differ with respect to rebalancing.

HDFS

HDFS is a well-developed service that considers the total capacity of each worker node when it calculates that node's utilization. It also prevents a worker node from being written to once it reaches 95% utilization. This means that at a worker node level:

  • The total utilized vs. available capacity of a worker is used during rebalancing activities
  • Above 95%, the HDFS DataNode will not accept any further writes until utilization is brought back below 95%.

Kudu

Kudu does not currently consider the total capacity of the worker node, nor does it prevent a worker node from being written to at any point. This means that at a worker node level:

  • The total utilized capacity of a worker is NOT used during rebalancing activities
  • A Kudu Tablet Server WILL be allowed to breach 100% capacity

During a Kudu rebalance, the amount of data placed onto each Tablet Server is calculated as:

Worker Node capacity utilized by Kudu = Total capacity used within Kudu / Number of Kudu Tablet Servers

Rebalancing Characteristics - Summary

Because of the differing rebalancing characteristics, managing both storage services in the same cluster can be complex.

Extreme Example

If you have enough data within Kudu that the balancing logic needs to place 20TB of data on each tablet server, and a worker node only has 25TB of total capacity, that worker node is already 80% full before HDFS even tries to store data on the same node.
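
The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration of the formula above: the function names, the 200TB / 10 Tablet Server cluster, and the extra 50TB and 15TB node sizes are assumptions chosen so that the per-server share comes out at 20TB.

```python
def kudu_data_per_tablet_server(kudu_used_tb: float, tablet_servers: int) -> float:
    """Even share of Kudu data per Tablet Server (node capacity is ignored)."""
    return kudu_used_tb / tablet_servers


def node_utilization_from_kudu(per_ts_tb: float, node_capacity_tb: float) -> float:
    """Fraction of a worker node's total capacity consumed by Kudu alone."""
    return per_ts_tb / node_capacity_tb


# Hypothetical cluster: 200 TB held in Kudu across 10 Tablet Servers.
per_ts = kudu_data_per_tablet_server(200.0, 10)  # 20 TB on every node

# The same 20 TB lands on every node, whatever its size.
for capacity_tb in (50.0, 25.0, 15.0):
    pct = node_utilization_from_kudu(per_ts, capacity_tb) * 100
    print(f"{capacity_tb:.0f} TB node: Kudu alone uses {pct:.0f}% of capacity")
```

The 15TB node shows how a small node in a heterogeneous cluster can be pushed past 100% by the Kudu share alone.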

 

That calls for an assessment of the Disk Configuration strategy.

Choosing HDFS & Kudu Disk Configuration

One of the first things to consider when you have HDFS & Kudu in the same cluster is whether or not to stripe the data from both services across all disks. Consider two options for a 24-disk worker node within a cluster that is running both an HDFS DataNode and a Kudu Tablet Server.

Option#1 - Assign a subset of disks to HDFS and Kudu

For example, assign:

 

  • /data/1/dfs through to /data/12/dfs
  • /data/13/kudu through to /data/24/kudu

This enforces the following characteristics:

  • The capacity is separated between the services
  • The capacity that is available to HDFS & Kudu is rigid (50/50 in this example)
  • You must maintain separate processes for replacing failed disks, depending on whether the disk is part of HDFS or Kudu

Option#2 - Assign all disks to HDFS and Kudu

For example, assign:

  • /data/1/dfs; /data/1/kudu
  • /data/2/dfs; /data/2/kudu
  • /data/24/dfs; /data/24/kudu

This enforces the following characteristics:

  • The capacity is mixed between the services
  • The capacity that is available to HDFS & Kudu is dynamic, where either service has the ability to consume 100% of the capacity if allowed
  • A single process can be used for replacing failed disks as every disk is part of both HDFS & Kudu
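
To make the two layouts concrete, here is a small Python sketch that generates the directory lists for each option. It assumes the /data/N mount naming used in the examples above; the function names are hypothetical. The resulting comma-separated strings are the kind of value you would supply for the HDFS DataNode data directories (dfs.datanode.data.dir) and the Kudu Tablet Server data directories (fs_data_dirs), but check the exact settings for your own deployment.

```python
def option1_dirs(disks: int = 24, hdfs_disks: int = 12):
    """Option #1: the first `hdfs_disks` disks go to HDFS, the rest to Kudu."""
    hdfs = [f"/data/{i}/dfs" for i in range(1, hdfs_disks + 1)]
    kudu = [f"/data/{i}/kudu" for i in range(hdfs_disks + 1, disks + 1)]
    return hdfs, kudu


def option2_dirs(disks: int = 24):
    """Option #2: every disk carries both an HDFS and a Kudu directory."""
    hdfs = [f"/data/{i}/dfs" for i in range(1, disks + 1)]
    kudu = [f"/data/{i}/kudu" for i in range(1, disks + 1)]
    return hdfs, kudu


hdfs_dirs, kudu_dirs = option2_dirs()
print("HDFS DataNode data directories:", ",".join(hdfs_dirs))
print("Kudu Tablet Server data directories:", ",".join(kudu_dirs))
```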

Summary HDFS & Kudu Disk Configuration

Allow both HDFS & Kudu to utilize all disks, as this means:

  • Both services can grow (or shrink) without heavy redesign of the cluster
    • If we were to assign subsets of disks to either service, the capacity available to each service would be a hard limit that can only be changed by reconfiguration.
    • Deciding to assign one more disk per worker node to one service (and one fewer to the other) is non-trivial.
    • Such a change results in a full reconfiguration and restart of both HDFS and Kudu
    • Kudu is unable to rebalance between disks at the worker node level
  • Operational maintenance on failed disks can be part of a single unified process
    • This is in comparison to separating HDFS & Kudu at the disk level and needing to manage two separate processes to handle the variation in disk utilization

After choosing how to manage the physical disks at the hardware level, let’s now determine how to resolve the extreme example highlighted in Rebalancing Characteristics - Summary.

HDFS - Non-DFS Used Parameter

While reviewing what is possible within either the Kudu or HDFS service when they share the same servers and disks, we find that:

  • Kudu currently has no parameter or design option that makes it aware of the HDFS service, nor is it currently aware of total or available capacity at a Tablet Server level
  • HDFS can’t currently become ‘aware’ of Kudu either; however, HDFS does have the concept of ‘Reserved Space for Non DFS Use’

Non-DFS Used - Explained

Cloudera Manager describes this parameter as “Reserved space in bytes per volume for non-Distributed File System (DFS) use”.

"Non-DFS Used" is calculated by the following formula:

  • Non-DFS Used = Configured Capacity - Remaining Space - DFS Used

The following formula is also true:

  • Configured Capacity = Total Disk Space - Reserved Space

So, we need to expand our formula to fully calculate “Non DFS Used”:

  • Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used

Example

The simplest way to illustrate this is to think of a single disk. The logic for the single disk then scales out to any number of variations in the configuration of disks at the worker node layer.

Assume we have a 100GB disk where:

  • The reserved space within HDFS (dfs.datanode.du.reserved) is set to 30GB
  • Kudu, the system and other files use 40GB of the disk
  • HDFS uses 10GB

If you run df -h, you will see the available space is 50GB for that disk volume.

In HDFS web UI, it will show:

  • Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used
  • Non DFS Used = (100GB - 30 GB) - 50GB - 10 GB = 10 GB

You initially reserved 30GB for non-DFS usage, theoretically leaving the remaining 70GB for HDFS. However, actual non-DFS usage (40GB from Kudu, the system and other files) exceeds the 30GB reservation and consumes an additional 10GB of space that HDFS expects to have available.
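
The single-disk example can be checked with a short Python sketch of the formula. The function and variable names are illustrative only; the numbers are the ones used in the example above.

```python
def non_dfs_used(total_gb, reserved_gb, remaining_gb, dfs_used_gb):
    """Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used."""
    configured_capacity_gb = total_gb - reserved_gb
    return configured_capacity_gb - remaining_gb - dfs_used_gb


total_gb = 100        # size of the disk
reserved_gb = 30      # dfs.datanode.du.reserved for this volume
other_used_gb = 40    # Kudu, the system and other files
dfs_used_gb = 10      # HDFS blocks

# What df -h would report as available on the volume.
remaining_gb = total_gb - other_used_gb - dfs_used_gb

reported = non_dfs_used(total_gb, reserved_gb, remaining_gb, dfs_used_gb)
overage = other_used_gb - reserved_gb

print(f"df available:            {remaining_gb} GB")
print(f"Non DFS Used (reported): {reported} GB")
print(f"Usage beyond reserve:    {overage} GB")
```

Both the reported value and the overage come out at 10GB: in this example the reported Non DFS Used is exactly the non-DFS usage that has spilled over the reservation.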

 

It is also important to note that Non DFS Used is not a hard limit or quota, nor can it be made into one. All the same, this parameter and the analysis above confirm that it is a viable basis for a method/design for a mixed HDFS / Kudu cluster.

 

It’s now time to make some amendments to allow the HDFS and Kudu services to work together more harmoniously…

Resolution

Non DFS Used - Configuration

When using Kudu in a heterogeneous cluster, alongside HDFS, and sharing the disks between both services, “Reserved Space for Non DFS Use” is critical.


You must calculate the anticipated Kudu use per disk for the ideal scenario in which Kudu is 100% rebalanced, and then derive the Non DFS parameter value from that figure.


The first step is to retrieve the current used capacity of the Kudu service. Example:

  • Kudu Used Capacity = 552TB
  • 54 Tablet Servers

The calculation to identify the anticipated Kudu use per disk after a Rebalance:

  • Amount of data per Tablet Server after rebalance = 552TB / 54 = 10.22TB per Tablet Server
  • Taking some Kudu capacity growth into account, let’s use 12TB, i.e.
    • Use 12TB as the baseline for the non-DFS reservation per worker node
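
As a quick check of this step, the sketch below reproduces the per-Tablet-Server figure and shows how much growth the 12TB baseline allows. The variable names are illustrative; the 12TB baseline is the value chosen above, not something the code derives.

```python
kudu_used_tb = 552      # current Kudu used capacity
tablet_servers = 54     # number of Kudu Tablet Servers
baseline_tb = 12        # chosen reservation per worker node, including growth

# Even share per Tablet Server after an ideal rebalance.
per_ts_tb = kudu_used_tb / tablet_servers

# Headroom that the baseline leaves over the current share.
headroom_pct = (baseline_tb / per_ts_tb - 1) * 100

# Total Kudu data the cluster can hold before the baseline is exceeded.
kudu_ceiling_tb = baseline_tb * tablet_servers

print(f"Per Tablet Server after rebalance: {per_ts_tb:.2f} TB")
print(f"Growth headroom at {baseline_tb} TB: {headroom_pct:.1f}%")
print(f"Kudu capacity covered by the baseline: {kudu_ceiling_tb} TB")
```

Once Kudu grows past the covered capacity (648TB in this example), the reservation needs to be recalculated.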

Some key points:

  • Within your cluster, you may have some nodes that have 7 disks, some that have 9 disks, some that have 11, some that have 12, and some that have 24
  • Kudu has no awareness of total disk capacity (see the Kudu section above)
    • This means we can discard total disk capacity as a consideration whilst working out what values to set
  • You must have HDFS Role Groups set up accurately.
  • You must have host-level Rack Awareness set up (see Specifying Racks for Hosts)

After confirming that the Role Groups are correctly configured, you can use the following calculation examples to determine the Non DFS used values you need for your own cluster:


  • 24 Disk Role Group with 12TB reserved
    • 12TB / 24 Disks = 0.5TB, so Non DFS reserved per disk = 512GB
  • 12 Disk Role Group with 12TB reserved
    • 12TB / 12 Disks = 1TB, so Non DFS reserved per disk = 1024GB
  • 11 Disk Role Group with 12TB reserved
    • 12TB / 11 Disks = 1.09TB, so Non DFS reserved per disk = 1116GB
  • 9 Disk Role Group with 12TB reserved
    • 12TB / 9 Disks = 1.33TB, so Non DFS reserved per disk = 1362GB
  • 7 Disk Role Group with 12TB reserved
    • 12TB / 7 Disks = 1.71TB, so Non DFS reserved per disk = 1751GB
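
The per-disk values can be generated for any Role Group with a short Python sketch. It assumes binary units (1TB = 1024GB), which is what the 0.5TB = 512GB line implies, and converts to bytes because the parameter is defined in bytes per volume. The function names are hypothetical. Note that the GB figures listed above for 11, 9 and 7 disks were derived from the TB value rounded to two decimals, so they come out a few GB lower than the unrounded result.

```python
import math

GB_PER_TB = 1024
BYTES_PER_GB = 1024 ** 3


def reserved_per_disk_gb(reserved_tb: float, disks: int) -> float:
    """Per-disk share of the node-level reservation, without intermediate rounding."""
    return reserved_tb * GB_PER_TB / disks


def rounded_tb_value_gb(reserved_tb: float, disks: int) -> int:
    """Same calculation, but rounding the TB figure to 2 decimals first."""
    return round(round(reserved_tb / disks, 2) * GB_PER_TB)


def to_bytes(gb: float) -> int:
    """Bytes per volume, rounded up so the full reservation is covered."""
    return math.ceil(gb * BYTES_PER_GB)


for disks in (24, 12, 11, 9, 7):
    exact_gb = reserved_per_disk_gb(12, disks)
    print(
        f"{disks:>2} disks: {exact_gb:8.2f} GB unrounded, "
        f"{rounded_tb_value_gb(12, disks)} GB from rounded TB, "
        f"{to_bytes(exact_gb)} bytes"
    )
```

Whichever rounding you choose, apply it consistently across Role Groups so that each node reserves roughly the same 12TB in total.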

Real World Non DFS Used Example Configuration - Screenshot

 

MichaelBush_0-1686390692970.png

Cluster Wide HDFS Capacity Example

To highlight what you would see if the capacity assigned to Kudu (via the HDFS Non DFS Used configuration) were less than the capacity Kudu actually uses, refer to the following screenshot example:

MichaelBush_1-1686390692969.png

The illustration shows that 52.9TB of Non DFS use is present in the cluster beyond the configured settings. This is a useful sign that the Non DFS Used configuration needs further tuning: it is an early indicator that the data in your Kudu service is growing beyond the original tuning or design.

Performing Rebalancing Activities

Once you have worked through the activities in this blog to reconsider how your HDFS and Kudu services are configured, you then need to rebalance both HDFS and Kudu.

Kudu

Go to CM - Kudu - Actions - Run Kudu Rebalancer Tool:

MichaelBush_2-1686390692968.png

HDFS

Go to CM - HDFS - Actions - Rebalance:

MichaelBush_3-1686390692971.png

Summary - Key Notes

Some key notes about performing the rebalancing activities after setting the services/disks up:


  • Never run the HDFS and Kudu Rebalancers at the same time
    • Contention between the two may cause issues
  • Perform the rebalancing activities in order: Kudu first, HDFS second
    • This is because Kudu is unable to track capacity utilization
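
If you script the rebalancing rather than using the Cloudera Manager actions, the ordering rule can be expressed as a simple sequential plan. The sketch below is an assumption-laden illustration, not the procedure used in this article: it builds the command lines for the Kudu CLI (kudu cluster rebalance) and the HDFS balancer (hdfs balancer) and runs them strictly one after the other. The master addresses and the threshold value are placeholders, and the commands would need to run as users with the appropriate privileges.

```python
import subprocess


def rebalance_plan(kudu_masters, hdfs_threshold_pct=10):
    """Return the rebalance commands in the required order: Kudu first, HDFS second."""
    kudu_cmd = ["kudu", "cluster", "rebalance", ",".join(kudu_masters)]
    hdfs_cmd = ["hdfs", "balancer", "-threshold", str(hdfs_threshold_pct)]
    return [kudu_cmd, hdfs_cmd]


def run_plan(plan):
    """Run each command to completion before starting the next (never in parallel)."""
    for cmd in plan:
        # check=True stops the plan if a step fails, so HDFS is never
        # rebalanced on top of a failed or partial Kudu rebalance.
        subprocess.run(cmd, check=True)


# Hypothetical master addresses; only prints the plan, does not execute it.
plan = rebalance_plan(["master-1:7051", "master-2:7051", "master-3:7051"])
for step, cmd in enumerate(plan, start=1):
    print(step, " ".join(cmd))
```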