Introduction
Cloudera Operational Database (COD) is a service that runs on the Cloudera Data Platform (CDP). COD enables you to create a new operational database that automatically scales based on your workload. To deploy high-performance applications at scale, a rugged operational database is essential. COD is a high-performance and highly scalable operational database designed for powering the biggest data applications on the planet at any scale. Powered by Apache HBase and Apache Phoenix, COD ships out of the box with CDP in the public cloud. It is also ready for hybrid and multi-cloud deployments to meet your business where it is today, whether on AWS, Microsoft Azure, or GCP.
Support for cloud storage is an important capability of COD that, in addition to the pre-existing support for HDFS on local storage, offers customers a choice of price-performance characteristics. Please refer to the blog for more information on the performance differences between COD on HDFS and COD on cloud storage with ephemeral cache (Amazon AWS and Microsoft Azure).
To understand how COD delivers the most cost-efficient performance for your applications, let's dive into benchmarking results comparing COD on the different cloud storage options.
Methodology
The tests were performed on a data set created using the Yahoo! Cloud Serving Benchmark (YCSB) test framework on AWS. YCSB is an open-source benchmarking suite for performance evaluations. It is frequently used to measure the performance of multi-node database systems on the public cloud and other distributed infrastructure.
In this performance evaluation, a large dataset of 20TB was generated and backed up to an S3 bucket for later use. The same data was then exported to Azure and GCP so that the performance tests ran against identical data on all three platforms for a fair comparison.
This article measures the performance differences of COD with ephemeral cache across Amazon AWS, Microsoft Azure, and Google GCP. It does not evaluate the performance of cloud storage, local disks, or block storage independently.
Dataset
The details of the dataset used for these performance tests are as follows:
Data size: 20TB
Number of rows in the table: 20 billion
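For reference, a dataset of this shape can be generated with a YCSB load phase along the lines of the sketch below. This is illustrative, not the exact command used in these tests: the hbase20 binding, table name, and thread count are assumptions, while the record count matches the 20 billion rows above (with YCSB's default 1KB records, 20 billion rows is roughly 20TB).
# Load the dataset through the YCSB HBase binding; the target table and
# column family must already exist (ideally pre-split across region servers).
bin/ycsb load hbase20 -P workloads/workloada \
  -p table=usertable \
  -p columnfamily=family \
  -p recordcount=20000000000 \
  -threads 100 -s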
Environment
AWS
No. of master nodes: 2 (m5.2xlarge)
No. of leader nodes: 1 (m5.2xlarge)
No. of gateway nodes: 1 (m5.2xlarge)
No. of worker nodes: 20 (i3.2xlarge) (Storage as S3)
Azure
No. of master nodes: 2 (Standard_D8a_V4)
No. of leader nodes: 1 (Standard_D8a_V4)
No. of gateway nodes: 1 (Standard_D8a_V4)
No. of worker nodes: 20 (Standard_L8s_V2) (Storage as ABFS)
GCP
No. of master nodes: 2 (n2-standard-8)
No. of leader nodes: 1 (n2-standard-8)
No. of gateway nodes: 1 (e2-standard-8)
No. of worker nodes: 20 (n2-standard-16) (Storage as GCS)
YCSB details
The tests were run using the YCSB tool. The details are given below:
Performance benchmarking was done using the following YCSB workloads:
YCSB Workload A
Update heavy workload
50% read, 50% write
YCSB Workload C
100% read
YCSB Workload F
Read-Modify-Update workload
50% read, 25% update, 25% read-modify-update
The following parameters were used to run the workloads using YCSB:
Each workload was run for 15 min (900 secs) in the following order:
Workload C - a warm-up run to warm the cache for the subsequent workload runs
Workload A
Workload C
Workload F
Sample set for running the workloads
1 billion rows
100 million batch
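For reference, each timed run follows the pattern sketched below. The hbase20 binding, table name, and thread count are assumptions; the 900-second cap matches the 15-minute runs, and the operation count matches the 1-billion-row sample set described above.
# Run Workload A for 15 minutes against the pre-loaded table; repeat with
# workloads/workloadc and workloads/workloadf for the other runs.
bin/ycsb run hbase20 -P workloads/workloada \
  -p table=usertable \
  -p columnfamily=family \
  -p recordcount=20000000000 \
  -p operationcount=1000000000 \
  -p maxexecutiontime=900 \
  -threads 100 -s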
Results
The charts below show the comparison between AWS, Azure, and GCP with 100% ephemeral cache warm-up. This ensures that most of the blocks are in the cache.
The charts below show the time taken to warm up the cache for COD on Amazon AWS and COD on GCP. COD on AWS was observed to take about twice as long to warm up the cache as COD on GCP.
Cache warmup on COD on AWS
Cache warmup on COD on GCP
The following chart shows the comparison between some key performance indicators on AWS, Azure, and GCP cloud platforms:
Performance comparison of COD with Ephemeral cache on Amazon AWS vs. Microsoft Azure vs. GCP
The following chart shows the average throughput observed while running the YCSB tests. The average throughput of HBase on Google GCS was higher than that observed on Amazon AWS and Microsoft Azure across the different workload types, so HBase with Google GCS gives the best overall performance of the three cloud providers.
Average throughput comparison between Amazon S3 vs. Microsoft ABFS vs. Google GCS
The following chart shows the latency observed while running the workloads involving reads.
The results show that HBase with Google GCS has better latencies than Amazon AWS and Microsoft Azure for the read-only workload (Workload C), while the latencies are comparable for a mixed workload such as Workload A.
Read latency comparison between Amazon S3 vs. Microsoft ABFS vs. Google GCS
The following chart shows the latency observed while running workloads involving writes.
The results show that the write latency of HBase with Google GCS is better than that of HBase with Amazon AWS and Microsoft Azure by a large margin.
Write latency comparison between Amazon S3 vs. Microsoft ABFS vs. Google GCS
Summary
The above comparison shows that GCP with GCS performs better than Amazon AWS and Microsoft Azure, with higher overall throughput and better read/write latencies while running the workloads. The write latencies for GCP with GCS were far better than on the other two platforms, owing to the performance of the underlying block storage on GCP.
References
A similar performance experiment was performed to compare the performance of COD running on HDFS vs. COD running on cloud storage provided by Amazon AWS and Microsoft Azure. The details of these experiments can be found in the blog titled Cloudera Operational Database (COD) Performance Benchmarking: Comparing HDFS and Cloud Storage.
A detailed description of how to run YCSB for HBase can be found in the blog titled How to run YCSB for HBase.
Visit the product page to learn more about the Cloudera Operational Database or reach out to your account team.
Introduction
Apache HBase provides different caching capabilities. In Cloudera Operational Database (COD) deployments using cloud storage, we leverage the file-based BucketCache implementation, deployed over ephemeral SSD volumes, to cache the whole user dataset so that client reads do not have to reach the slower cloud storage layer. However, the file-based BucketCache implementation was originally volatile, meaning the cache was not retained upon region server restarts. This requires a warm-up period every time a region server is restarted in order to return to optimal read performance. Depending on the data size and the configured cache size, this warm-up can take anywhere from a few minutes to a few hours.
To eliminate this, the Cloudera team and the HBase community implemented the BucketCache persistence feature (HBASE-27264/HBASE-27486), where the region servers periodically persist an index of the blocks cached in the bucket cache. This persisted information is then used to recover the cache in the event of a region server restart (either a normal restart or a crash).
Caching is critical for the performance of COD clusters using cloud storage, but making sure the client request load is evenly distributed among region servers is equally important, so we needed to make the built-in HBase balancer "aware" of cache usage when deciding to move regions around. HBase's current default balancer, the StochasticLoadBalancer, does not consider the caching state. This can cause regions already "fully" cached on one region server to be moved to another region server, where the cached data has to be thrown away and the region cached afresh.
Meet the new CacheAwareLoadBalancer (HBASE-27389). It is designed to consider the cache allocation of each region on the region servers when calculating a new assignment plan: it uses the cache allocation information reported by the region servers to calculate the percentage of data cached for each region, and then uses that as a factor when deciding on an optimal new plan. The HBase master captures the cache information from all the region servers and uses it to decide region assignments while ensuring minimal impact on the warmed-up cache. A region is assigned to the region server where it has a better cache ratio compared to the region server where it is currently hosted.
Implementation
The CacheAwareLoadBalancer uses two cost elements for deciding the region allocation, described below:
Cache Cost
The cache cost is calculated as the percentage of a region's data cached on the region server where it is either currently hosted or was previously hosted. A region may have multiple HFiles, each of a different size, and an HFile is considered fully cached when all of its data blocks are in the cache. The region server hosting a region calculates the ratio of the size of its HFiles cached in the bucket cache to the total size of its HFiles. This ratio varies from 0 (the region is hosted on this server, but none of its HFiles are cached in the bucket cache) to 1 (the region is hosted on this server and all of its HFiles are cached in the bucket cache). Every region server maintains this information for all the regions it currently hosts.
In addition, this cache ratio is also maintained for regions that were previously hosted on the region server, giving historical information about those regions for as long as their blocks have not been evicted.
Skewness Cost
The skewness cost is calculated from the number of regions hosted on each region server in the cluster. It varies from 0 (regions are equally distributed across the region servers) to 1 (regions are unevenly distributed across the region servers).
The balancer combines these two costs to calculate the overall cost of maintaining the current distribution of regions in the cluster, and attempts to rebalance the cluster under the following conditions:
There is an idle server in the cluster. This can happen when an existing server is restarted or a new server is added to the cluster.
The cost of maintaining the current distribution is greater than the minimum threshold defined by the configuration "hbase.master.balancer.stochastic.minCostNeedBalance".
Configuration
The cluster can be made to use the CacheAwareLoadBalancer by setting the following configuration properties:
hbase.master.loadbalancer.class
Defines the load balancer class to be used in the cluster. The default load balancer is the StochasticLoadBalancer. The following setting makes the cluster use the CacheAwareLoadBalancer:
<property>
  <name>hbase.master.loadbalancer.class</name>
  <value>org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer</value>
</property>
hbase.bucketcache.persistent.path
Defines the local file where region servers persist the cache index information. When this configuration is set, the region servers periodically write the cache index to the given file and reinstate it on restart. The CacheAwareLoadBalancer relies on this information to decide on region assignments and will not work in the absence of this configuration.
<property>
  <name>hbase.bucketcache.persistent.path</name>
  <value>/path/to/cache-index-file</value>
</property>
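Putting this together, a minimal hbase-site.xml sketch for a region server using a persistent file-based bucket cache with the CacheAwareLoadBalancer might look like the following. The paths are illustrative, and the cache size (in MB here, roughly 1.6TB to match the experiment configuration below) is an assumption; hbase.bucketcache.ioengine and hbase.bucketcache.size are the standard BucketCache settings that the persistence feature builds on.
<!-- File-based bucket cache on a local (ephemeral) SSD volume -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>file:/mnt/cache/bucketcache.data</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1638400</value> <!-- cache size in MB (~1.6TB) -->
</property>
<!-- Persist the cache index so the cache survives region server restarts -->
<property>
  <name>hbase.bucketcache.persistent.path</name>
  <value>/mnt/cache/bucketcache.index</value>
</property>
<!-- Make the balancer consider the cache state when moving regions -->
<property>
  <name>hbase.master.loadbalancer.class</name>
  <value>org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer</value>
</property>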
Implications
The CacheAwareLoadBalancer calculates the region assignment plan considering how much of each region's data is already in the cache, reducing the need to re-cache regions and, with it, the IO to the underlying cloud storage for re-reading blocks after region movement. To achieve this, the balancer attempts to assign each region to the host where it has the most data in the cache. We conducted a few experiments to monitor region movement and its impact on the cache with this load balancer, and compared the results with the default StochasticLoadBalancer. The results of these experiments are summarized below.
Configuration
Number of worker nodes: 10
Java heap size per node: 31GB
Configured bucket cache size per node: 1.6 TB
Number of regions in the table: 20
Dataset size: 100GB
Table schema:
TestTable, {TABLE_ATTRIBUTES => {METADATA => {'hbase.store.file-tracker.impl' => 'FILE'}}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'info0', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536', METADATA => {'IN_MEMORY_COMPACTION' => 'NONE'}}
Use cases
We conducted the following experiments to compare the impact on the cache when the cluster runs the CacheAwareLoadBalancer versus the default StochasticLoadBalancer. In all of these experiments, the cache was 100% warmed up before the experiment started.
Single Region Server restart
In this experiment, a single region server in the cluster was restarted to observe the impact on the cache when the region server restarts and the balancer runs. The first chart below shows the impact on the cache with the StochasticLoadBalancer, while the second shows the impact with the CacheAwareLoadBalancer.
Impact on cache when a single region server is restarted with StochasticLoadBalancer
Impact on cache when a single region server is restarted with CacheAwareLoadBalancer
The results clearly show that there is minimal impact on the cache when a single region server is restarted. The CacheAwareLoadBalancer also attempts to reassign a region back to the region server that hosted it before the restart, to make use of the blocks already cached for that region there. This region transition is shown in the following charts. The first snippet shows the region assignments before the region server named worker6 was restarted (the highlighted region is hosted on worker6). Just after the restart, the region was immediately moved to worker3, as highlighted in the next chart. Once the restarted region server starts sending its cache information to the master again, the CacheAwareLoadBalancer uses it to determine whether the cluster needs balancing and generates a plan that takes the already-cached blocks into consideration, attempting to reassign the region back to the server that hosted it earlier. The final chart shows the region assignment after the balancer run: the region is assigned back to worker6.
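To observe this behavior on a test cluster, the region assignments and balancer runs can be inspected from the HBase shell, as sketched below; the table name follows the schema above, and both commands are standard HBase shell commands:
hbase> list_regions 'TestTable'    # shows each region and the region server currently hosting it
hbase> balancer                    # manually triggers a balancer run on the master
hbase> list_regions 'TestTable'    # confirms whether regions moved back to their original servers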
Multiple Region Servers restart
In this experiment, multiple region servers in the cluster were restarted simultaneously to observe the impact on the cache after the balancer run finished. The charts below show the impact on the cache when the cluster used the StochasticLoadBalancer and when it used the CacheAwareLoadBalancer.
Multiple region servers restarted with StochasticLoadBalancer
Multiple region servers restarted with CacheAwareLoadBalancer
Cluster restart
This experiment was conducted to observe the impact on the cache when the whole cluster is restarted with the cache fully warmed up. The first chart shows the impact on the cache with the StochasticLoadBalancer, while the second shows the impact with the CacheAwareLoadBalancer:
Cluster restart with StochasticLoadBalancer
Cluster restart with CacheAwareLoadBalancer
The charts show that the whole-cluster restart had no impact on the cache: the cache was immediately restored to its state prior to the restart.
Rolling restart
In this test, the cache was fully warmed up before a rolling restart of the cluster was performed. The first chart shows the impact on the cache on a cluster running the StochasticLoadBalancer, while the second shows the impact on a cluster running the CacheAwareLoadBalancer:
Rolling restart with StochasticLoadBalancer
Rolling restart with CacheAwareLoadBalancer
The charts show that there was minimal impact on the cache when the rolling restart was performed on the cluster running the CacheAwareLoadBalancer.
Conclusion
The CacheAwareLoadBalancer helps HBase retain cached data, reducing the need to read it back from the underlying high-latency cloud storage, while keeping the region distribution even across the cluster. Without this feature, read performance for COD with cloud storage can be considerably compromised, as the movement of regions results in more cache misses and higher latency for client reads. Disabling the balancer altogether could avoid such impacts on the cache, but that increases operational complexity, requiring constant monitoring and manual region movement to avoid region server hotspotting.