Arpit Agarwal, Shiv Moorthy, Sergio Gago Huerta

1. Executive Summary

1.1. Strategic Imperative and Market Position

Cloudera Object Store is the company’s next-generation, on-premises storage solution powered by Apache Ozone. Its development is a strategic imperative driven by the evolving needs of enterprise data platforms, particularly for analytics, artificial intelligence (AI), and machine learning (ML) workloads. Cloudera Object Store is architected from the ground up as a scalable, distributed object store. This design choice addresses the "small files problem" and single-namespace limitation of HDFS, enabling high-performance metadata operations, dense storage architectures, and support for a massive number of objects.

By seamlessly integrating with the broader Big Data analytics ecosystem and offering Amazon S3 API compatibility, Ozone is a foundational enabler for modern, hybrid-cloud data lake architectures. The open-source platform's ability to run on commodity hardware provides a cost-effective solution for enterprises seeking to manage exabyte-scale data without being locked into proprietary hardware stacks or cloud-only vendor ecosystems.

1.2. Key Findings and Opportunities

Apache Ozone is a rapidly evolving object storage platform strongly supported by Cloudera. It has gained significant traction within the enterprise sector, especially among large organizations running high-performance analytics stacks at massive scale. While Cloudera's core team drives a large percentage of the project's code changes, a robust community of contributors provides essential support and practical feedback.

Notable open source deployments at large companies like Didi China and Preferred Networks (Japan) validate Ozone's scalability and its capacity to manage immense data volumes and billions of files. These real-world applications also revealed specific performance and operational challenges, such as the necessity for fine-tuning for low-latency, metadata-intensive workloads. Cloudera has developed numerous product improvements and a specific knowledge base that it has made available to its customers to optimize the performance of their clusters. 

A feature-by-feature comparison (see section 4.5) with alternatives like AWS S3, Dell ObjectScale, MinIO, and Ceph shows that Ozone excels in its deep Big Data Analytics integration and unique features, such as enterprise-grade snapshots and native filesystem API support. Additional benchmarks across different versions may add detail, but the current data is already clear on performance metrics.

We at Cloudera are working with the Ozone community to develop Automated Data Lifecycle Management to optimize cost and efficiency, as well as improved operational tooling. In addition, we are working on bringing vectorized object storage to the enterprise.

1.3. Future Direction for Cloudera Object Store

Planned and in-progress product improvements directly target product usability and market competitiveness.

In addition, Cloudera recently demonstrated Vectorized AI Storage for the enterprise with Cloudera Object Store. The goals are to:

  1. Extend its S3-compatible API to support vector embeddings natively, similar to Amazon S3 Vectors.
  2. Treat vectors as first-class citizens in the object store (stored alongside documents, images, logs, or tabular data).
  3. Support S3-style APIs similar to AWS S3 (create, put, get, list, delete) for vectors and vector indexes in a vector bucket.
  4. Integrate with metadata, governance, and SDX for trusted AI pipelines.

These are just a few of the innovations in development to meet the evolving needs of the hybrid-cloud platform.

2. The Business and Market Landscape of Apache Ozone

2.1. Customer Adoption and Scale: A Nuanced View

The adoption of Apache Ozone within Cloudera's customer base has demonstrated significant and strategic growth that capitalizes on the company's many enterprise relationships. This adds to the extensive Ozone footprint of enterprises using the community edition.

Cloudera's data and AI platform now manages more than 26 exabytes of data on HDFS and a rapidly growing 1+ exabyte on Ozone. Several major customers in the financial, telecom, and media verticals have committed to Ozone.

Ozone has evolved into a mission-critical component for Cloudera's largest enterprise clients. This adoption also demonstrates the value proposition of Ozone as a seamless upgrade path for HDFS users, and it reflects a high degree of confidence among long-term enterprise clients who are expanding their use of the platform and migrating data to it.

2.2. The Open Source Ecosystem

Apache Ozone's development model is a prime example of a healthy, enterprise-backed open-source project. A significant number of the committers are from Cloudera. Cloudera's dedicated team helps drive core architectural changes, addresses major customer-reported issues, and implements the strategic features outlined in the product roadmap.

The broader community's contributions are equally valuable. They represent real-world use cases and provide essential scale improvements, bug fixes, documentation improvements, and fresh perspectives, all of which increase the product's overall quality and reliability.

This model provides enterprise customers with the assurance that a committed team is guiding the project's long-term vision, while also benefiting from the collective innovation and battle-testing of a diverse, global community.

2.3. Planned Product Investments and Priorities

The Apache Ozone product roadmap is a direct and transparent response to the challenges faced by its customers. Key priorities include improving Erasure Coding performance and manageability, and addressing operational pain points with features such as Rapid Deployment, Auto-Tuning Configuration, and Zero-downtime upgrades (ZDU), which will enable customers to upgrade more regularly with higher confidence.

3. Real-World Implementations and Apache Ozone Community Case Studies

3.1. Didi: Scaling to Exabytes with Custom Optimizations

Didi is a large Chinese technology company that operates a comprehensive transportation platform, primarily known for its ride-hailing services but also offering a wide array of options including bike sharing and food delivery.

Didi's deployment of Apache Ozone serves as a powerful testament to the platform's scalability and its ability to handle immense, real-world workloads. For more than two years, Ozone has been running in production at Didi, where it manages hundreds of petabytes of data and tens of billions of files [4]. This case study is particularly valuable because it highlights both Ozone's foundational strengths and the areas where it requires custom engineering to meet extreme performance demands.

Didi's internal workload is read-heavy and write-light, with a high sensitivity to first-frame read latency. The team encountered performance bottlenecks with the Ozone Manager (OM), which, in a standard high-availability (HA) setup, handles all read and write requests through a single leader node. To overcome this, Didi developed and implemented a series of custom optimizations that were later contributed back to the community.

One of the most significant architectural improvements was the implementation of OM follower reads for their S3 Gateway (S3G). This allowed them to distribute read requests across follower nodes, which dramatically reduced the pressure on the OM leader. The results were impressive: the P90 latency for S3G downloads dropped from about 90 ms to 17 ms (roughly an 81% reduction), with best-case latencies below 3 ms.
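The routing idea behind follower reads can be illustrated with a minimal sketch. This is not Didi's or Ozone's implementation: the class name, node names, and operation names are hypothetical, and the sketch ignores the consistency checks a real follower read needs to avoid returning stale metadata.

```python
from itertools import cycle


class OmRequestRouter:
    """Toy router: writes go to the leader, reads rotate across followers."""

    WRITE_OPS = {"create", "put", "delete"}  # hypothetical operation names

    def __init__(self, leader, followers):
        self.leader = leader
        # Fall back to the leader when there are no followers to read from.
        self._read_targets = cycle(followers or [leader])

    def route(self, operation):
        """Return the name of the node that should serve the operation."""
        if operation in self.WRITE_OPS:
            return self.leader
        return next(self._read_targets)


router = OmRequestRouter("om1", ["om2", "om3"])
targets = [router.route(op) for op in ("get", "put", "get", "list")]
print(targets)  # ['om2', 'om1', 'om3', 'om2']
```

With reads spread across two followers, the leader in this toy model serves only the write, which is the load-shedding effect the case study describes.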

This optimization demonstrates that Ozone's core architecture scales well and can be tuned to meet the demands of low-latency, read-intensive, large-scale environments.

Didi also addressed performance issues on their HDD-based servers by designing a heterogeneous storage-based caching system. This solution utilizes high-speed NVMe drives as a cache for hot data, with a specific strategy of caching the first 1 MB chunk of each block. This approach significantly improved first-frame read speeds and demonstrates how a hybrid hardware strategy can mitigate disk latency and IOPS limitations of traditional HDDs under heavy load.
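A rough sketch of the first-chunk caching idea follows. It is a toy model under stated assumptions: the dictionaries stand in for HDD and NVMe devices, the cache is populated on first read, and there is no eviction. The source does not describe Didi's actual population or eviction policy, and the class and tier names are hypothetical.

```python
FIRST_CHUNK_BYTES = 1024 * 1024  # 1 MB, matching the strategy described above


class FirstChunkCache:
    """Toy model: keep the first 1 MB of each block on a fast tier."""

    def __init__(self, slow_store):
        self.slow_store = slow_store  # dict of block_id -> bytes, stands in for HDD
        self.fast_cache = {}          # dict of block_id -> first chunk, stands in for NVMe

    def read(self, block_id, offset, length):
        """Return (data, tier) where tier names the device that served the read."""
        end = offset + length
        if end > FIRST_CHUNK_BYTES:
            # The read extends past the cached region, so go to the slow tier.
            return self.slow_store[block_id][offset:end], "hdd"
        if block_id in self.fast_cache:
            return self.fast_cache[block_id][offset:end], "nvme"
        # Cold read: fetch from the slow tier and populate the cache (assumed policy).
        first_chunk = self.slow_store[block_id][:FIRST_CHUNK_BYTES]
        self.fast_cache[block_id] = first_chunk
        return first_chunk[offset:end], "hdd"


store = {"blk-1": bytes(3 * FIRST_CHUNK_BYTES)}
cache = FirstChunkCache(store)
print(cache.read("blk-1", 0, 4096)[1])                      # hdd (cold read, fills cache)
print(cache.read("blk-1", 0, 4096)[1])                      # nvme
print(cache.read("blk-1", 2 * FIRST_CHUNK_BYTES, 4096)[1])  # hdd
```

Because only the head of each block is cached, the fast tier stays small relative to the data set while still covering the reads that determine first-frame latency.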

Finally, Didi's observations on metadata management are crucial. The team found that an OM with a 128 GB heap size and more than 1 TB of SSD storage could manage a RocksDB database of 400 GB to 500 GB, which is sufficient for approximately 5 billion files. This provides a practical, real-world metric for sizing and scaling Ozone clusters based on file count. The Didi case study, therefore, serves as a powerful blueprint for how to deploy, optimize, and operate Apache Ozone at an enterprise scale, while also providing a clear roadmap for future improvements to the core project.
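The sizing figures above imply a per-file metadata footprint that can be worked out directly. The sketch below uses decimal gigabytes and assumes the RocksDB size grows linearly with file count, which is a simplification: real footprints vary with key length and other factors, so treat the output as a rough planning estimate only. The function names are illustrative.

```python
def metadata_bytes_per_file(db_size_gb, file_count):
    """Average RocksDB footprint per file, using decimal gigabytes."""
    return db_size_gb * 1e9 / file_count


def estimate_db_size_gb(file_count, bytes_per_file):
    """Linear extrapolation of RocksDB size for a target file count (assumption)."""
    return file_count * bytes_per_file / 1e9


files = 5_000_000_000
low = metadata_bytes_per_file(400, files)
high = metadata_bytes_per_file(500, files)
print(low, high)  # 80.0 100.0

# Upper-bound estimate for a hypothetical 10 billion files at 100 bytes each.
print(estimate_db_size_gb(10_000_000_000, high))  # 1000.0
```

In other words, Didi's observation corresponds to roughly 80 to 100 bytes of OM metadata per file.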

3.2. Preferred Networks: A Success Story of Migration and Community Engagement

Preferred Networks is a Japanese technology company that specializes in applying artificial intelligence and deep learning to solve real-world problems in areas like robotics, manufacturing, and life sciences.

The experience of Preferred Networks (PFN) in their adoption of Apache Ozone provides a textbook example of a successful HDFS migration and the value of active community participation. PFN, an AI-focused company, was seeking a storage system that could scale "pseudo-infinitely" to address several limitations they faced with their existing HDFS clusters [5].

Their challenges were common to many exabyte-scale analytics users: the small files problem overburdening the NameNode, poor performance with high-density disk servers, and a lack of native integration with modern, Kubernetes-based workflows.

PFN found that Ozone, with its object-based architecture and separation of metadata and data, was purpose-built to solve these very issues. They began a multi-phase migration, starting with a small-scale benchmark and then opening the service to internal users. A key finding from their experience was the need for an optimal pipeline distribution and the importance of using high-speed NVMe drives for metadata storage to ensure cluster stability and performance.

What sets PFN's story apart is their deep and active engagement with the Apache community. They did not simply consume the open-source software; they actively identified shortcomings, reported bugs (including a security vulnerability, CVE-2020-17517), and submitted patches for issues such as Multipart Upload optimization and a ListObjects bug fix. This demonstrates a vital feedback loop between enterprise users and the open-source project, which is essential for the long-term health and maturity of a platform. The PFN case study also highlighted a specific limitation: the 5TB file size limit in the S3 API (this is an S3 API limitation and not an Ozone limitation), which posed a challenge for migrating some of their largest HDFS files. A workaround is to use Ozone’s filesystem interface, which does not have the same 5TB limitation.
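The S3 object size ceiling follows from the multipart upload limits that AWS documents: at most 10,000 parts, each between 5 MiB and 5 GiB, and a maximum object size of 5 TiB. The sketch below plans a part size within those limits. Whether a given Ozone S3 Gateway version enforces exactly the same limits is an assumption here, and the function name and default part size are illustrative.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_PARTS = 10_000          # AWS S3 multipart limit
MIN_PART_SIZE = 5 * MIB     # AWS S3 minimum part size (except the last part)
MAX_PART_SIZE = 5 * GIB     # AWS S3 maximum part size
MAX_OBJECT_SIZE = 5 * TIB   # AWS S3 maximum object size


def plan_multipart_upload(object_size, preferred_part_size=64 * MIB):
    """Return (part_size, part_count) for an upload, or raise ValueError."""
    if object_size <= 0:
        raise ValueError("object size must be positive")
    if object_size > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TiB S3 object size limit")
    # Smallest part size that keeps the part count within MAX_PARTS.
    needed = (object_size + MAX_PARTS - 1) // MAX_PARTS
    part_size = max(min(preferred_part_size, MAX_PART_SIZE), MIN_PART_SIZE, needed)
    part_count = (object_size + part_size - 1) // part_size
    return part_size, part_count


print(plan_multipart_upload(GIB))       # (67108864, 16)
print(plan_multipart_upload(5 * TIB))   # (549755814, 10000)
```

A file larger than 5 TiB cannot be expressed as a single S3 object under these limits, which is why the filesystem interface is the suggested route for such files.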

3.3. Shopee: Success Driving Apache Ozone with Custom Development

Shopee is the leading e-commerce online shopping platform in Southeast Asia and Taiwan.

Shopee has successfully integrated Apache Ozone as a core component of its data storage infrastructure, demonstrating a journey of significant growth and technical innovation [3].

Since adopting Ozone 1.0 in February 2021, Shopee has scaled its deployment to manage billions of objects and over 10 petabytes of data. Their infrastructure spans multiple countries and IDCs, with the main cluster utilizing a robust setup of multiple Ozone Manager (OM) and Storage Container Manager (SCM) services to ensure high availability and performance. This large-scale, geographically distributed deployment underscores their confidence in Ozone's stability and capability to handle massive data loads.

A key factor in Shopee's success has been its proactive approach to software maintenance and customization. They have enhanced core functionalities such as authentication, authorization, and traffic control on the S3 Gateway, ensuring a secure and efficient storage environment. By maintaining their own fork while contributing back to the community, Shopee benefits from both the innovation of the open-source project and the flexibility of a customized internal solution.

One of the most impactful proprietary features developed by Shopee is its Object Lifecycle Management service. This was created to address the business need for automatically cleaning up a vast number of unused objects without manual intervention from administrators. The service empowers users to define their own rules for object expiration based on prefixes or tags, mirroring the functionality of AWS S3 lifecycle policies. Rolled out in June 2023, this feature has been met with positive user feedback and has proven essential for managing storage costs and efficiency. Shopee is now in the process of contributing this valuable object expiration feature back to the Apache Ozone community, showcasing their commitment to collaborative development.
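Expiration rules keyed on prefixes or tags can be sketched as follows. The rule shape is loosely modeled on the filter-plus-expiration structure of S3 lifecycle policies; it is not Shopee's actual schema, and the class name, field names, and age comparison are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class ExpirationRule:
    """Hypothetical rule: expire matching objects after a number of days."""
    days: int
    prefix: str = ""
    tags: dict = field(default_factory=dict)

    def matches(self, key, object_tags):
        """True if the key has the prefix and carries every required tag."""
        if not key.startswith(self.prefix):
            return False
        return all(object_tags.get(k) == v for k, v in self.tags.items())


def is_expired(key, object_tags, last_modified, rules, now):
    """True if any matching rule's age threshold has been reached."""
    age = now - last_modified
    return any(
        rule.matches(key, object_tags) and age >= timedelta(days=rule.days)
        for rule in rules
    )


rules = [
    ExpirationRule(days=30, prefix="tmp/"),
    ExpirationRule(days=7, tags={"ttl": "short"}),
]
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired("tmp/report.csv", {}, created, rules, now))   # True
print(is_expired("data/report.csv", {}, created, rules, now))  # False
```

A background service would apply a check like this while scanning a bucket, deleting whatever qualifies so that administrators do not have to clean up by hand.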

Furthermore, Shopee has leveraged Ozone's architecture to implement a sophisticated Storage Class system to balance performance and cost-effectiveness. By mapping different storage media like SSDs and HDDs to "HOT," "WARM," and "COLD" policies, they can cater to varying data access patterns, from high-performance needs for frequently accessed data to cost-saving measures for archival data. Shopee's journey with Apache Ozone serves as a powerful testament to how deep engagement, strategic customization, and community collaboration can lead to a highly successful and scalable data storage solution.

3.4. Key Takeaways and Success Patterns

The case studies of Didi, Preferred Networks, and Shopee underscore several common success patterns for Apache Ozone adoption. All three companies represent demanding, petabyte-scale environments where Ozone was selected as a strategic replacement for HDFS to address specific, well-documented limitations. Their success required deep technical expertise, strategic hardware investments (e.g., heterogeneous storage), and a willingness to actively contribute to the open-source project. The optimizations and bug fixes from these deployments provide invaluable real-world validation and serve as a crucial feedback mechanism for shaping the product's future roadmap.

4. Feature and Capability Gap Analysis vs. Competitors

4.1. The Cloud Standard: Apache Ozone vs. AWS S3

AWS S3 has long set the de facto standard for object storage, and Apache Ozone's S3 API compatibility is a strategic move to integrate with the vast ecosystem of S3-based applications and tools. While Ozone offers a strong on-premises solution, a detailed comparison reveals both alignment and key differentiators. Ozone's architecture, with its separation of metadata and data, mirrors the core design principle of S3, enabling it to handle billions of objects and exabyte-scale data [1]. Ozone also supports crucial data protection methods like replication and erasure coding, ensuring data durability and availability.

The primary gap lies in the "as-a-service" features of AWS S3. S3 offers advanced, integrated capabilities such as Intelligent-Tiering, which automatically moves data to different storage classes based on access patterns [6]. It also provides dedicated, low-cost archival tiers like Glacier and Glacier Deep Archive for long-term data retention. While Ozone's roadmap includes a Data Tiering Phase 1 feature, tiering is still a developing area.

Ozone's key differentiator against S3 is its deep integration with the Big Data analytics ecosystem and its unique, patented Snapshot feature. Unlike cloud providers and other on-premises storage competitors, Ozone provides atomic point-in-time snapshots of an entire bucket, which is a significant advantage for enterprises that require point-in-time data recovery and efficient backups for their workloads.

The roadmap's focus on S3 Lifecycle and S3 Vault also indicates a clear intent to close the feature gap and provide a more comprehensive, S3-like experience on-premises.

4.2. The On-Premises Appliance Model: Ozone vs. Dell ObjectScale

Dell ObjectScale represents a different paradigm for on-premises object storage. Unlike Ozone's software-defined, commodity hardware approach, ObjectScale is a turnkey, containerized appliance built on Dell PowerEdge servers [7].

The core technical differentiator of ObjectScale is its chunk store architecture [9]. It packs multiple small files into 128 MB chunks before applying 12+4 erasure coding [8]. This provides significant advantages for small-file performance and reduces rebuild times in the event of a drive failure, a direct answer to the small-file performance challenge that plagues many object storage systems. The Ozone community is also working to address this challenge.

Ozone's primary competitive advantage is its flexibility and its permissive open-source license. It can be deployed on a wide range of commodity hardware, providing a more cost-effective and vendor-agnostic solution. While ObjectScale is tightly integrated with Dell's hardware stack, Ozone's open-source model allows for greater customization and community-driven innovation.

4.3. The Lightweight Contender: Ozone vs. MinIO

MinIO has emerged as a popular, lightweight, and high-performance object store, positioning itself as the go-to solution for cloud-native and AI/ML workloads [10]. MinIO's key strengths lie in its simplicity of deployment and its strong focus on being a single-purpose, fully S3 API-compatible object store [11].

Community discussion reveals concerns about MinIO's business model and licensing [12]. Reports of MinIO weakening the community edition by stripping features from the web UI and changing its license have created a trust issue for long-term enterprise use. In this context, Ozone's permissive Apache 2.0 license and its large enterprise backing from Cloudera provide a substantial competitive advantage. Ozone's primary value proposition is its stability and a clear, enterprise-focused roadmap, which may be more appealing for organizations seeking a durable, long-term solution free from licensing uncertainty.

4.4. The Unified Storage Solution: Ozone vs. Ceph

Ceph is a mature and comprehensive distributed storage system that provides object, block, and file storage from a single cluster [13]. This unified architecture is Ceph's greatest strength and its primary differentiator from Ozone, which is a purpose-built object store.

Despite its versatility, Ceph's complexity can be a significant drawback. Community opinions are mixed, with some users praising its no-SPOF (Single Point of Failure) design and scalability, while others cite a "steep learning curve" and abysmal performance in certain setups, particularly for single-threaded I/O [11]. The higher management complexity translates to increased labor costs and higher operational overhead [11].

Ozone's strength, by contrast, is its specialization. By focusing exclusively on big data object storage and integrating seamlessly with the Hadoop ecosystem, it offers a more tailored and potentially more performant solution for its target workloads.

4.5. Competitor Feature Comparison

Table 2: Feature Matrix: Apache Ozone vs. Competitors

| Feature | Apache Ozone | AWS S3 | Dell ObjectScale | MinIO | Ceph |
|---|---|---|---|---|---|
| Architecture | Distributed object store with separate metadata and data services | Cloud-scale object storage service | Turnkey, containerized appliance with microservices | Lightweight, containerized object store | Unified, distributed storage system (block, object, file) |
| Core Functionality | Object storage, HDFS/OFS compatibility | Object storage as a service | Object storage | Object storage | Object, Block, and File storage |
| S3 API Compatibility | Yes, with S3 Gateway (S3G) | Native S3 standard | Yes, S3 protocol support | Native S3 compatibility | Yes, via RADOS Gateway (RGW) |
| Data Protection | Replication, bit-rot detection, self-healing, and Erasure Coding (6+3 and 10+4) | Multi-Availability Zone replication, Erasure Coding | 12+4 Erasure Coding, Triple Mirroring | Erasure Coding, Bit-rot detection | Replication, Erasure Coding |
| Target Workloads | Big Data Analytics, HDFS Upgrade, Data Lakehouses, AI/ML | Cloud-native apps, Backup/Archive, Data Lakes, AI/ML | Enterprise AI/ML, High-density workloads | Cloud-native apps, AI/ML, Edge computing | Large-scale enterprise, General purpose storage |
| License | Apache 2.0 | Proprietary | Proprietary | GNU AGPL v3 (open source), AIStor (commercial) | LGPL (librados), Apache 2.0 (other parts) |
| Key Differentiators | Deep Big Data Analytics ecosystem integration, Patented Snapshot feature, Open source with strong enterprise backing | Vast ecosystem, advanced integrated services, global scale, deep archiving tiers | Turnkey appliance, Chunk Store for small files, validated hardware | Simplicity, high performance, cloud-native focus | Unified storage, CRUSH algorithm, highly flexible |
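The erasure coding schemes in the Data Protection row differ in how much raw capacity they consume per byte of user data. The short calculation below compares them with triple replication. It assumes an ideal code in which a d+p scheme survives the loss of any p blocks in a stripe, as Reed-Solomon codes do, and it ignores metadata and padding overhead. The function name is illustrative.

```python
def ec_storage_overhead(data_blocks, parity_blocks):
    """Raw bytes stored per logical byte for a data+parity erasure coding scheme."""
    return (data_blocks + parity_blocks) / data_blocks


# (overhead factor, block losses tolerated per stripe or replica set)
schemes = {
    "EC 6+3": (ec_storage_overhead(6, 3), 3),
    "EC 10+4": (ec_storage_overhead(10, 4), 4),
    "EC 12+4": (ec_storage_overhead(12, 4), 4),
    "3x replication": (3.0, 2),
}

for name, (overhead, tolerated) in schemes.items():
    print(f"{name}: {overhead:.2f}x raw capacity, tolerates {tolerated} losses")
# EC 6+3: 1.50x raw capacity, tolerates 3 losses
# EC 10+4: 1.40x raw capacity, tolerates 4 losses
# EC 12+4: 1.33x raw capacity, tolerates 4 losses
# 3x replication: 3.00x raw capacity, tolerates 2 losses
```

Wider stripes lower the capacity overhead, at the cost of involving more nodes in each read and rebuild.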

 

5. The Future of Cloudera Object Store

5.1. Near-Term Roadmap and Planned Features

The future of Apache Ozone is shaped by a clear, customer-centric roadmap that directly advances the product's core capabilities and strategic opportunities. In the next few quarters, we will deliver several customer requests, including Storage Container Reconciliation, Snapshot Performance improvements, Disk Balancer, and Operational Database support.

These features are carefully chosen to enhance the platform's resilience, close a key feature gap against cloud competitors, and signal a continued focus on supporting a wider range of enterprise big data workloads and improving "day 2" operations [15].

5.2. Addressing Feature Gaps and Customer Commitments

The planned roadmap represents a strategic effort to transform Ozone from an evolution beyond HDFS into a foundational layer for the modern data lakehouse. By focusing on Ozone and Iceberg Metalake as well as HBase support, among others, the community and Cloudera’s development team are building a platform that goes beyond simple object storage to support transactional and analytical workloads [1]. This is a direct response to the market demand for a unified, on-premises solution for AI, analytics, and data lakehouse use cases.

The Snapshot Phase project is a dedicated effort to address the scalability issues of snapshots reported by a major customer, while the new Disk Balancer service is designed to solve the non-uniform data distribution that can occur in large-scale clusters. Similarly, the planned S3 Lifecycle management feature directly addresses a major feature gap against AWS S3. The planned features collectively demonstrate that the Ozone community and its core engineering team are highly responsive to customer needs and are committed to evolving the platform to meet the demands of modern data ecosystems.

6. Conclusion

Apache Ozone is a mature, capable, and strategic choice for organizations seeking a scalable, on-premises object storage solution. It has successfully delivered many customer-driven innovations, particularly in its ability to manage billions of small files and its architectural separation of metadata and data. The product has achieved significant adoption among Cloudera's large enterprise customers. The active participation of prominent companies like Didi and Preferred Networks in its open-source development provides invaluable, real-world battle-testing and demonstrates its viability at exabyte scale.

It is a foundational component for the modern data lakehouse, enabling organizations to run a single storage platform for both object and transactional workloads.

References cited

  1. What is Apache Ozone? | Cloudera. https://www.cloudera.com/resources/faqs/apache-ozone.html
  2. Re: [DISCUSS] Apache Ozone 2.1 Release Discussion - Apache Mail Archives. https://lists.apache.org/thread/d7d5zy7v9zmfbt9qw1rfbq79r126lc3o
  3. Apache Ozone Best Practices at Shopee - The ASF Community Over Code conference. https://ozone.apache.org/assets/ApacheOzoneBestPracticesAtShopee.pdf
  4. Apache Ozone 在滴滴的实践 (Apache Ozone in Practice at Didi). https://ozone.apache.org/assets/ApacheOzoneBestPracticesAtDidi.pdf
  5. A Year with Apache Ozone - Preferred Networks Research. https://tech.preferred.jp/en/blog/a-year-with-apache-ozone/
  6. Amazon S3 Intelligent-Tiering storage class. https://aws.amazon.com/s3/storage-classes/intelligent-tiering/
  7. Dell ObjectScale. https://www.delltechnologies.com/asset/en-us/products/storage/technical-support/objectscale-spec-she...
  8. Dell ObjectScale Data Path Overview - Itzikr's Blog - WordPress.com. https://itzikr.wordpress.com/2024/02/02/dell-objectscale-data-path-overview/
  9. ObjectScale 4.1 - Unstructured Data Quick Tips. http://www.unstructureddatatips.com/objectscale-4-1/
  10. MinIO vs Ceph Benchmark: A Comprehensive Comparison - BytePlus. https://www.byteplus.com/en/topic/409655
  11. Minio vs. Ceph: A Deep Dive into Distributed Storage Solutions - AutoMQ. https://www.automq.com/blog/minio-vs-ceph-distributed-storage-solutions-comparison
  12. MinIO Faces Fallout for Stripping Functions from Open Source Version. https://www.futuriom.com/articles/news/minio-faces-fallout-for-stripping-features-from-web-gui/2025/...
  13. Object storage - IBM. https://www.ibm.com/docs/en/storage-ceph/8.0.0?topic=started-object-storage
  14. HDFS vs Ozone: comparison of features | ADH Arenadata Docs. https://docs.arenadata.io/en/ADH/current/concept/ozone/hdfs-vs-ozone.html
  15. The Apache Software Foundation Announces Apache Ozone™ 2.0.0 - The ASF Blog. https://news.apache.org/foundation/entry/the-apache-software-foundation-announces-apache-ozone-2-0-0