Building an Open Lakehouse with Apache Iceberg and Apache Ozone

Open Lakehouse architecture brings the modularity and flexibility needed to run multiple analytical workloads on top of a single source of data, powered by open table formats like Apache Iceberg. Rather than relying on a tightly coupled, monolithic system, it allows teams to compose their data platform from independent building blocks - storage, table formats, catalogs, and compute engines - each selected based on specific workload and operational requirements.

Storage is one of the key components of a lakehouse architecture. It is the foundation on which table formats implement transactional semantics, ACID guarantees, metadata management, and enable multi-engine interoperability. Decisions made at the storage layer directly influence how reliably tables can evolve, scale, and be shared across systems.

Apache Iceberg: Database semantics on Object storage

Apache Iceberg implements ACID properties on top of object storage and provides the schema and metadata needed to treat a collection of data files (e.g. Apache Parquet) as a “table.” At its core, Iceberg separates physical data storage (immutable Parquet/ORC/Avro files) from logical table state (schemas, partitions, snapshots, file-level statistics). This separation is fundamental to how Iceberg behaves and enables multiple compute engines to work on the same table at the same time.

Instead of relying on directory layouts or file naming conventions, Iceberg maintains explicit metadata files that describe:

  • Which data files belong to the table
  • How they are partitioned
  • Which snapshot represents the current table state
  • How the table evolved over time

As tables evolve, both metadata and data files grow in number, making the behavior and scalability of the underlying object storage a first-class concern for Iceberg deployments.
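
Because this table state is explicit, it can be inspected like any other table through Iceberg's built-in metadata tables. The following is a minimal PySpark sketch; the catalog name demo is an assumption, and nyc.taxis matches the table paths shown later in this post.

# Minimal sketch (assumed catalog "demo"; table nyc.taxis as used later in this post)

# Snapshot history recorded in Iceberg metadata
spark.sql("SELECT snapshot_id, operation, committed_at FROM demo.nyc.taxis.snapshots").show()

# Data files tracked by the current snapshot, with per-file statistics
spark.sql("SELECT file_path, record_count, file_size_in_bytes FROM demo.nyc.taxis.files").show()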

Apache Ozone: Object Store for Lakehouse

Apache Ozone is designed with specific requirements in mind. It is a highly scalable, distributed object store built to support growing data volumes while also handling large numbers of smaller objects without exhausting metadata capacity. This makes it well-suited for lakehouse systems where table formats continuously generate new data and metadata artifacts as tables evolve.


What does Apache Ozone bring to the “Table”?

Before diving into the hands-on workflow, it’s worth briefly summarizing what Apache Ozone brings as the storage system.

  • Open-source, cloud-native object and file storage
    Apache Ozone is an open-source, cloud-native storage system designed to scale to billions of objects and hundreds of petabytes. Its architecture is built for distributed deployments, making it suitable for large analytical platforms where storage growth is continuous and long-lived.
  • Dual-access semantics: object and filesystem APIs
    Ozone supports native S3-compatible access for modern data platforms while also exposing traditional filesystem semantics (OFS). This dual-access model allows the same underlying data to be reached through either interface, enabling gradual migration and mixed workloads within a single deployment (see the short sketch after this list).
  • Strong consistency without a centralized metadata bottleneck
    Ozone provides strong consistency guarantees while avoiding the traditional NameNode bottleneck by fully decoupling metadata management and storage. The metadata plane (Ozone Manager) and storage plane (Storage Container Manager) operate independently and are coordinated using Apache Ratis, a Raft-based consensus protocol. This design enables scalable metadata operations without sacrificing correctness.
  • Proven at petabyte scale in production
    Ozone has been validated in large-scale production environments, supporting petabyte-scale datasets and high object counts. This makes it a practical storage foundation for highly transactional, metadata-intensive systems.
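
To make the dual-access point concrete, here is a minimal PySpark sketch. It assumes the S3 bucket warehouse is exposed under Ozone's default S3 volume (s3v), that the S3A and OFS Hadoop connectors are available to Spark, and that the hostnames are placeholders for your deployment. It reads the Parquet files directly, bypassing Iceberg, purely to show two access paths to the same objects.

# Object-style access through the Ozone S3 Gateway (S3A connector)
df_s3 = spark.read.parquet("s3a://warehouse/nyc/taxis/data/")

# Filesystem-style access to the same files through OFS
# (S3 buckets live under the "s3v" volume by default; "om.ozone" is a placeholder host)
df_ofs = spark.read.parquet("ofs://om.ozone/s3v/warehouse/nyc/taxis/data/")

# Same underlying objects, two access paths
assert df_s3.count() == df_ofs.count()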

These characteristics make Apache Ozone a strong fit as a general-purpose object store for lakehouse architectures. When paired with an open table format like Apache Iceberg, these same properties directly address the storage challenges that emerge as tables grow, evolve, and accumulate both data and metadata over time.

Ozone addresses several core challenges that commonly arise in Iceberg-based lakehouse deployments.

Scalable storage for data growth: Analytical datasets grow both in data size and in object count as tables are ingested, partitioned, rewritten, and optimized over time. Ozone is designed to scale both dimensions independently, distributing data across storage nodes while maintaining a consistent object namespace.

Efficient handling of small objects: Beyond raw data growth, Iceberg tables generate a large number of small metadata files as part of normal table writes and evolution. Ozone is built to handle large volumes of small objects without saturating metadata capacity, which is critical in lakehouse systems where metadata growth is intrinsic.

Built-in durability, security, and availability: Ozone provides enterprise-grade storage features required in production lakehouse environments, including data encryption for security, erasure coding for storage efficiency, and replication for fault tolerance. These capabilities allow Ozone to serve as a durable system of record for both Iceberg data and metadata over long table lifecycles.

S3-compatible access: Ozone exposes a native Amazon S3–compatible API, allowing Iceberg and multiple compute engines to interact with tables using standard object storage interfaces. As a result, Iceberg tables stored in Ozone follow the same layout and semantics as they would on cloud object storage.
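
Because the interface is standard S3, any S3 client can talk to Ozone's gateway directly. A minimal boto3 sketch, using the gateway endpoint and bucket that appear later in this post (credentials here are placeholders):

import boto3

# Point a standard S3 client at the Ozone S3 Gateway (placeholder credentials)
s3 = boto3.client(
    "s3",
    endpoint_url="http://s3.ozone:9878",
    aws_access_key_id="<access-key>",
    aws_secret_access_key="<secret-key>",
)

# List Iceberg metadata objects exactly as you would on any S3-compatible store
resp = s3.list_objects_v2(Bucket="warehouse", Prefix="nyc/taxis/metadata/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])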

Exploring Apache Iceberg with Apache Ozone

With the architectural context in place, let’s go through a hands-on exploration of Apache Iceberg tables with Apache Ozone as the storage system. Our goal is not to understand the Iceberg APIs here, but rather to see what the Ozone + Iceberg combination brings.

All examples in this section are driven by a preconfigured Jupyter notebook available in the GitHub repository. You can run the notebook end-to-end to perform a sequence of table operations (create, write, evolve, update, delete). Rather than walking through each notebook cell, the sections below highlight what to observe in Apache Ozone as those operations execute.

The notebook is preconfigured to use:

  • Apache Iceberg as the table format
  • An Iceberg REST catalog backed by object storage
  • Apache Ozone (via its S3-compatible interface) as the storage layer for both data and metadata
  • Apache Spark as the compute engine
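
For orientation, the configuration behind such a setup typically looks like the sketch below. The catalog name (demo), REST catalog URI, and warehouse bucket are assumptions for illustration rather than the notebook's exact values, and the Iceberg Spark runtime and AWS bundle jars are assumed to be on the classpath.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-on-ozone")
    # Iceberg SQL extensions and a REST-catalog-backed Iceberg catalog named "demo"
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "rest")
    .config("spark.sql.catalog.demo.uri", "http://iceberg-rest:8181")
    .config("spark.sql.catalog.demo.warehouse", "s3://warehouse/")
    # S3FileIO pointed at Ozone's S3 Gateway (credentials via the usual AWS mechanisms)
    .config("spark.sql.catalog.demo.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.demo.s3.endpoint", "http://s3.ozone:9878")
    .config("spark.sql.catalog.demo.s3.path-style-access", "true")
    .getOrCreate()
)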

Iceberg table layout materialized in Apache Ozone

Once you write to an Iceberg table, the first thing to observe is the creation of Iceberg's metadata/ and data/ directories in Ozone's namespace (as seen below).

[Screenshot: Iceberg metadata/ and data/ directories created in the Ozone bucket]

You can inspect this layout directly using any S3-compatible client against Ozone’s S3 Gateway:

aws s3api --endpoint-url http://s3.ozone:9878 \
  list-objects \
  --bucket warehouse \
  --prefix nyc/taxis/metadata/

[Screenshot: object listing under nyc/taxis/metadata/]

aws s3api --endpoint-url http://s3.ozone:9878 \
  list-objects \
  --bucket warehouse \
  --prefix nyc/taxis/data/

[Screenshot: object listing under nyc/taxis/data/]

As you progress through the notebook - writing data, evolving schemas, and performing updates or deletes - you will notice that the number of objects under metadata/ continues to grow. From a storage perspective, this means that metadata growth is intrinsic: even modest tables can accumulate a large number of small metadata objects over time, so it is worth keeping an eye on.
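
The statements below are a representative sketch of the kinds of operations the notebook performs; the column names follow the widely used nyc.taxis quickstart schema and are assumptions about the notebook's table rather than a transcript of it. Each statement writes new metadata objects under metadata/ - a new table metadata file, and, for the data-modifying operations, a new snapshot with its manifests.

# Assumed catalog "demo"; column names are illustrative
spark.sql("ALTER TABLE demo.nyc.taxis ADD COLUMN fare_per_distance double")             # schema evolution: new metadata file
spark.sql("UPDATE demo.nyc.taxis SET fare_per_distance = fare_amount / trip_distance")  # row-level update: new snapshot + manifests
spark.sql("DELETE FROM demo.nyc.taxis WHERE trip_distance <= 0")                        # delete: new snapshot + rewritten data files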

Observing Iceberg activity through Apache Ozone management UIs

One advantage of using Iceberg with Apache Ozone is the ability to observe storage behavior directly through Ozone’s management interfaces. Ozone internally separates metadata management, physical storage placement, and observability into distinct services. Understanding these components helps explain why Ozone is a strong storage foundation for open table formats like Iceberg.

Ozone Recon: cluster and object-level visibility

Ozone Recon is an observability and insight service that provides a consolidated view of cluster state, usage, and health. Recon is not involved in the read or write path, but it plays an important role in operating and debugging Ozone-backed lakehouse deployments.

Recon provides:

  • Cluster-level metrics and capacity insights
  • Object (key) counts and namespace usage
  • Container and pipeline health information
  • Diagnostic views useful for troubleshooting storage-related issues

When running Iceberg workloads, Recon allows operators to correlate table activity with storage behavior. As Iceberg generates new data and metadata objects over time, Recon makes it possible to observe how object counts, container usage, and cluster health evolve.

[Screenshot: Ozone Recon Overview]

In the Overview view, we can observe how the storage layer behaves as Iceberg operations are executed. In this run, the cluster remains healthy with 1 active datanode and 2 healthy containers. At this point, Recon reports 5 keys in the namespace, reflecting the data and metadata objects created by the Iceberg table. Despite ongoing object creation, container health and pipeline status remain stable, indicating that the workload does not introduce stress or instability at the storage layer.

From a capacity perspective, the cluster shows ~40.5 GB used out of 452.1 GB available (roughly 9% utilization), with Iceberg-related data accounting for a small but growing portion of overall usage. This highlights an important aspect of Iceberg workloads: storage growth happens incrementally and continuously as tables evolve.

[Screenshot: Ozone Recon Insights - file size distribution]

Recon’s Insights view adds another layer of visibility into this behavior. The file size distribution reveals a mix of object sizes produced by the Iceberg workload, with multiple objects in the 8 KiB–16 KiB range alongside smaller objects in the 2 KiB–4 KiB range. This pattern reflects Iceberg’s operational model, where relatively small metadata and manifest files are created alongside larger Parquet data files.
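
The same pattern can be cross-checked from the table side. Below is a hedged sketch, using the catalog and table names assumed earlier, that summarizes the data-file sizes Iceberg itself tracks; manifest and metadata files are not listed here, but are visible in the S3 listing shown earlier.

# Size distribution of the data files tracked by the current snapshot
spark.sql("""
    SELECT file_format,
           count(*)                AS files,
           min(file_size_in_bytes) AS min_bytes,
           max(file_size_in_bytes) AS max_bytes
    FROM demo.nyc.taxis.files
    GROUP BY file_format
""").show()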

Ozone Manager (OM): metadata plane health

[Screenshot: Ozone Manager UI]

The Ozone Manager is the master service responsible for managing Ozone’s object namespace and metadata. This includes volumes, buckets, keys, and the metadata required to map objects to their underlying storage blocks.

OM’s responsibilities are strictly at the metadata and namespace level:

  • Tracking object metadata (keys) created via the S3-compatible interface
  • Maintaining a consistent view of volumes and buckets
  • Coordinating metadata updates using Apache Ratis (Raft) to provide high availability and consistency

Iceberg implicitly relies on the storage system to persist object metadata durably and consistently as new files are written during table evolution; OM ensures that these metadata updates are replicated and consistently visible across the cluster.

Conclusion

This blog highlights what Apache Ozone brings to an Iceberg-based lakehouse without requiring any special integration. As Iceberg operations execute, Ozone consistently acts as a durable system of record for both data and metadata objects. It absorbs continuous object creation driven by table evolution, scales object count independently of data size, and maintains a stable namespace as tables grow and change over time. 

Equally important is visibility. Through Ozone Manager and Recon, object growth, namespace health, container placement, and cluster state can be inspected and correlated directly with table activity. This makes it easier to reason about, operate, and debug metadata-intensive lakehouse workloads.

In practice, what Apache Ozone brings is a storage foundation that aligns with the demands of modern table formats: scalable object storage, consistent metadata management, and first-class observability. When those properties are present, formats like Apache Iceberg can focus entirely on table-level semantics - while Apache Ozone reliably handles everything underneath.
