I have been meaning to explore Apache Polaris for some time. Over the years, I have worked with a range of Apache Iceberg catalogs - Nessie, Hive Metastore, AWS Glue, Unity Catalog - and each reflects a different set of design trade-offs around metadata management, security, and interoperability. What makes Polaris particularly interesting to me is not just its technical direction, but the fact that it is evolving as a community-led project under the Apache Software Foundation.
If you have followed my work, you will know I tend to approach systems by first understanding how they are built rather than how they are used. That is the intent here. We will examine Polaris from the inside out - its system boundaries, core building blocks, and the design principles that shape its behavior. The goal is to understand Polaris from a system design perspective, building the depth needed to use such systems well. Let's go!
System Boundaries
Apache Polaris is a networked metadata service that implements the Iceberg REST Catalog specification while extending it with additional control-plane capabilities. It sits between compute engines and open table formats (such as Apache Iceberg) stored on storage systems, governing how metadata is interpreted, accessed, and evolved. To understand its behavior, let’s decompose the system along its external and internal boundaries.
Here is a diagrammatic breakdown to follow along with.
1. API Surface
At the outermost boundary, Polaris exposes a set of HTTP APIs that define how external systems interact with it. These APIs fall into two families.
The catalog API surface implements the Iceberg REST specification and is used by engines such as Spark, Flink, and Trino to perform table operations like creating tables, committing snapshots, or resolving metadata locations. Alongside this, Polaris exposes a management API surface that operates on principals, roles, catalogs, and other administrative constructs.
While these APIs differ in semantics, they are not separate systems. Both ultimately converge on the same execution path, which becomes important when reasoning about consistency, authorization, and failure modes.
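To make the catalog API surface concrete, here is a minimal sketch of a client connecting to Polaris through Apache Iceberg's standard RESTCatalog client. The endpoint URL, client credentials, and warehouse name are placeholders for illustration; the credential and scope properties are Iceberg REST client conventions commonly used with Polaris.

```java
import java.util.Map;
import org.apache.iceberg.rest.RESTCatalog;

public class PolarisClientSketch {
    public static void main(String[] args) {
        RESTCatalog catalog = new RESTCatalog();
        // Placeholder endpoint and credentials. Polaris serves the Iceberg
        // REST specification as its catalog API surface; the management API
        // is a separate set of HTTP endpoints on the same server.
        catalog.initialize("polaris", Map.of(
            "uri", "http://localhost:8181/api/catalog",
            "credential", "<client-id>:<client-secret>",
            "scope", "PRINCIPAL_ROLE:ALL",
            "warehouse", "my_catalog"));
        // A simple catalog read through the Iceberg REST surface.
        catalog.listNamespaces().forEach(System.out::println);
    }
}
```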
2. Realm
All requests in Polaris are evaluated within the context of a realm, which acts as the primary isolation boundary. A realm defines a logically independent namespace for metadata, configuration, and policy.
When a request enters the system, it is bound to a specific realm, and all subsequent operations - metadata resolution, authorization checks, and persistence - are scoped to that context. This allows a single Polaris deployment to host multiple tenants with strict separation, not just in metadata, but also in access control and operational behavior.
This concept is central to Polaris. Without understanding realms, it is difficult to reason about multi-tenancy, security boundaries, or even how catalogs are logically partitioned.
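One way to see the realm boundary from a client's perspective: Iceberg's REST client sends any property prefixed with header. as an HTTP header, which can be used to bind every request to a realm. This is a sketch under the assumption that the deployment resolves realms from a Polaris-Realm header; the header name is configurable on the server.

```java
import java.util.Map;
import org.apache.iceberg.rest.RESTCatalog;

public class RealmScopedClientSketch {
    static RESTCatalog connect(String realm) {
        RESTCatalog catalog = new RESTCatalog();
        catalog.initialize("polaris", Map.of(
            "uri", "http://localhost:8181/api/catalog",
            "credential", "<client-id>:<client-secret>",
            "scope", "PRINCIPAL_ROLE:ALL",
            "warehouse", "my_catalog",
            // Properties prefixed with "header." become HTTP headers in
            // Iceberg's REST client. The realm header name here is an
            // assumption; Polaris lets deployments configure it.
            "header.Polaris-Realm", realm));
        return catalog;
    }
}
```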
3. Runtime Layer
The runtime layer is implemented as a Quarkus-based server that hosts the REST endpoints and orchestrates request execution. It is responsible for parsing incoming requests, routing them to the appropriate handlers, and managing lifecycle concerns such as request context and dependency injection. However, it does not implement catalog semantics or metadata logic directly. Instead, it delegates all meaningful work to the domain layer.
This separation ensures that the transport layer remains decoupled from the system’s core logic, allowing Polaris to evolve its execution model independently of how requests are exposed or delivered.
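The shape of this separation can be sketched as a thin JAX-RS resource (the REST model Quarkus hosts) that does nothing but parse and delegate. This is a hypothetical illustration, not actual Polaris code; CatalogHandler stands in for the domain-layer entry point.

```java
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.core.Response;

// Hypothetical resource: the transport layer only parses and routes;
// catalog semantics live entirely behind the injected handler.
@Path("/v1/{prefix}/namespaces/{namespace}/tables")
public class TablesResource {

    private final CatalogHandler handler; // assumed domain-layer entry point

    public TablesResource(CatalogHandler handler) {
        this.handler = handler;
    }

    @GET
    public Response listTables(@PathParam("prefix") String prefix,
                               @PathParam("namespace") String namespace) {
        // No catalog logic here: the domain layer performs realm scoping,
        // authorization, and entity resolution.
        return Response.ok(handler.listTables(prefix, namespace)).build();
    }

    /** Assumed interface for the sketch. */
    public interface CatalogHandler {
        Object listTables(String prefix, String namespace);
    }
}
```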
4. Domain Layer
Beneath the runtime sits the domain layer, which contains the core components responsible for catalog logic, RBAC enforcement, entity modeling, and storage interaction policies. Components such as the metadata manager and credential scoping mechanisms are defined within this layer. The domain layer hosts the abstractions for catalog structure, authorization, and storage interaction, and acts as the execution target for requests delegated by the runtime.
5. Persistence Layer
Polaris persists its internal catalog state through a pluggable persistence layer that provides storage for entities such as catalogs, namespaces, principals, roles, and configuration. This abstraction allows Polaris to use different backends, such as JDBC or NoSQL systems, without coupling the system to a specific storage implementation.
6. External Systems
Polaris interacts with several external systems that lie outside its direct control.
On the storage side, Polaris integrates with object stores such as Amazon S3, Google Cloud Storage, or Apache Ozone. Polaris does not participate in data reads or writes during query execution. Instead, it integrates with these systems to validate configured locations and to facilitate access patterns used by compute engines.
On the identity side, Polaris integrates with authentication systems such as OAuth/OIDC providers. These systems establish the identity of the caller, which is then used within the domain layer to enforce authorization policies.
7. Extension Points
Polaris also exposes extension points that allow it to integrate with external systems. These include authorization plugins such as OPA and federation layers that enable interoperability with existing catalogs like Hive or Hadoop-based systems.
These extensions do not alter the core execution path, but they expand the environments in which Polaris can operate, allowing it to function as part of a broader, heterogeneous metadata ecosystem.
Now that we have gone over the system boundaries, the next step is to look at how Polaris actually behaves under real operations. The components described so far - API surfaces, runtime, domain logic, persistence, and external integrations - do not operate in isolation. Every request that enters Polaris is evaluated, transformed, and executed through these layers in a consistent way.
The boundaries describe where different parts of the system reside, but they do not explain what actually drives behavior. Every request ultimately depends on a small set of abstractions that are consistently applied across operations. These abstractions determine how state is represented, how access is enforced, how changes are persisted, and how those changes are applied to Iceberg tables.
In the next section, we focus on these core building blocks.
At the center of Polaris is an internal entity model that represents the catalog as a structured graph of objects. This model includes catalogs, namespaces, and table-like entities, as well as principals, and two kinds of roles - catalog roles (scoped to a catalog), and principal roles (scoped to the account). Table- and namespace-level privileges attach to catalog roles. Catalog roles are granted to principal roles, so a principal role bundles access across one or more catalogs. Principal roles are granted to principals, which are the users or service identities that call the APIs.
Entities are arranged hierarchically, with a root-level structure that organizes catalogs and their contained namespaces and tables. Each entity type (PolarisEntityType) has a well-defined role within this hierarchy.
This entity graph is the logical state of Polaris. All API operations, whether they originate from Iceberg REST calls or management interfaces, ultimately translate into reads or mutations of this graph. It is important to distinguish this state from Iceberg’s own metadata. The authoritative table and snapshot history live in Iceberg metadata files; Polaris may cache a small set of denormalized fields (for example, pointers and summary ids) on entities such as IcebergTableLikeEntity for catalog operations, but it does not store full schemas or snapshot histories as the system of record. It primarily maintains references to those structures while managing the higher-level organization and governance of tables.
Every operation in Polaris is evaluated against an authorization system before it is allowed to proceed. This system is built around the concept of an explicit operation abstraction, where each request is mapped to a well-defined action (for example, creating a table or loading metadata).
These operations are checked against privileges defined in the entity model. In the default configuration, Polaris uses a role-based access control (RBAC) model, where privileges are granted to roles and roles are assigned to principals through the two role layers (catalog and principal) described in the above section.
The evaluation determines whether the requesting principal (modeled as a PolarisPrincipal) is allowed to perform the given operation on the target entity.
The design separates the notion of an operation from the underlying privilege model. This allows Polaris to support alternative authorization mechanisms, such as delegating decisions to external policy engines, without changing the semantics of API-level actions. Regardless of the implementation, authorization is always on the critical execution path, i.e. no request can reach the underlying metadata or Iceberg layers without first passing through this check.
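A simplified sketch of this evaluation might look as follows. The types and privilege names here are assumptions for illustration, not the actual Polaris classes; the point is the shape of the check: map the operation to a required privilege, then walk principal roles to catalog roles to grants.

```java
import java.util.Set;

// Hypothetical sketch of the RBAC evaluation described above.
public class AuthorizerSketch {

    enum Operation { CREATE_TABLE, LOAD_TABLE, LIST_TABLES }
    enum Privilege { TABLE_CREATE, TABLE_READ, NAMESPACE_LIST }

    /** Maps each API-level operation to the privilege it requires. */
    static Privilege requiredPrivilege(Operation op) {
        return switch (op) {
            case CREATE_TABLE -> Privilege.TABLE_CREATE;
            case LOAD_TABLE -> Privilege.TABLE_READ;
            case LIST_TABLES -> Privilege.NAMESPACE_LIST;
        };
    }

    /**
     * Walks the two role layers: principal -> principal roles ->
     * catalog roles -> privileges granted on the target entity.
     */
    static boolean isAuthorized(Principal principal, Operation op,
                                String target, RoleStore roles) {
        Privilege needed = requiredPrivilege(op);
        return principal.principalRoles().stream()
            .flatMap(pr -> roles.catalogRolesOf(pr).stream())
            .anyMatch(cr -> roles.privilegesOf(cr, target).contains(needed));
    }

    record Principal(String name, Set<String> principalRoles) {}

    interface RoleStore {
        Set<String> catalogRolesOf(String principalRole);
        Set<Privilege> privilegesOf(String catalogRole, String targetPath);
    }
}
```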
Polaris maintains its entity model through a persistence abstraction that separates logical operations from the underlying storage implementation. At the lowest level, BasePersistence (and related) interfaces provide atomic read and write capabilities over entities - for example, writeEntity with compare-and-swap style expectations. This interface is implemented by concrete backends, which may use relational databases or NoSQL systems.
Above this interface sits a higher-level component responsible for orchestrating entity lifecycle operations, such as creation, updates, and deletion, as well as managing related concerns like grants and policy mappings. The PolarisMetaStoreManager interface is that metastore manager, providing a unified entry point for all catalog state mutations.
This separation is important for two reasons. First, it ensures that the core logic of Polaris is not tied to a specific storage technology. Second, it allows Polaris to treat entity-level operations as atomic and compare-and-swap–friendly; backends implement that using their own transaction or batching rules (for example, JDBC uses database transactions).
The persistence layer stores catalog state, not Iceberg table metadata. In practice, this means it records information such as the existence of a table entity, its namespace path, and the location (metadata-location) of its Iceberg metadata file, but not the contents of that metadata.
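Putting the last two points together, here is a hypothetical sketch of a table-like entity and a compare-and-swap write as a metastore-manager-like layer might perform it. The real BasePersistence and entity classes are richer; the field and method names here are assumptions.

```java
// Hypothetical sketch of CAS-style entity persistence, not Polaris code.
public class PersistenceSketch {

    /** Catalog state for a table: identity and a pointer, not Iceberg metadata. */
    record TableEntity(long id, String namespacePath, String name,
                       String metadataLocation, long entityVersion) {}

    interface EntityStore {
        TableEntity read(long id);
        /** Writes only if the stored entityVersion matches the expected one. */
        boolean writeIfVersionMatches(TableEntity updated, long expectedVersion);
    }

    /** Retry loop a metastore-manager-like layer could run on CAS conflicts. */
    static TableEntity updateMetadataLocation(EntityStore store, long id,
                                              String newLocation) {
        while (true) {
            TableEntity current = store.read(id);
            TableEntity updated = new TableEntity(current.id(),
                current.namespacePath(), current.name(), newLocation,
                current.entityVersion() + 1);
            if (store.writeIfVersionMatches(updated, current.entityVersion())) {
                return updated;
            }
            // Lost the race: another writer bumped the version; re-read, retry.
        }
    }
}
```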
A storage integration layer governs how tables map to object storage and how access to that storage is controlled. Each catalog (and optionally namespaces in the hierarchy) can carry storage configuration that defines allowed locations and access patterns.
When operations involve interaction with storage - such as creating a table or loading metadata - Polaris validates that the specified locations conform to catalog-level constraints. In addition, it may generate scoped credentials that grant compute engines temporary access to specific paths in object storage.
This layer is crucial in multi-tenant environments. It ensures that access to data is constrained not only by logical permissions but also by physical boundaries in storage. By mediating credentials and validating paths, Polaris enforces data isolation at the storage level. It does not act as the data plane for query execution on table files - clients use vended credentials to reach object storage directly, while Polaris may still read or write Iceberg metadata as part of catalog operations.
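From the client's side, credential vending is requested through the standard Iceberg REST access-delegation header. A minimal sketch, assuming a local Polaris endpoint and placeholder credentials:

```java
import java.util.Map;
import org.apache.iceberg.rest.RESTCatalog;

public class VendedCredentialsSketch {
    public static void main(String[] args) {
        RESTCatalog catalog = new RESTCatalog();
        catalog.initialize("polaris", Map.of(
            "uri", "http://localhost:8181/api/catalog",  // placeholder endpoint
            "credential", "<client-id>:<client-secret>",
            "warehouse", "my_catalog",
            // Standard Iceberg REST header: asks the catalog to vend scoped,
            // temporary storage credentials with table load responses, so the
            // engine can reach object storage directly.
            "header.X-Iceberg-Access-Delegation", "vended-credentials"));
    }
}
```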
The final and most critical building block is the execution bridge between Polaris and Iceberg. Polaris does not implement its own table format or metadata protocol. Instead, Polaris’s IcebergCatalog (Iceberg catalog implementation) delegates table and view operations to Iceberg’s APIs, using them as the authoritative mechanism for managing table state.
When a request such as table creation, view creation, or schema evolution is processed, IcebergCatalog translates the high-level intent into calls against Iceberg's catalog and table or view abstractions. These calls ultimately trigger the Iceberg commit protocol, which produces new metadata files and updates snapshot or version lineage in the object store (for tables) or the corresponding view metadata (for views).
This bridge is where Polaris moves from a control-plane service to an orchestrator of table and view metadata transitions. Requests are authorized and validated in the catalog context early in the path; execution then goes through Iceberg's catalog (IcebergCatalog) and TableOperations (or ViewOperations) contract, which writes new table metadata in object storage and runs Iceberg's commit protocol. After a successful commit, Polaris records its own view of the table - the entity identity, metadata-location, and selected denormalized fields - in the persistence layer.
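The pivotal call in that path is Iceberg's TableOperations.commit, a compare-and-swap on the table's metadata pointer. This is Iceberg's public contract, shown here as a minimal view outside Polaris' actual implementation:

```java
import org.apache.iceberg.TableMetadata;
import org.apache.iceberg.TableOperations;

public class CommitSketch {
    /**
     * The heart of Iceberg's optimistic commit protocol, invoked through a
     * catalog's TableOperations implementation (Polaris' being
     * BasePolarisTableOperations). The commit succeeds only if `base` still
     * matches the table's current metadata; otherwise Iceberg throws
     * CommitFailedException and the caller may refresh and retry.
     */
    static void commit(TableOperations ops, TableMetadata newMetadata) {
        TableMetadata base = ops.current(); // metadata the update was built on
        ops.commit(base, newMetadata);      // compare-and-swap on the pointer
    }
}
```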
The separation between Polaris state and Iceberg state is essential. Iceberg remains the source of truth for table structure and evolution, while Polaris governs how and when those transitions occur, and how they are exposed to different users and systems.
The system boundaries describe where different components reside, and the core building blocks define the abstractions that govern behavior. However, neither fully explains how a request is executed end-to-end.
In practice, every request in Polaris follows a consistent execution model, regardless of whether it originates from the Iceberg REST API or management interfaces. The differences between operations arise not from distinct execution paths, but from how each step interacts with catalog state, Iceberg metadata, and storage. At a high level, request execution in Polaris can be described as a sequence of transformations:
1. The request is bound to a realm, scoping all subsequent work to a tenant.
2. The operation is mapped to an explicit action and authorized against the entity model.
3. Target entities are resolved through the persistence abstraction.
4. The operation executes - either entirely against catalog state, or through the Iceberg execution bridge and the storage integration layer.
5. Resulting catalog-state changes are recorded through the persistence layer, and a response is returned.
The earlier building blocks map directly onto this flow: realms scope it, the authorization system gates it, the entity model and persistence layer supply and record catalog state, and the Iceberg bridge and storage integration handle physical metadata transitions.
This execution model is invariant across operations. What changes is whether an operation terminates at the catalog layer, or proceeds into Iceberg metadata and storage.
In the next section, we apply this model to three representative operations.
Execution of Common Operations in Polaris
The execution model defined earlier provides a uniform lens through which all Polaris operations can be understood. In this section, we apply that model to three representative operations, each illustrating a distinct interaction pattern with catalog state and Iceberg metadata.
Let's go over these three operations: LIST_TABLES (a pure catalog read), CREATE TABLE (catalog mutation plus metadata initialization), and COMMIT_TRANSACTION (ongoing table metadata evolution).
The LIST_TABLES operation represents the simplest execution path among the common catalog operations in Polaris. It is a catalog read that does not require interaction with Iceberg’s metadata commit protocol or object storage. Instead, it is resolved entirely within Polaris’ own catalog state.
The request enters through the Iceberg REST catalog API (API Surface) and is immediately bound to a Realm-specific context, ensuring that all subsequent execution is scoped to the correct tenant. At this stage, the Runtime layer performs minimal processing and delegates execution to the appropriate catalog handler.
Before any data is accessed, the request is evaluated by the Authorization layer. The operation is mapped to a well-defined action (such as listing tables within a namespace), and the system verifies that the requesting principal holds the necessary privileges on the target namespace. This step is mandatory and ensures that even read-only operations are subject to policy enforcement.
Once authorized, the system resolves the target namespace within the Entity model. The namespace acts as a node in the catalog graph, and the operation retrieves all table-like entities associated with it. This retrieval is performed through the Persistence abstraction, which accesses the underlying storage (e.g., relational or NoSQL backend) via the metastore manager. At no point does this operation invoke the Iceberg execution bridge (to load table metadata files) or interact with object storage. The result is constructed entirely from Polaris’ own catalog state, and the response is returned to the client.
An important point to understand here is that for operations like LIST_TABLES, Polaris does not discover tables by reading Iceberg metadata files in object storage (e.g., metadata.json). Instead, it enumerates persisted catalog entities maintained within its own catalog state. It maintains an independent catalog representation that can satisfy a class of operations without invoking Iceberg’s metadata layer. In doing so, it behaves as a true metadata service, not just a pass-through interface.
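From the client side this is a single catalog call against the Iceberg API; the namespace name below is a placeholder:

```java
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.Namespace;
import org.apache.iceberg.catalog.TableIdentifier;

public class ListTablesSketch {
    // A pure catalog read: with Polaris behind the REST catalog, this is
    // answered from the entity graph; no metadata.json is fetched from storage.
    static void listTables(Catalog catalog) {
        for (TableIdentifier table : catalog.listTables(Namespace.of("analytics"))) {
            System.out.println(table);
        }
    }
}
```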
Among table lifecycle operations, CREATE TABLE is the first point where Polaris must coordinate between its internal catalog state and Iceberg’s metadata model. Unlike LIST_TABLES, this operation involves both catalog mutation and metadata initialization, requiring interaction with storage and Iceberg’s commit protocol.
The request again enters through the Iceberg REST API (API Surface) and is immediately scoped to a specific realm, ensuring tenant isolation. The runtime layer performs initial validation and delegates execution to the catalog handler, where the operation is interpreted in terms of Polaris’ domain model. The following section describes the direct creation path, where the table is immediately committed and registered. Staged creation follows a different execution path and is not covered here.
The first step in execution is authorization. The system verifies that the requesting principal has permission to create a table within the specified namespace. This evaluation is performed against Polaris’ RBAC model or any configured external policy system. In addition, Polaris enforces constraints at the catalog level - for example, rejecting creation attempts on static or federated catalogs that do not support mutation.
Following authorization, Polaris performs validation within the entity model. This includes verifying that the namespace exists and ensuring that no table with the same identifier already exists. Up to this point, the operation remains catalog-local. However, this phase does not capture all validation: storage-backed checks, such as location validation and allowed-prefix enforcement, are performed later as part of the commit path.
The next phase transitions execution into the Iceberg execution bridge. Polaris translates the validated request into calls against Iceberg’s catalog and table abstractions. This involves constructing a table builder, configuring it with schema, partition specification, and properties, and invoking the create operation. This invocation triggers Iceberg’s metadata commit protocol. A new table metadata object is constructed and serialized into a metadata file, which is written to object storage. This step relies on the storage integration layer, which ensures that the target location is valid and that appropriate credentials are available. Polaris may generate scoped credentials to enforce isolation while allowing the write to proceed.
Once the metadata file is successfully written, Polaris performs catalog registration through the persistence layer. The table is inserted into the entity model, recording its identifier, namespace, metadata location, and selected denormalized fields required for catalog semantics. This ensures that subsequent operations can resolve the table directly through Polaris without recomputing its state. The response returned to the client includes the table's metadata representation and, if applicable, access credentials.
This operation highlights the dual-state nature of Polaris. Iceberg metadata files remain the authoritative source of truth for table structure, schema, and evolution, while Polaris maintains a catalog-level representation that references this state and augments it with governance and access control information. These two states are updated in sequence and since these steps span storage and catalog persistence, consistency across them depends on successful completion of both phases.
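Seen from a client, the entire flow above is triggered by one builder call against the Iceberg catalog API. A sketch with a placeholder namespace, table name, and schema:

```java
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.Namespace;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.types.Types;

public class CreateTableSketch {
    // Direct creation path as seen from a client: one call that, server-side,
    // walks authorization, entity validation, the Iceberg commit protocol,
    // and finally catalog registration in Polaris' persistence layer.
    static Table createEvents(Catalog catalog) {
        Schema schema = new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.optional(2, "ts", Types.TimestampType.withZone()));
        PartitionSpec spec = PartitionSpec.builderFor(schema).day("ts").build();
        return catalog
            .buildTable(TableIdentifier.of(Namespace.of("analytics"), "events"), schema)
            .withPartitionSpec(spec)
            .create();
    }
}
```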
While CREATE TABLE initializes table state, the ongoing evolution of tables is governed by the COMMIT_TRANSACTION operation. This is the core mutation primitive exposed by Iceberg and mediated by Polaris. Unlike CREATE TABLE, which operates on a single table, a commit transaction request may include updates for one or more tables within a single HTTP call, each of which is processed and committed independently. The request enters through the Iceberg REST API (API Surface) and is scoped to a specific realm, ensuring tenant isolation. The runtime layer parses the request and delegates execution to the transaction handler, which interprets the operation as a set of table-level updates.
Before any mutation is applied, Polaris evaluates authorization. The system verifies that the requesting principal holds sufficient table-level privileges for each target table, as defined by the operation semantics. This ensures that all updates within the transaction are permitted before execution proceeds. Once authorized, Polaris resolves each target table through its entity model, retrieving the current metadata location and associated configuration required to interact with Iceberg. This step binds Polaris’ catalog state to the corresponding Iceberg table state for every table included in the request.
Execution then transitions into the Iceberg execution bridge. For each table, Polaris applies the requested updates sequentially and invokes Iceberg's TableOperations.commit as implemented by Polaris (BasePolarisTableOperations). This commit path performs validation, detects conflicts, and constructs a new metadata state, which is serialized into an updated metadata file and written to object storage via the storage integration layer. These commits are executed per table and rely on Iceberg's compare-and-swap semantics to ensure metadata-level atomicity.
A key property of this phase is that atomicity is scoped to individual tables. Iceberg guarantees that each table’s metadata transition is applied atomically only if the commit succeeds without conflicts. Polaris relies on this mechanism for metadata correctness and does not implement a separate concurrency model for on-storage state. However, Polaris does maintain its own catalog state, and coordination between these layers is handled explicitly.
After all table-level commits are attempted, Polaris applies updates to its catalog state through the persistence layer. Rather than updating entities individually, these changes are accumulated and flushed in a conditional batch operation, ensuring that catalog entries remain aligned with the committed metadata. This particular update reflects changes such as metadata locations and selected denormalized fields maintained for catalog semantics.
As introduced in the CREATE TABLE flow, Polaris maintains a separation between Iceberg metadata state and its own catalog representation. In the context of COMMIT_TRANSACTION, this separation becomes more pronounced, with Iceberg handling atomic metadata evolution at the table level, and Polaris aligning its catalog state to reflect those committed changes. Together, this coordination ensures that table evolution remains consistent across both metadata layers while preserving Iceberg as the source of truth for table state.
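On the client side, a multi-update commit against a single table can be expressed through Iceberg's transaction API; the property and column below are placeholders. Each staged update becomes part of one atomic metadata swap for that table, matching the per-table atomicity described above:

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.Transaction;
import org.apache.iceberg.types.Types;

public class CommitTransactionSketch {
    // Multiple updates to one table, staged and committed together. Through
    // the REST catalog this surfaces as a commit request that Polaris
    // authorizes, then applies via Iceberg's compare-and-swap commit; Polaris
    // afterwards realigns its own catalog entity (e.g. the metadata-location).
    static void evolve(Table table) {
        Transaction txn = table.newTransaction();
        txn.updateProperties().set("owner", "analytics-team").commit();
        txn.updateSchema().addColumn("source", Types.StringType.get()).commit();
        txn.commitTransaction(); // single atomic metadata swap for this table
    }
}
```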
Federation in Polaris
Polaris is typically introduced as a catalog that manages Iceberg table metadata through its own persistence layer. However, not all catalog states need to originate within Polaris itself. In many deployments, metadata already exists in external systems, such as another Iceberg REST catalog or a Hive Metastore (HMS). Rather than requiring migration, Polaris introduces federation as a mechanism to integrate with these systems while retaining control over access, policy, and execution semantics.
At a high level, federation in Polaris follows a broker pattern. Polaris continues to own the catalog identity within its entity model, including namespaces, RBAC bindings, and catalog-level configuration. However, when a catalog is configured with a connection to an external backend, the responsibility for metadata resolution and mutation is delegated to that system. Polaris acts as an intermediary, routing requests through its standard execution model while deferring table-level operations to the remote catalog.
As of today, there are two forms of federation: federating to another Iceberg REST catalog, and federating to a Hive Metastore (HMS).
In both cases, the external system remains the source of truth for table metadata, while Polaris acts as a control layer.
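The broker pattern can be sketched as a wrapper that enforces local policy before deferring to the remote catalog. This is a hypothetical illustration of the pattern, not Polaris code; the Authorizer type and operation names are assumptions.

```java
import java.util.List;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.Namespace;
import org.apache.iceberg.catalog.TableIdentifier;

// Hypothetical broker: local policy is enforced first, then metadata
// resolution is deferred to the external catalog, which stays the
// source of truth for table metadata.
public class FederatedCatalogSketch {

    private final Catalog remote;        // e.g. an HMS- or REST-backed catalog
    private final Authorizer authorizer; // Polaris-side RBAC (assumed type)

    FederatedCatalogSketch(Catalog remote, Authorizer authorizer) {
        this.remote = remote;
        this.authorizer = authorizer;
    }

    List<TableIdentifier> listTables(String principal, Namespace ns) {
        authorizer.check(principal, "LIST_TABLES", ns.toString()); // local decision
        return remote.listTables(ns);                              // remote resolution
    }

    Table loadTable(String principal, TableIdentifier id) {
        authorizer.check(principal, "LOAD_TABLE", id.toString());
        return remote.loadTable(id);
    }

    interface Authorizer {
        void check(String principal, String operation, String target);
    }
}
```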
Conclusion
By examining Polaris’s system boundaries, building blocks, and execution model, we see that it is not just a catalog implementation, but a control plane for metadata systems.
Across all operations, Polaris maintains a clear separation of concerns. The entity model defines the logical structure of the catalog, the authorization layer governs access, and the persistence layer records catalog state. The Iceberg execution bridge handles metadata transitions, while the storage integration layer enforces physical access boundaries. These components operate together through a consistent execution model, regardless of whether the operation is a simple catalog read or a metadata commit.
A key design principle that surfaces throughout is the separation between catalog state and table state. Polaris does not attempt to own or replicate Iceberg metadata. Instead, it maintains references to that state while governing how it is accessed and evolved. In that sense, Polaris moves beyond the role of a catalog implementation. It is a system that standardizes how metadata systems are composed and governed.
If you are interested in reading deeply technical blogs like this, make sure to join the Cloudera Community and learn what we are building for developers in our Developer Hub.