Developer Blogs

dipankartnt · ‎06-16-2026

Over the past year, we have seen an explosion of interest in AI agents. Whether it's coding assistants, research agents, or enterprise copilots, the common theme is that modern language models are increasingly capable of planning, reasoning, and invoking tools to accomplish complex tasks.

But reasoning alone is not enough. Recent discussions across the industry, including work from OpenAI and Anthropic, have highlighted a recurring challenge: enterprise agents are only as good as the context available to them. Producing reliable outcomes requires much more than generating SQL or writing code. Agents need access to business metadata, schemas, relationships between datasets, organizational knowledge, and repeatable execution patterns that guide how work should be performed. This is why I often describe the problem as an enterprise context problem rather than simply an LLM problem. I recently spoke about this here.

In an enterprise data platform, that context already exists in many forms:

Table schemas describe the structure of data
Catalogs expose metadata and ownership information
Lineage systems capture dependencies between datasets
Governance layers define who can access what
Organizations also maintain standard operating procedures for recurring tasks such as quality validation, incident response, and compliance checks

Together, these pieces form the context plane that allows agents to operate with confidence instead of guesswork. This becomes even more important when multiple agents collaborate. A planner can decompose a task into smaller units, but the specialized agents responsible for execution still need grounded context and well-defined responsibilities. Otherwise, they simply become multiple disconnected language models making independent assumptions.

To show this in action, let's build a practical multi-agent workflow for one of the most common problems in enterprise data platforms: investigating data quality issues. Using Agent Studio in Cloudera AI on an Open Lakehouse foundation (with Apache Iceberg), we will create an orchestrated system where specialized agents collaborate to execute a standardized data quality playbook, validate findings against live data, and produce evidence-backed conclusions grounded in the enterprise context available to them. Along the way, we will see how enterprise context, well-defined responsibilities, and orchestration enable agents to move beyond generic reasoning and perform repeatable, trustworthy work.

Use Case: Automated Data Quality Investigation

Data quality investigations are a routine part of operating modern data platforms. Imagine you are a data engineer who receives an alert that a business dashboard is showing unexpected numbers. The immediate assumption is often that the underlying data is corrupt, but in practice that's only one of many possibilities. The issue could stem from duplicate records, missing values, invalid timestamps, schema changes, delayed ingestion, or even an upstream transformation that silently introduced inconsistencies.

Most organizations already have a standard playbook for these investigations. Engineers inspect the table schema, examine row counts, look for null values, identify duplicate records, validate key constraints, compare distributions, and run a series of SQL queries to determine whether a genuine quality issue exists. While the exact checks vary between organizations, the process itself is highly repetitive and follows a predictable pattern.

This makes it an ideal candidate for an agentic workflow. Rather than asking an LLM to reason about the entire problem, we can encode the investigation process into specialized agents with clearly defined responsibilities.

For this blog, we will use an airline dataset stored as Apache Iceberg tables and implement a simple eight-check validation playbook. The objective is not to build a comprehensive data quality framework, but to demonstrate how enterprise context and repeatable procedures can be incorporated into an agentic workflow to automate a common operational task while keeping every conclusion grounded in evidence.
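
To make the idea of a validation playbook concrete, here is a minimal sketch of a few checks of the kind described above (row counts, nulls, duplicates, implausible delays) written as plain SQL. The actual eight checks are not listed in this post, so the queries and the column names (flight_id, carrier, dep_delay) are hypothetical, and an in-memory SQLite table stands in for the Iceberg table queried through Impala.

```python
import sqlite3

# Hypothetical checks; the real playbook's SQL is not shown in this post.
PLAYBOOK = {
    "row_count": "SELECT COUNT(*) FROM flights",
    "null_carrier": "SELECT COUNT(*) FROM flights WHERE carrier IS NULL",
    "duplicate_flight_ids": (
        "SELECT COUNT(*) FROM ("
        "SELECT flight_id FROM flights GROUP BY flight_id HAVING COUNT(*) > 1"
        ")"
    ),
    "implausible_delays": (
        "SELECT COUNT(*) FROM flights WHERE dep_delay < -60 OR dep_delay > 1440"
    ),
}


def run_playbook(conn: sqlite3.Connection) -> dict:
    """Run every check and return {check_name: {"sql": ..., "value": ...}}."""
    evidence = {}
    for name, sql in PLAYBOOK.items():
        value = conn.execute(sql).fetchone()[0]
        evidence[name] = {"sql": sql, "value": value}
    return evidence


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table with a few deliberately bad rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (flight_id INTEGER, carrier TEXT, dep_delay INTEGER)"
    )
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [
            (1, "AA", 5),
            (2, "DL", 12),
            (2, "DL", 12),    # duplicate flight_id
            (3, None, 0),     # missing carrier
            (4, "UA", 5000),  # implausible delay in minutes
        ],
    )
    return conn


evidence = run_playbook(demo_connection())
for name, result in evidence.items():
    print(f"{name}: {result['value']}")
```

Each entry keeps the SQL next to its result, so whatever is reported later can be traced back to the query that produced it.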

Designing the Multi-Agent Workflow

At first glance, this entire use case could be implemented as a single agent. Given a natural language request, the agent could inspect the schema, generate SQL, execute queries, interpret the results, and produce a report.

In practice, however, combining all of these responsibilities into one component makes the system harder to reason about and significantly more difficult to validate. Retrieval, execution, and reasoning are fundamentally different tasks, each requiring different context and different constraints.

Instead, we decompose the workflow into specialized agents with clearly defined responsibilities.

The Workflow Manager Agent acts as the orchestrator. It receives the user's request, coordinates the execution sequence, and delegates work to the appropriate specialist rather than performing domain-specific analysis itself. Its only job is to make sure each task reaches the right specialist at the right time.

The Data Warehouse Query Specialist Agent is responsible for interacting with the enterprise data platform (lakehouse). Rather than producing business conclusions, it focuses exclusively on executing governed, read-only SQL queries against Apache Iceberg tables through the Iceberg MCP server, with Impala as the query engine. For our use case, the organization's Data Quality Engineering team has already established a standardized eight-check validation playbook covering freshness, row counts, null profiling, duplicate detection, referential integrity, and delay sanity checks. The Query Specialist executes this playbook and returns factual evidence.

Finally, the Root Cause Analysis (RCA) Agent consumes the SQL results produced by the Query Specialist and synthesizes them into an investigation report. Importantly, it does not generate SQL or infer information that was never retrieved. Its conclusions are grounded entirely in the evidence collected during execution, allowing the final report to remain explainable and traceable back to the underlying data.

This separation of responsibilities is intentional. Rather than asking one general-purpose agent to perform every step of the workflow, each agent specializes in a narrow domain while relying on the others for complementary capabilities. The result is a system that is easier to extend, easier to test, and more aligned with how enterprise teams already operate. Viewed through the lens of enterprise AI, the goal is not simply to build multiple agents. It is to isolate responsibilities while ensuring that each agent operates with the right context, the right tools, and clearly defined boundaries.

Implementing the Workflow in Cloudera AI Agent Studio

With the use case defined, the next step is translating it into an executable workflow inside Cloudera AI Agent Studio. Rather than building a single all-knowing agent, we intentionally split the responsibility across specialized components with clearly defined roles and boundaries.

To illustrate the interaction between these components, consider a user asking the system to investigate potential data quality issues in an Apache Iceberg table. The Workflow Manager first interprets the request and delegates execution to the Query Specialist. The Query Specialist then retrieves the necessary data and executes the organization's standardized validation playbook against the lakehouse, returning structured evidence rather than conclusions. The RCA Agent then consumes those results, synthesizes the findings, and produces a report grounded entirely in the collected evidence. At no point does the RCA Agent generate SQL or speculate beyond what was actually observed during execution.
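
The hand-off described above can be sketched in plain Python. This is not Agent Studio's API: the three functions are hypothetical stand-ins for the agents, and the evidence is hard-coded so that the control flow is easy to follow.

```python
def query_specialist(request: str) -> list[dict]:
    """Stand-in for the Query Specialist: returns evidence, never conclusions."""
    # Hard-coded sample evidence; a real agent would run SQL through the MCP server.
    return [
        {"check": "row_count", "sql": "SELECT COUNT(*) FROM flights", "value": 5},
        {
            "check": "duplicate_keys",
            "sql": "SELECT COUNT(*) FROM duplicate_flight_ids",
            "value": 1,
        },
    ]


def rca_agent(evidence: list[dict]) -> str:
    """Stand-in for the RCA Agent: reasons only over the evidence it is given."""
    lines = [f"- {item['check']}: {item['value']}" for item in evidence]
    return "RCA report (evidence only)\n" + "\n".join(lines)


def workflow_manager(request: str, collect=query_specialist, analyze=rca_agent) -> str:
    """Stand-in for the manager: orders the steps and does no analysis itself."""
    evidence = collect(request)  # step 1: gather evidence
    if not evidence:
        # Without evidence there is nothing to reason over, so stop here.
        raise RuntimeError("no evidence collected; refusing to produce a report")
    return analyze(evidence)     # step 2: interpret the evidence


print(workflow_manager("Investigate data quality issues in airlines.flights_iceberg"))
```

The manager never inspects the evidence beyond checking that some exists, which mirrors the rule that reasoning only starts after retrieval has finished.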

Let’s see this in action.

Step 1: Launch Cloudera Agent Studio and Create a Workflow

The first step is creating a new Workflow to start our Agent development. Cloudera Agent Studio allows you to create and save workflows to be reused in future tasks.

Step 2: Enable & Configure the Manager Agent

The next step presents the workflow canvas where agents and tasks can be configured. Before adding individual agents, enable the Manager Agent option at the top of the page.

For this example, we configure a custom manager named Airline Data Quality Workflow Manager with the role Workflow Orchestrator. Rather than participating in the investigation itself, its responsibility is simply to coordinate execution across the workflow. The backstory and goal instruct the manager to delegate every incoming data quality request to the Data Warehouse Query Specialist, wait for the SQL validation results, forward those results to the Root Cause Analysis Agent, and finally return the completed RCA report.

Keeping the manager focused solely on orchestration is an important design choice. It does not generate SQL, analyze findings, or produce conclusions. Instead, it acts as the control plane for the workflow, ensuring that evidence is gathered before any reasoning takes place and that each specialized agent operates within its intended responsibility.

Step 3: Add the Data Warehouse Query Agent and Iceberg MCP Server

Once we have the Manager Agent set up, let’s start creating our two specialized agents. Click on “Add your First Agent”.

The first is the Data Warehouse Query Specialist. This agent serves as the execution layer of the workflow and is the only component responsible for interacting with the underlying data platform.

As shown below, we configure the agent with an appropriate Name, Role, Backstory, and Goal. Rather than giving the language model unrestricted freedom, the backstory constrains the agent to a narrowly defined responsibility: execute read-only SQL against the Apache Iceberg airline tables through the Iceberg MCP server and return factual query results. It is explicitly instructed not to perform root cause analysis or summarize business impact, leaving those responsibilities to downstream agents.

Screenshot 2026-06-11 at 11.30.27 AM.png

The goal further reinforces this behavior by directing the agent to execute a standardized eight-check data quality validation playbook and return both the SQL statements executed and their actual results.
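
One way to picture the four fields is as a small record that gets folded into the instructions the model receives. The sketch below is only an illustration: the AgentSpec class and its system_prompt method are hypothetical, not Agent Studio's storage format, and the field values paraphrase the configuration described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical container for the four fields set in the agent form."""
    name: str
    role: str
    backstory: str
    goal: str

    def system_prompt(self) -> str:
        """Combine the four fields into one instruction string."""
        return (
            f"You are {self.name}, acting as {self.role}.\n"
            f"Backstory: {self.backstory}\n"
            f"Goal: {self.goal}"
        )


QUERY_SPECIALIST = AgentSpec(
    name="Data Warehouse Query Specialist",
    role="SQL execution specialist",  # wording assumed for the example
    backstory=(
        "Execute read-only SQL against the Apache Iceberg airline tables "
        "through the Iceberg MCP server and return factual query results. "
        "Do not perform root cause analysis or summarize business impact."
    ),
    goal=(
        "Run the standardized eight-check data quality validation playbook "
        "and return both the SQL statements executed and their actual results."
    ),
)

print(QUERY_SPECIALIST.system_prompt())
```

Keeping the prohibition ("do not perform root cause analysis") in the backstory and the deliverable in the goal keeps the boundary between the two agents explicit.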

By ensuring that one specialized agent focuses exclusively on data retrieval while another interprets the evidence, we reduce the likelihood of unsupported conclusions and make the overall system easier to validate and maintain.

Next, we connect this agent to the Iceberg MCP server so that it can securely execute SQL against the underlying Open Lakehouse.

The final piece required by the Query Specialist is access to the underlying data platform (lakehouse). We will add the Iceberg MCP server from this GitHub repository to our agent’s tools. This particular server provides read-only access to Iceberg tables via Apache Impala. It enables LLMs to inspect database schemas and execute read-only queries via these functions:

execute_query(query: str): Run any SQL query on Impala and return the results as JSON
get_schema(): List all tables available in the current database
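
As an illustration of how a read-only tool surface of this shape might behave, the sketch below mimics the two function names listed above on top of an in-memory SQLite database. It is not the Iceberg MCP server's code: the is_read_only guard, the SQLite backend, and the sample table are all assumptions made for the example.

```python
import json
import sqlite3

# In-memory stand-in for the lakehouse; the real server talks to Impala.
_conn = sqlite3.connect(":memory:")
_conn.execute("CREATE TABLE flights (flight_id INTEGER, carrier TEXT)")
_conn.executemany("INSERT INTO flights VALUES (?, ?)", [(1, "AA"), (2, "DL")])

READ_ONLY_PREFIXES = ("select", "with")


def is_read_only(query: str) -> bool:
    """Naive guard: accept one statement that starts with a read-only keyword."""
    statement = query.strip().rstrip(";")
    if ";" in statement:  # reject stacked statements such as "SELECT 1; DROP ..."
        return False
    return statement.lower().startswith(READ_ONLY_PREFIXES)


def execute_query(query: str) -> str:
    """Run a read-only query and return the rows as a JSON string."""
    if not is_read_only(query):
        raise PermissionError("only read-only queries are allowed")
    cursor = _conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


def get_schema() -> str:
    """List the tables in the current database as a JSON string."""
    rows = _conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return json.dumps([name for (name,) in rows])


print(get_schema())
print(execute_query("SELECT COUNT(*) AS n FROM flights"))
```

A prefix check like this is deliberately conservative and easy to bypass in a real engine, so in practice read-only access is better enforced through the permissions of the database user the server connects as.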

In our workflow, the Data Warehouse Query Specialist uses these tools to execute the predefined data quality validation playbook and return factual evidence that can later be interpreted by the Root Cause Analysis Agent.

Step 4: Add the Root Cause Analysis Agent

With the Query Specialist responsible for interacting with the data platform, we can now create a second specialized agent dedicated exclusively to interpretation. As shown below, we define a Root Cause Analysis Agent whose responsibility is to consume the SQL evidence returned by the Query Specialist and transform it into a structured investigation report.

Screenshot 2026-06-11 at 11.54.25 AM.png

Its backstory and goal explicitly constrain it to reason only over the evidence it receives, producing an RCA report that summarizes the validation checks performed, confirmed findings, severity, likely causes, and recommended next steps. Again, by isolating data retrieval from analysis, the workflow ensures that conclusions remain grounded in actual query results rather than model assumptions.
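
The "evidence in, report out" contract can be sketched without any model at all. In the example below, the CheckResult record, the 5% severity threshold, and the report layout are hypothetical choices made for illustration; the post does not specify how the RCA Agent grades severity.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    sql: str
    violations: int  # number of offending rows returned by the check


def severity(violations: int, total_rows: int) -> str:
    """Hypothetical rule: grade by the share of rows that violate a check."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if violations == 0:
        return "ok"
    return "high" if violations / total_rows >= 0.05 else "low"


def build_report(results: list[CheckResult], total_rows: int) -> str:
    """Summarize only what the evidence shows; every finding cites its SQL."""
    lines = [f"Checks performed: {len(results)}"]
    for result in results:
        level = severity(result.violations, total_rows)
        if level != "ok":
            lines.append(
                f"[{level}] {result.name}: {result.violations} rows "
                f"(evidence: {result.sql})"
            )
    if len(lines) == 1:
        lines.append("No issues confirmed by the collected evidence.")
    return "\n".join(lines)
```

A finding appears in the report only if a query result supports it, and the SQL travels with the finding so a reviewer can rerun it.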

This is what our workflow looks like now.

Screenshot 2026-06-11 at 1.33.52 PM.png

Step 5: Add Task

With the agents in place, the final piece is defining the Task that ties the workflow together. As shown below, the task accepts a natural language input ({NLQ}) and specifies the expected output of the workflow.

Screenshot 2026-06-11 at 1.36.19 PM.png

The task describes the objective at a high level and leaves the execution strategy to the specialized agents we configured earlier. In our case, the task simply instructs the workflow to investigate a data quality issue and expects a concise root cause analysis report grounded in the results returned by the Data Warehouse Query Specialist. So, the task defines what needs to be accomplished, while the agents define how it should be accomplished.
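
The {NLQ} placeholder works like an ordinary template variable. The sketch below shows the substitution with Python string formatting; the wording of TASK_DESCRIPTION and EXPECTED_OUTPUT is assumed for the example and is not copied from the configured task.

```python
# Assumed wording; only the {NLQ} placeholder name comes from the task form.
TASK_DESCRIPTION = (
    "Investigate the following data quality request: {NLQ}. "
    "Base every finding on results returned by the Data Warehouse Query Specialist."
)
EXPECTED_OUTPUT = "A concise root cause analysis report grounded in the SQL evidence."


def render_task(nlq: str) -> str:
    """Fill the {NLQ} placeholder with the user's natural language request."""
    if not nlq.strip():
        raise ValueError("the request must not be empty")
    return TASK_DESCRIPTION.format(NLQ=nlq.strip())


print(render_task("Investigate data quality issues in airlines.flights_iceberg"))
```

The template carries the "what"; nothing in it says which queries to run, which stays with the agents.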

The Workflow Manager interprets the request, delegates execution to the Query Specialist, waits for the SQL evidence to be collected through the Iceberg MCP server, and then forwards that evidence to the Root Cause Analysis Agent before returning the final report to the user.

Step 6: Configure Runtime Settings and MCP Connection

Before testing the workflow, there are a few runtime settings that should be configured. The Max New Tokens parameter defines the maximum length of the responses that agents can generate during execution. Since our Data Warehouse Query Specialist may need to return multiple SQL statements and corresponding results, and the Root Cause Analysis Agent produces a structured investigation report, it is generally advisable to allocate a sufficiently large token budget. In this example, we use a value of 4096.

The Temperature setting controls how deterministically the language model behaves. Higher values encourage more varied and creative responses, while lower values produce more consistent outputs. For operational workflows such as SQL validation and root cause analysis, reproducibility is typically preferred over creativity, so using a relatively low temperature is a sensible choice.
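
The effect of temperature is easiest to see on a toy example. The sketch below applies the standard temperature-scaled softmax to three made-up logits; how a particular model or serving stack applies the setting internally may differ, so treat it as an illustration of the principle only.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.5)
print(f"T=0.2 top-token probability: {low[0]:.3f}")
print(f"T=1.5 top-token probability: {high[0]:.3f}")
```

At the lower temperature almost all of the probability mass sits on the highest-scoring token, which is why repeated runs tend to produce the same output.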

Finally, the Iceberg MCP server requires connection information for the underlying Impala cluster. Under Tools and MCPs, populate the required environment variables such as the Impala host, port, and authentication credentials. Once configured, these values are securely supplied to the MCP server whenever the Data Warehouse Query Specialist invokes its tools. At this point, the workflow is fully assembled.
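
For readers curious about what happens to those values, the sketch below shows one way a server process could read and validate them at startup. The variable names (IMPALA_HOST, IMPALA_PORT, IMPALA_USER, IMPALA_PASSWORD) and the default port are assumptions for the example; use whatever names the MCP server's own documentation specifies.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpalaSettings:
    host: str
    port: int
    user: str
    password: str


def load_settings(env=None) -> ImpalaSettings:
    """Read connection settings from environment variables (names are assumed)."""
    env = os.environ if env is None else env
    required = ("IMPALA_HOST", "IMPALA_USER", "IMPALA_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        # Fail early with a clear message instead of at the first query.
        raise KeyError(f"missing required variables: {', '.join(missing)}")
    return ImpalaSettings(
        host=env["IMPALA_HOST"],
        # 21050 is Impala's usual HiveServer2 port; your deployment may differ.
        port=int(env.get("IMPALA_PORT", "21050")),
        user=env["IMPALA_USER"],
        password=env["IMPALA_PASSWORD"],
    )


example = load_settings(
    {
        "IMPALA_HOST": "impala.example.internal",
        "IMPALA_PORT": "443",
        "IMPALA_USER": "dq_reader",
        "IMPALA_PASSWORD": "not-a-real-secret",
    }
)
print(f"{example.user}@{example.host}:{example.port}")
```

Reading credentials from the environment keeps them out of the agent's prompts and out of the workflow definition itself.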

Step 7: Execute a Data Quality Investigation

With the workflow fully configured, we are ready to execute our first investigation. Agent Studio exposes a simple natural language interface where users can describe the task they want to perform.

For this example, we use the following prompt:

Investigate data quality issues in airlines.flights_iceberg and produce a root cause analysis report based on the standard validation playbook.

Once submitted, the Workflow Manager interprets the request and delegates execution to the Data Warehouse Query Specialist, which invokes the Iceberg MCP server to run the predefined validation playbook against the Apache Iceberg tables. Importantly, the agent is not free to run arbitrary analysis: it operates within predefined governance boundaries, using standardized checks and controlled access to the underlying data platform. Once the SQL statements and their results have been collected, they are forwarded to the Root Cause Analysis Agent, which synthesizes the evidence into a structured RCA report.

Here is a snippet of the output PDF.

Screenshot 2026-06-11 at 2.30.10 PM.png

Final Thoughts

While the use case in this blog focused on data quality, the broader message is about architecture. Building reliable AI agents requires more than a capable language model: it requires an integrated platform where models, enterprise data, governance, and execution frameworks work together seamlessly.

In many enterprises, the open lakehouse serves as the long-term context layer for AI agents, capturing the organization's structured knowledge in the form of Apache Iceberg tables, historical data, and operational state. Catalogs complement this by exposing metadata, schemas, and relationships that help agents discover and reason about that context. By connecting Cloudera AI to this governed foundation through the Iceberg MCP server, our workflow was able to retrieve live evidence and ground every conclusion in actual enterprise data rather than model assumptions.

Cloudera Agent Studio made it straightforward to compose and orchestrate a multi-agent workflow using specialized roles with clear responsibilities. As enterprise AI applications continue to evolve, this combination of governed data access, specialized agents, and open standards provides a practical blueprint for building trustworthy, production-ready agentic systems.

If you are interested in learning more, check out:

Cloudera Community

Building a Multi-Agent Data Quality Workflow inside the Cloudera AI Lakehouse (ft. Apache Iceberg)