Autonomous agents have met their biggest challenge yet: The database.

AI agents can build B+ trees and buffer managers, but CMU's Andy Pavlo says the query optimizer and autonomous database remain their toughest unsolved challenge.

Jun 4th, 2026 2:53pm by Chris J. Preimesberger

Photo by Logan Voss on Unsplash

As large language models evolve from mere chatbots into autonomous agents capable of reasoning, planning, and acting, they are beginning to orchestrate complex application stacks on their own.

However, these agents are now encountering their most formidable obstacle: the database.

*Andy Pavlo. Credit: Carnegie Mellon University*

“Databases pose the hardest and most important challenge for agents, due to their unforgiving correctness and performance requirements,” Andy Pavlo, Associate Professor of Computer Science at Carnegie Mellon University, told attendees last week at the Percona Live 2026 conference, held at the Computer History Museum in Mountain View, California.

In a discussion on the intersection of AI and open-source infrastructure, Pavlo contended that while coding agents can readily regurgitate standard data structures, the database remains the most difficult part of any system to automate and optimize.

“For example, if an agent hallucinates a UI component, the page looks slightly off; if it hallucinates a query or a configuration change in a production database, the entire system can vanish,” Pavlo says.

Now THAT would be a cause for alarm.

The multi-agent tug-of-war

Pavlo identifies two primary ways AI is impacting the database world: tuning agents and coding agents. Tuning agents aim to solve the “black magic” of database optimization — automatically adjusting system knobs, physical designs (such as indexes), and query execution strategies. Historically, this required a human database administrator (DBA) to spend years developing the intuition to know which configuration would yield better latency or throughput.

The challenge is that these specialized agents often operate in silos, Pavlo said. A knob-tuning agent might be unaware of what an index-tuning agent is doing, so the combined result can settle into a local minimum: better than the stock configuration but far from optimal. CMU’s research into multi-round and sequential tuning aims to solve this with a coordinating framework, though even this faces a “curse of dimensionality,” Pavlo said.
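The silo problem can be sketched with a toy example. The Python below is an illustration only, not CMU's framework: the latency table, the two "agents," and the setting names are all invented, and the coordinated version is a simple take-turns search rather than any published algorithm.

```python
# Hypothetical latency (ms) for each (buffer size, index) pair. Numbers are invented.
LATENCY = {
    ("small", "none"): 100, ("medium", "none"): 85, ("large", "none"): 80,
    ("small", "idx"): 75, ("medium", "idx"): 60, ("large", "idx"): 70,
}
BUFFERS = ["small", "medium", "large"]
INDEXES = ["none", "idx"]
STOCK = ("small", "none")  # assumed out-of-the-box configuration


def tune_buffer(index):
    """Knob agent: pick the best buffer size for a given index choice."""
    return min(BUFFERS, key=lambda b: LATENCY[(b, index)])


def tune_index(buffer):
    """Index agent: pick the best index choice for a given buffer size."""
    return min(INDEXES, key=lambda i: LATENCY[(buffer, i)])


def siloed():
    """Each agent tunes against the stock settings, unaware of the other."""
    return (tune_buffer(STOCK[1]), tune_index(STOCK[0]))


def sequential(max_rounds=10):
    """Agents take turns, each seeing the other's latest choice, until stable."""
    buffer, index = STOCK
    for _ in range(max_rounds):
        new_buffer = tune_buffer(index)
        new_index = tune_index(new_buffer)
        if (new_buffer, new_index) == (buffer, index):
            break
        buffer, index = new_buffer, new_index
    return (buffer, index)


print("stock     ", STOCK, LATENCY[STOCK])                # 100 ms
print("siloed    ", siloed(), LATENCY[siloed()])          # 70 ms
print("sequential", sequential(), LATENCY[sequential()])  # 60 ms
```

On this invented table the siloed agents beat stock but miss the best setting, while taking turns finds it. On other cost surfaces a take-turns search can still stall short of the optimum, which is one reason coordination alone does not make the problem easy.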

Carnegie Mellon’s Database Group pioneered the concept of self-driving and machine-learning-driven database optimization. Sequential tuning and multi-round tuning are prime components of their autonomous database management system (DBMS) projects.

In this context, multi-round and sequential tuning means running the specialized tuning agents one after another over repeated rounds, so that each agent works from the configuration the previous agents produced instead of tuning in isolation. The goal is for decisions about knobs, indexes, and query execution to account for one another.

Because the number of possible configurations grows exponentially with the number of knobs, indexes, and query options, the search space runs into the trillions of combinations and cannot be searched exhaustively.
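A back-of-the-envelope calculation shows how quickly the numbers get there. The knob and index counts in this Python sketch are invented for illustration and do not describe any particular database.

```python
from math import prod


def config_space_size(options_per_knob):
    """Distinct knob configurations: the product of each knob's option count."""
    return prod(options_per_knob)


def index_subsets(candidate_indexes):
    """Each candidate index is either built or not, giving 2**n possible index sets."""
    return 2 ** candidate_indexes


# Hypothetical system: 6 knobs with 10 candidate values each, plus 20 candidate indexes.
knob_configs = config_space_size([10] * 6)   # 1,000,000
index_configs = index_subsets(20)            # 1,048,576
total = knob_configs * index_configs

print(f"{total:,} combined configurations")  # 1,048,576,000,000
```

Even this small, made-up system has just over a trillion combinations, and each additional knob or candidate index multiplies the total again.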

The coding agent advantage and the optimizer wall

On the development side, coding agents are already proving to be hyper-productive collaborators. Pavlo observed that at CMU, student submissions for database projects saw a massive spike in lines of code once LLMs were permitted. “The coding agents are very good at building almost every part of a database — B+ trees, hash tables, buffer managers — because they can regurgitate standard implementations found in textbooks and open-source repos,” Pavlo said.

However, the “double black diamond” challenge, Pavlo said, remains the query optimizer. Unlike basic data structures, query optimizers are rarely available as clean, modular open-source references. They are often deeply entangled with the systems for which they were built. Furthermore, proving that an AI-generated transformation rule is semantically correct — meaning it produces the same result as the original query but faster — is an unsolved problem.
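A small experiment hints at why correctness is so hard to establish. The Python sketch below uses the standard-library sqlite3 module to compare a query with a plausible-looking rewrite; the tables, queries, and the `same_result` helper are assumptions made up for this illustration, not anything Pavlo presented.

```python
import sqlite3

# Original query and a rewrite that looks equivalent at first glance.
ORIGINAL = "SELECT id FROM a WHERE id NOT IN (SELECT a_id FROM b) ORDER BY id"
REWRITTEN = (
    "SELECT a.id FROM a LEFT JOIN b ON b.a_id = a.id "
    "WHERE b.a_id IS NULL ORDER BY a.id"
)


def same_result(a_rows, b_rows):
    """Load sample rows into an in-memory database and compare both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER)")
    conn.execute("CREATE TABLE b (a_id INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(v,) for v in a_rows])
    conn.executemany("INSERT INTO b VALUES (?)", [(v,) for v in b_rows])
    original = conn.execute(ORIGINAL).fetchall()
    rewritten = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return original == rewritten


# The rewrite agrees with the original on clean data...
print(same_result([1, 2, 3], [1]))        # True
# ...but a single NULL in b.a_id makes the two queries diverge.
print(same_result([1, 2, 3], [1, None]))  # False
```

Testing like this can expose a counterexample, but passing on sample data proves nothing about every possible input, which is the gap between checking a rewrite and proving it.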

Risks include hallucinations and security

The shift toward agentic database management isn’t without significant risk. Pavlo and other industry leaders, such as Percona co-founder Peter Zaitsev, warn that delegating orchestration to agents introduces massive stability and security gaps. There are already documented cases of agents being pointed at a database and accidentally dropping the entire system or leaking sensitive information because they didn’t understand the nuance of access controls, Zaitsev said.

Furthermore, LLMs are prone to producing so-called AI slop: code that is hyper-specialized to a specific query but fails to generalize. For example, if a developer uses an agent to optimize an “Extract Year” clause, the agent might build an internal data structure that breaks the moment the developer switches to “Extract Month.”
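The year-versus-month failure can be pictured with a few lines of Python. This is a hypothetical sketch of the pattern, with invented data and names, not the code from the case Pavlo described.

```python
from collections import Counter
from datetime import date

# Hypothetical order dates standing in for a table column.
ORDER_DATES = [date(2024, 1, 15), date(2024, 3, 2), date(2025, 3, 9)]

# Hyper-specialized: counts pre-aggregated by year only. The month is not
# stored, so this structure cannot answer a by-month question at all.
orders_per_year = Counter(d.year for d in ORDER_DATES)


def count_by_part(dates, part):
    """General version: group by any date attribute ('year', 'month', 'day')."""
    return Counter(getattr(d, part) for d in dates)


# The general function reproduces the specialized result...
print(count_by_part(ORDER_DATES, "year") == orders_per_year)
# ...and also handles the query the specialized structure was never built for.
print(count_by_part(ORDER_DATES, "month"))
```

The specialized structure is faster for the one query it was built around, which is exactly why it looks like a win until the workload changes.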

Automation as a collaborator, not a replacement

Despite these hurdles, Pavlo said he is optimistic about the Agent Operator model. This envisions agents handling the “3 a.m. s***’s on fire” situations — immediate performance anomalies and stability issues — while humans focus on higher-level architectural design. By using Agent Boosting techniques to bootstrap training data from previously tuned databases, the time required to optimize a system can be cut from 12 hours to under 15 minutes, Pavlo said.

In the new AI era, the goal isn’t only to have an AI that writes code, but a system that can reason about its own performance and correctness. Pavlo concludes that the database is the foundation of knowledge for any agent. “If we want autonomous systems, we must first master the unforgiving art of the autonomous database,” he says.

Chris J. Preimesberger, a contributing writer/editor at several publications since June 2021, is former editor in chief of eWEEK. He was responsible for the publication's coverage for a decade (2011-2021). In his 16 years and more than 5,000 articles at...