Workshop on Systems for Data-centric Agents with Human-in-the-loop
DASHSys
Co-located with VLDB 2026, Boston, USA
Sep 4, 2026 (Friday)

Large language models (LLMs) and autonomous agents are reshaping how data systems are designed and used. Modern AI applications increasingly operate as compound systems that integrate reasoning, retrieval, planning, and tool use over heterogeneous and continuously evolving data ecosystems. These systems must interact with structured and unstructured data, knowledge graphs, multimodal content, and external tools while balancing trade-offs in cost, latency, scalability, robustness, governance, and trust. Effectively deploying agentic AI in such environments requires a systems-driven approach grounded in data management principles and human oversight. Human-in-the-loop methodologies remain central to alignment, evaluation, debugging, and lifecycle management of evolving agent workflows. Ensuring that agents remain reliable, interpretable, and aligned with human intent is as much a data systems challenge as it is a modeling challenge. DASHSys is a full-day workshop dedicated to advancing the foundations, architectures, and evaluation of data-centric agentic systems.

Announcements

  • We are excited to announce the Workshop on Systems for Data-centric Agents with Human-in-the-loop @VLDB 2026 (DASHSys).

Keynote Speakers and Panelists

Erkang (Eric) Zhu is a Senior Staff Researcher at Alibaba, where he leads work on AI agent systems — from exploring the upper-bound capabilities of LLMs through novel applications to building production-ready sandbox runtimes and training platforms. His current work includes CoPaw, a project exploring how to build self-improving and secure personal assistants. His team's projects are open-source under AgentScope. Previously, as a Principal Researcher at Microsoft Research, Eric was a key contributor to AutoGen, a widely adopted open-source framework for building multi-agent applications that orchestrate reasoning, tool use, and human-in-the-loop collaboration. He also helped build the Microsoft Agent Framework and the Azure AI Foundry Agent Platform for deploying agentic workflows at scale. Eric's work sits at the intersection of AI agents and data systems — before working on agents, he built optimized query execution engines for pattern search in SQL at Microsoft Research, and his PhD research at the University of Toronto focused on dataset search over massive Open Data archives, contributing scalable algorithms for set similarity search and data sketches. This arc from data management foundations to agentic AI systems gives him a unique perspective on the systems-driven challenges of building reliable, scalable, and human-aligned agent architectures.


Fatma Özcan is a Principal Engineer at Systems Research@Google. Her current research focuses on GenAI and data management, vector search, platforms and infrastructure for large-scale data analysis, and natural language interfaces to data. Dr. Özcan earned her PhD in computer science from the University of Maryland, College Park, and her BSc in computer engineering from METU, Ankara. Before joining Google, she was a Distinguished Research Staff Member and a senior manager at the IBM Almaden Research Center. She has over 24 years of experience in industrial research, and has delivered core technologies into various IBM and Google products. She is the co-author of the book "Heterogeneous Agent Systems" and co-author of several conference and journal papers, as well as patents. She is an ACM Fellow, serves on the CRA board of directors, and is the co-chair of CRA-Industry. She received the VLDB Women in Database Research Award in 2022.


Omar Khattab is an Assistant Professor at MIT's Department of Electrical Engineering and Computer Science (EECS) and a PI at MIT CSAIL. Omar's research develops models, algorithms, and abstractions for building reliable and scalable AI systems. Among other efforts, he created the ColBERT retrieval model, which has helped shape the modern landscape of neural information retrieval, and the DSPy framework for building and optimizing language model programs. His lines of work on ColBERT, DSPy, GEPA, and RLMs form the basis of influential open-source projects, together downloaded millions of times per month, and have sparked applications at dozens of organizations. Omar’s Ph.D. at Stanford CS was supported by the Apple Scholars in AI/ML PhD Fellowship and his work received a Best Paper Award at SIGIR 2025.


Eugene Wu is an associate professor of computer science at Columbia University and Co-director of the new Data, Agents, and Processes Lab (DAPLab). He is broadly interested in the foundations of computing infrastructure that are needed in a future where AI agents can safely, reliably, and efficiently automate complex work. His research spans the computing stack, from visualization and HCI to core data systems. Eugene has received the VLDB 2018 10-year test of time award, best-of-conference citations at ICDE and VLDB, the NSF CAREER, and Google, Adobe, and Amazon faculty awards.


Shreya Shankar is a final-year PhD student in the Data Systems and Foundations group at UC Berkeley, advised by Dr. Aditya Parameswaran. She is broadly interested in data systems, large language models, and human-computer interaction. Her PhD has been supported by an NDSEG Fellowship and a Bridgewater Research Fellowship, and her work has been recognized with several awards, including 2025 EECS Rising Stars, a best paper honorable mention at UIST 2025, and a best paper award at CHI 2026. Before her PhD, Shreya worked as the first data/ML engineer at a startup after her undergraduate degree in CS at Stanford.


Juliana Freire is an Institute Professor at the Tandon School of Engineering and Professor of Computer Science and Data Science at New York University, where she co-directs the Visualization Imaging and Data Analysis (VIDA) Center. Her research develops methods and systems that enable a wide range of users to obtain trustworthy insights from data. It spans large-scale data analysis and integration, visualization, AI/machine learning, provenance management, and information discovery, and addresses application areas such as urban analytics, computational reproducibility, biomedical data harmonization, and AI for science. She has co-authored over 250 papers, including 12 award winners and a Test of Time Award. She served as elected chair of ACM SIGMOD and as a council member of the Computing Community Consortium (CCC), and was the NYU lead investigator for the Moore-Sloan Data Science Environment. She is a Fellow of the ACM and AAAS, and a winner of the ACM SIGMOD Contributions Award. Her work has been supported by funding agencies and industry partners, including the National Science Foundation, DARPA, ARPA-H, the Department of Energy, the National Institutes of Health, and technology companies such as Google, Amazon, Microsoft Research, and IBM. Freire received her Ph.D. and M.Sc. degrees in computer science from the State University of New York at Stony Brook, and her B.S. in computer science from the Federal University of Ceará, Brazil.



Program

Location: TBD

Coming soon...

Call for Papers

DASHSys invites original research contributions, system papers, position papers, and real-world system reports at the intersection of:

We welcome submissions that advance the theory, systems, and practice of building data-aware, agent-driven, and human-aligned AI systems.

Topics of Interest

Topics include (but are not limited to):

Submission Categories

Submissions should present original results and substantial new work not currently under review or published elsewhere. Manuscripts must be prepared following the same rules as VLDB conference papers. Papers must be submitted via the workshop's submission system in PDF format. Demo-oriented submissions are strongly encouraged. Artifact availability (code, datasets, system demos) is highly encouraged.

Evaluation Criteria and Reviewing Process

DASHSys will follow a double-anonymous review process. Each paper will be evaluated based on relevance, originality, technical quality, and clarity. Reviewers will be instructed to provide constructive feedback. Accepted papers will appear in the official VLDB workshop proceedings.


The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.

Important Dates


Organization

General Chairs

Submission Chairs

Systems Track Chairs

Keynotes, Panels, and Session Chairs

Website and Publicity Chairs

Steering Committee

Program Committee


Sponsors