Profile

Luthfi Balaka

PhD Student in Computer Science
The University of Chicago

I am a first-year PhD student in Computer Science and an Eckhardt Graduate Scholar at the University of Chicago. My research focuses on systems for data discovery and data integration. I am currently working on the Pneuma project, where I design Pneuma-Seeker, a system that represents users' information needs over tabular data as a relational data model. Because the need is expressed as an explicit data model, both the user and the system can refine it iteratively, so users can react to and guide the system as it works toward fulfilling their information needs.

Latest Updates

The Pneuma project paper was accepted at CIDR 2026! Oct 2025

I started my doctoral research at UChicago as part of the Summer Research Arrival program! Jun 2025

Pneuma was accepted at SIGMOD 2025! Jan 2025

I accepted UChicago's CS PhD offer! Jan 2025

I started my research collaboration at UChicago! Feb 2024

Publications

Demonstration of Pneuma-Seeker: Agentic System for Reifying and Fulfilling Information Needs on Tabular Data - Under Review at CAIS 2026

Luthfi Balaka, Raul Castro Fernandez

We demonstrate Pneuma-Seeker on two real-world procurement use cases.

Pneuma-Seeker: Relational Reification of Information Needs for Agentic Data Discovery and Preparation - Under Review at VLDB 2026

Luthfi Balaka, John Hillesland, Kemal Badur, Raul Castro Fernandez

We present Pneuma-Seeker, a system that represents a user’s evolving information need as a relational schema and iteratively refines it to discover, prepare, and integrate relevant data sources to compute answers.

The Pneuma Project: Reifying Information Needs as Relational Schemas to Automate Discovery, Guide Preparation, and Align Data with Intent - CIDR 2026

Luthfi Balaka, Raul Castro Fernandez

We describe our vision for the Pneuma project and introduce a preliminary version of Pneuma-Seeker, an agentic system that reifies a user's evolving information need as a relational data model, helping the user articulate and fulfill their information need through iterative interaction.

Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System - SIGMOD 2025

Luthfi Balaka, David Alexander, Qiming Wang, Yue Gong, Adila Krisnadhi, Raul Castro Fernandez

We introduce Pneuma, an LLM-powered data discovery system for tabular data. Given a natural language question, Pneuma searches an indexed collection and retrieves the tables most relevant to that question. It performs this search by leveraging both content (columns and rows) and context (metadata) to match tables with questions.

Education

The University of Chicago
Jun 2025 - Present

PhD in Computer Science

Eckhardt Graduate Scholar

University of Indonesia
Aug 2020 - Dec 2024

Bachelor of Computer Science, cum laude

Merit-based full academic scholarship recipient