Welcome to the Reasoning with Machines Lab @ University of Oxford

Led by Prof. Adam Mahdi, our lab advances the science of AI evaluation, benchmarking, safety and security. Through rigorous empirical research, we study how LLMs and agentic systems reason, interact with humans and drive scientific discovery.

Research Themes

Benchmarks and Evaluation

We develop the science of LLM evaluation, setting the standard for rigorous assessment and identifying hidden risks before they cause harm.

AI Safety and Security

From bias and toxicity to agentic misalignment, we study the full spectrum of AI risk and develop the technical and governance tools to address it.

Agentic AI for Science

We build agentic systems that automate scientific knowledge synthesis and discovery, with a focus on agents that are reliable, transparent and domain-grounded.

Human-AI Interaction

We run large-scale empirical studies on how people use AI for high-stakes decisions, from healthcare and law to policy and beyond.