<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/stylesheet.xsl" type="text/xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <atom:link rel="self" type="application/rss+xml" href="https://feeds.transistor.fm/talkrl" title="MP3 Audio"/>
    <atom:link rel="hub" href="https://pubsubhubbub.appspot.com/"/>
    <podcast:podping usesPodping="true"/>
    <title>TalkRL: The Reinforcement Learning Podcast</title>
    <generator>Transistor (https://transistor.fm)</generator>
    <itunes:new-feed-url>https://feeds.transistor.fm/talkrl</itunes:new-feed-url>
    <description>TalkRL podcast is All Reinforcement Learning, All the Time.  
In-depth interviews with brilliant people at the forefront of RL research and practice. 
Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. 
Hosted by Robin Ranjit Singh Chauhan.</description>
    <copyright>© 2026 Robin Ranjit Singh Chauhan</copyright>
    <podcast:guid>9df41ab7-ec6e-513e-ad8e-dba745580575</podcast:guid>
    <podcast:locked>yes</podcast:locked>
    <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    <podcast:trailer pubdate="Thu, 01 Aug 2019 12:00:00 -0700" url="https://media.transistor.fm/eb1eb0e8/e7088950.mp3" length="1634446" type="audio/mpeg">About TalkRL Podcast: All Reinforcement Learning, All the Time</podcast:trailer>
    <language>en</language>
    <pubDate>Sun, 19 Oct 2025 17:38:13 -0700</pubDate>
    <lastBuildDate>Fri, 06 Mar 2026 09:09:43 -0800</lastBuildDate>
    <link>https://www.talkrl.com</link>
    <image>
      <url>https://img.transistorcdn.com/JTeBOLVE8cxOHij8dslp7TQrxYBUFEBnjYRYRPw5_Ik/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9zaG93/LzIwNDcvMTcwNzk1/NDcxMS1hcnR3b3Jr/LmpwZw.jpg</url>
      <title>TalkRL: The Reinforcement Learning Podcast</title>
      <link>https://www.talkrl.com</link>
    </image>
    <itunes:category text="Technology"/>
    <itunes:type>episodic</itunes:type>
    <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
    <itunes:image href="https://img.transistorcdn.com/JTeBOLVE8cxOHij8dslp7TQrxYBUFEBnjYRYRPw5_Ik/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9zaG93/LzIwNDcvMTcwNzk1/NDcxMS1hcnR3b3Jr/LmpwZw.jpg"/>
    <itunes:summary>TalkRL podcast is All Reinforcement Learning, All the Time.  
In-depth interviews with brilliant people at the forefront of RL research and practice. 
Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. 
Hosted by Robin Ranjit Singh Chauhan.</itunes:summary>
    <itunes:subtitle>TalkRL podcast is All Reinforcement Learning, All the Time.</itunes:subtitle>
    <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
    <itunes:owner>
      <itunes:name>Robin Ranjit Singh Chauhan</itunes:name>
    </itunes:owner>
    <itunes:complete>No</itunes:complete>
    <itunes:explicit>No</itunes:explicit>
    <item>
      <title>Joseph Modayil of Openmind Research Institute @ RLC 2025</title>
      <itunes:episode>74</itunes:episode>
      <podcast:episode>74</podcast:episode>
      <itunes:title>Joseph Modayil of Openmind Research Institute @ RLC 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">7b3c2192-6171-4089-9f5d-ad4471d6de2e</guid>
      <link>https://share.transistor.fm/s/fdb5233c</link>
      <description>
        <![CDATA[<p>Joseph Modayil is the Founder, President &amp; Research Director of Openmind Research Institute.</p><p><strong>Featured References</strong></p><p><a href="https://www.openmindresearch.org/">Openmind Research Institute</a></p><p><a href="https://arxiv.org/abs/2208.11173">The Alberta Plan for AI Research</a><br>Richard S. Sutton, Michael Bowling, Patrick M. Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.co.in/citations?user=G3pvUNEAAAAJ&amp;hl=ja">Joseph Modayil on Google Scholar</a></li><li><a href="https://josephmodayil.com/">Joseph Modayil Homepage</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Joseph Modayil is the Founder, President &amp; Research Director of Openmind Research Institute.</p><p><strong>Featured References</strong></p><p><a href="https://www.openmindresearch.org/">Openmind Research Institute</a></p><p><a href="https://arxiv.org/abs/2208.11173">The Alberta Plan for AI Research</a><br>Richard S. Sutton, Michael Bowling, Patrick M. Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.co.in/citations?user=G3pvUNEAAAAJ&amp;hl=ja">Joseph Modayil on Google Scholar</a></li><li><a href="https://josephmodayil.com/">Joseph Modayil Homepage</a></li></ul>]]>
      </content:encoded>
      <pubDate>Fri, 02 Jan 2026 21:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/fdb5233c/043d21d5.mp3" length="4323412" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/uTa44dUPdpLbEbwSbuOE7LDBTLchZkUaIE7bTAdXERY/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS80YWE0/YjQwYzc1MzA1ZGNj/YjIxNjhhZTBmMzJi/OGZlNC5qcGc.jpg"/>
      <itunes:duration>267</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Joseph Modayil is the Founder, President &amp; Research Director of Openmind Research Institute.</p><p><strong>Featured References</strong></p><p><a href="https://www.openmindresearch.org/">Openmind Research Institute</a></p><p><a href="https://arxiv.org/abs/2208.11173">The Alberta Plan for AI Research</a><br>Richard S. Sutton, Michael Bowling, Patrick M. Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.co.in/citations?user=G3pvUNEAAAAJ&amp;hl=ja">Joseph Modayil on Google Scholar</a></li><li><a href="https://josephmodayil.com/">Joseph Modayil Homepage</a></li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/fdb5233c/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/fdb5233c/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/fdb5233c/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/fdb5233c/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/fdb5233c/transcription" type="text/html"/>
    </item>
    <item>
      <title>Danijar Hafner on Dreamer v4</title>
      <itunes:episode>73</itunes:episode>
      <podcast:episode>73</podcast:episode>
      <itunes:title>Danijar Hafner on Dreamer v4</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">4bd16232-285d-418a-8b85-6efe49df3e35</guid>
      <link>https://share.transistor.fm/s/e440a692</link>
      <description>
        <![CDATA[<p>Danijar Hafner was a Research Scientist at Google DeepMind until recently.</p><p><br><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2509.24527">Training Agents Inside of Scalable World Models</a> [ <a href="https://danijar.com/project/dreamer4/">blog</a> ]<br>Danijar Hafner, Wilson Yan, Timothy Lillicrap</p><p><a href="https://arxiv.org/abs/2410.12557">One Step Diffusion via Shortcut Models</a><br>Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel</p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]<br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3; Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap</li><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba</li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi</li><li><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ], Baker et al.</li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Danijar Hafner was a Research Scientist at Google DeepMind until recently.</p><p><br><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2509.24527">Training Agents Inside of Scalable World Models</a> [ <a href="https://danijar.com/project/dreamer4/">blog</a> ]<br>Danijar Hafner, Wilson Yan, Timothy Lillicrap</p><p><a href="https://arxiv.org/abs/2410.12557">One Step Diffusion via Shortcut Models</a><br>Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel</p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]<br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3; Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap</li><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba</li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi</li><li><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ], Baker et al.</li></ul>]]>
      </content:encoded>
      <pubDate>Sun, 09 Nov 2025 23:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/e440a692/bfc0657e.mp3" length="81038948" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/W7TyxYCReaPtrEYqKmeXZ2nFoOtGKNG9QUm82FbQ7vQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9jODll/MTI1MjQyNGY4MDVl/NjdmNGIwN2NmMTE0/NTE5Yi5qcGc.jpg"/>
      <itunes:duration>6052</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Danijar Hafner was a Research Scientist at Google DeepMind until recently.</p><p><br><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2509.24527">Training Agents Inside of Scalable World Models</a> [ <a href="https://danijar.com/project/dreamer4/">blog</a> ]<br>Danijar Hafner, Wilson Yan, Timothy Lillicrap</p><p><a href="https://arxiv.org/abs/2410.12557">One Step Diffusion via Shortcut Models</a><br>Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel</p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]<br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3; Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap</li><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba</li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi</li><li><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ], Baker et al.</li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/e440a692/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e440a692/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e440a692/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e440a692/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/e440a692/transcription" type="text/html"/>
    </item>
    <item>
      <title>David Abel on the Science of Agency @ RLDM 2025</title>
      <itunes:episode>72</itunes:episode>
      <podcast:episode>72</podcast:episode>
      <itunes:title>David Abel on the Science of Agency @ RLDM 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">471d1732-233a-49e1-8e08-70fba25946b4</guid>
      <link>https://share.transistor.fm/s/8fe37747</link>
      <description>
        <![CDATA[<p>David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency.  </p><p><br><strong>Featured References  </strong></p><p><br><a href="https://arxiv.org/pdf/2505.10361">Plasticity as the Mirror of Empowerment</a>  <br> David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2307.11046">A Definition of Continual RL</a>  <br> David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2502.04403">Agency is Frame-Dependent</a>  <br> David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/abs/2111.00876">On the Expressivity of Markov Reward</a>  <br> David Abel, Will Dabney, Anna Harutyunyan, Mark Ho, Michael Littman, Doina Precup, Satinder Singh — Outstanding Paper Award, NeurIPS 2021  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://ieeexplore.ieee.org/abstract/document/1091610/similar#similar">Bidirectional Communication Theory</a> — Marko 1973  </li><li><a href="https://www.isiweb.ee.ethz.ch/archive/massey_pub/pdf/BI532.pdf">Causality, Feedback and Directed Information</a> — Massey 1990  </li><li><a href="https://openreview.net/forum?id=Sv7DazuCn8">The Big World Hypothesis</a> — Javed et al. 2024  </li><li><a href="https://www.nature.com/articles/s41586-024-07711-7">Loss of plasticity in deep continual learning</a> — Dohare et al. 2024  </li><li><a href="https://david-abel.github.io/tdorl.pdf">Three Dogmas of Reinforcement Learning</a> — Abel 2024  </li><li><a href="https://pubmed.ncbi.nlm.nih.gov/39054370/">Explaining dopamine through prediction errors and beyond</a> — Gershman et al. 2024  </li><li><a href="https://scholar.google.com/citations?user=lvBJlmwAAAAJ&amp;hl=en">David Abel Google Scholar</a>  </li><li><a href="https://david-abel.github.io/">David Abel personal website</a>  </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency.  </p><p><br><strong>Featured References  </strong></p><p><br><a href="https://arxiv.org/pdf/2505.10361">Plasticity as the Mirror of Empowerment</a>  <br> David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2307.11046">A Definition of Continual RL</a>  <br> David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2502.04403">Agency is Frame-Dependent</a>  <br> David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/abs/2111.00876">On the Expressivity of Markov Reward</a>  <br> David Abel, Will Dabney, Anna Harutyunyan, Mark Ho, Michael Littman, Doina Precup, Satinder Singh — Outstanding Paper Award, NeurIPS 2021  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://ieeexplore.ieee.org/abstract/document/1091610/similar#similar">Bidirectional Communication Theory</a> — Marko 1973  </li><li><a href="https://www.isiweb.ee.ethz.ch/archive/massey_pub/pdf/BI532.pdf">Causality, Feedback and Directed Information</a> — Massey 1990  </li><li><a href="https://openreview.net/forum?id=Sv7DazuCn8">The Big World Hypothesis</a> — Javed et al. 2024  </li><li><a href="https://www.nature.com/articles/s41586-024-07711-7">Loss of plasticity in deep continual learning</a> — Dohare et al. 2024  </li><li><a href="https://david-abel.github.io/tdorl.pdf">Three Dogmas of Reinforcement Learning</a> — Abel 2024  </li><li><a href="https://pubmed.ncbi.nlm.nih.gov/39054370/">Explaining dopamine through prediction errors and beyond</a> — Gershman et al. 2024  </li><li><a href="https://scholar.google.com/citations?user=lvBJlmwAAAAJ&amp;hl=en">David Abel Google Scholar</a>  </li><li><a href="https://david-abel.github.io/">David Abel personal website</a>  </li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 08 Sep 2025 10:34:40 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/8fe37747/d9dc25af.mp3" length="57349969" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/tRmL_xgpMx38zGGhUKeJFxdqLrZ9bcKPucXHN4Gdsmg/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xMGQz/NGRkYzdkYzExOWUy/NGNmNDg1NDUxMDg0/MjBlYi5qcGVn.jpg"/>
      <itunes:duration>3582</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency.  </p><p><br><strong>Featured References  </strong></p><p><br><a href="https://arxiv.org/pdf/2505.10361">Plasticity as the Mirror of Empowerment</a>  <br> David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2307.11046">A Definition of Continual RL</a>  <br> David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh  </p><p><br><a href="https://arxiv.org/pdf/2502.04403">Agency is Frame-Dependent</a>  <br> David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh  </p><p><br><a href="https://arxiv.org/abs/2111.00876">On the Expressivity of Markov Reward</a>  <br> David Abel, Will Dabney, Anna Harutyunyan, Mark Ho, Michael Littman, Doina Precup, Satinder Singh — Outstanding Paper Award, NeurIPS 2021  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://ieeexplore.ieee.org/abstract/document/1091610/similar#similar">Bidirectional Communication Theory</a> — Marko 1973  </li><li><a href="https://www.isiweb.ee.ethz.ch/archive/massey_pub/pdf/BI532.pdf">Causality, Feedback and Directed Information</a> — Massey 1990  </li><li><a href="https://openreview.net/forum?id=Sv7DazuCn8">The Big World Hypothesis</a> — Javed et al. 2024  </li><li><a href="https://www.nature.com/articles/s41586-024-07711-7">Loss of plasticity in deep continual learning</a> — Dohare et al. 2024  </li><li><a href="https://david-abel.github.io/tdorl.pdf">Three Dogmas of Reinforcement Learning</a> — Abel 2024  </li><li><a href="https://pubmed.ncbi.nlm.nih.gov/39054370/">Explaining dopamine through prediction errors and beyond</a> — Gershman et al. 2024  </li><li><a href="https://scholar.google.com/citations?user=lvBJlmwAAAAJ&amp;hl=en">David Abel Google Scholar</a>  </li><li><a href="https://david-abel.github.io/">David Abel personal website</a>  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/8fe37747/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/8fe37747/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/8fe37747/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/8fe37747/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/8fe37747/transcription" type="text/html"/>
    </item>
    <item>
      <title>Jake Beck, Alex Goldie, &amp; Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025</title>
      <itunes:episode>71</itunes:episode>
      <podcast:episode>71</podcast:episode>
      <itunes:title>Jake Beck, Alex Goldie, &amp; Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cf851a6b-aba3-4762-9457-585dbde39eec</guid>
      <link>https://share.transistor.fm/s/bfaf0f4b</link>
      <description>
        <![CDATA[<p>Recorded at the Reinforcement Learning Conference 2025 at the University of Alberta in Edmonton, Alberta, Canada.</p><p><strong>Featured References<br></strong><br><a href="https://www.youtube.com/live/XqYTQfQeMrE?t=22620s">Lecture on the OaK Architecture</a>, Rich Sutton</p><p><a href="http://www.incompleteideas.net/Talks/AlbertaPlan.pdf">Alberta Plan</a>, Rich Sutton with Mike Bowling and Patrick Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.ca/citations?user=PrS_dHMAAAAJ&amp;hl=en&amp;oi=sra">Jacob Beck</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=wogOjBsAAAAJ&amp;hl=en">Alex Goldie</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=Fh-XpPkAAAAJ&amp;hl=de">Cornelius Braun</a> on Google Scholar</li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Recorded at the Reinforcement Learning Conference 2025 at the University of Alberta in Edmonton, Alberta, Canada.</p><p><strong>Featured References<br></strong><br><a href="https://www.youtube.com/live/XqYTQfQeMrE?t=22620s">Lecture on the OaK Architecture</a>, Rich Sutton</p><p><a href="http://www.incompleteideas.net/Talks/AlbertaPlan.pdf">Alberta Plan</a>, Rich Sutton with Mike Bowling and Patrick Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.ca/citations?user=PrS_dHMAAAAJ&amp;hl=en&amp;oi=sra">Jacob Beck</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=wogOjBsAAAAJ&amp;hl=en">Alex Goldie</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=Fh-XpPkAAAAJ&amp;hl=de">Cornelius Braun</a> on Google Scholar</li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a></li></ul>]]>
      </content:encoded>
      <pubDate>Tue, 19 Aug 2025 00:24:47 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/bfaf0f4b/2fbeab26.mp3" length="11878711" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/QiLjtOmEWwyJY3823RvVwjxlGXzd6Csz3hosecuc66c/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9hOTk5/NWFiYzA0NzQzYTk1/MTE2YzhlNjEwZmQw/MWNmMy5qcGc.jpg"/>
      <itunes:duration>740</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Recorded at the Reinforcement Learning Conference 2025 at the University of Alberta in Edmonton, Alberta, Canada.</p><p><strong>Featured References<br></strong><br><a href="https://www.youtube.com/live/XqYTQfQeMrE?t=22620s">Lecture on the OaK Architecture</a>, Rich Sutton</p><p><a href="http://www.incompleteideas.net/Talks/AlbertaPlan.pdf">Alberta Plan</a>, Rich Sutton with Mike Bowling and Patrick Pilarski</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.ca/citations?user=PrS_dHMAAAAJ&amp;hl=en&amp;oi=sra">Jacob Beck</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=wogOjBsAAAAJ&amp;hl=en">Alex Goldie</a> on Google Scholar</li><li><a href="https://scholar.google.com/citations?user=Fh-XpPkAAAAJ&amp;hl=de">Cornelius Braun</a> on Google Scholar</li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a></li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/bfaf0f4b/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bfaf0f4b/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bfaf0f4b/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bfaf0f4b/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/bfaf0f4b/transcription" type="text/html"/>
    </item>
    <item>
      <title>Outstanding Paper Award Winners - 2/2 @ RLC 2025</title>
      <itunes:episode>70</itunes:episode>
      <podcast:episode>70</podcast:episode>
      <itunes:title>Outstanding Paper Award Winners - 2/2 @ RLC 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d3273e74-11b7-42a2-b499-586d53184af2</guid>
      <link>https://share.transistor.fm/s/d7b2c262</link>
      <description>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Empirical Reinforcement Learning Research<br></em><a href="https://openreview.net/forum?id=aeY0CAOnca">Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions</a><br>Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim</p><p><em>Applications of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=x00VCsuHAb">WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies</a><br>William Solow, Sandhya Saisubramanian, Alan Fern</p><p><em>Emerging Topics in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=XZBYLXNGjT">Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners</a><br>Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor</p><p><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=eBWwBIFV7T#discussion">Multi-Task Reinforcement Learning Enables Parameter Scaling</a><br>Reginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang, Nariman Farsad, Pablo Samuel Castro</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Empirical Reinforcement Learning Research<br></em><a href="https://openreview.net/forum?id=aeY0CAOnca">Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions</a><br>Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim</p><p><em>Applications of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=x00VCsuHAb">WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies</a><br>William Solow, Sandhya Saisubramanian, Alan Fern</p><p><em>Emerging Topics in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=XZBYLXNGjT">Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners</a><br>Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor</p><p><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=eBWwBIFV7T#discussion">Multi-Task Reinforcement Learning Enables Parameter Scaling</a><br>Reginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang, Nariman Farsad, Pablo Samuel Castro</p>]]>
      </content:encoded>
      <pubDate>Sun, 17 Aug 2025 23:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/d7b2c262/c4f8c609.mp3" length="13776420" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/P2ZO9ByR22E87EK7anQhWhetIDQRAT5Y5HyxnbkxTck/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lMzkx/ODEwZmM1YjQ3MGIz/OGJhMDQ3MTE3OGE2/MTgxZi53ZWJw.jpg"/>
      <itunes:duration>858</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Empirical Reinforcement Learning Research<br></em><a href="https://openreview.net/forum?id=aeY0CAOnca">Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions</a><br>Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim</p><p><em>Applications of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=x00VCsuHAb">WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies</a><br>William Solow, Sandhya Saisubramanian, Alan Fern</p><p><em>Emerging Topics in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=XZBYLXNGjT">Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners</a><br>Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor</p><p><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=eBWwBIFV7T#discussion">Multi-Task Reinforcement Learning Enables Parameter Scaling</a><br>Reginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang, Nariman Farsad, Pablo Samuel Castro</p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/d7b2c262/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d7b2c262/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d7b2c262/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d7b2c262/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/d7b2c262/transcription" type="text/html"/>
    </item>
    <item>
      <title>Outstanding Paper Award Winners - 1/2 @ RLC 2025</title>
      <itunes:episode>69</itunes:episode>
      <podcast:episode>69</podcast:episode>
      <itunes:title>Outstanding Paper Award Winners - 1/2 @ RLC 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">9c7b6740-13d8-4242-9e3d-0e767620f2f1</guid>
      <link>https://share.transistor.fm/s/4e5cf3a1</link>
      <description>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=jKzQ6af2DU#discussion">How Should We Meta-Learn Reinforcement Learning Algorithms?</a><br>Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson</p><p><em>Tooling, Environments, and Evaluation for Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=0LFJnnMKeT">Syllabus: Portable Curricula for Reinforcement Learning Agents</a><br>Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerson</p><p><em>Resourcefulness in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=qRyteMTgn0#discussion">PufferLib 2.0: Reinforcement Learning at 1M steps/s</a><br>Joseph Suarez</p><p><em>Theory of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=LZAafvwVMa">Deep Reinforcement Learning with Gradient Eligibility Traces</a><br>Esraa Elelimy, Brett Daley, Andrew Patterson, Marlos C. Machado, Adam White, Martha White</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=jKzQ6af2DU#discussion">How Should We Meta-Learn Reinforcement Learning Algorithms?</a><br>Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson</p><p><em>Tooling, Environments, and Evaluation for Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=0LFJnnMKeT">Syllabus: Portable Curricula for Reinforcement Learning Agents</a><br>Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerson</p><p><em>Resourcefulness in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=qRyteMTgn0#discussion">PufferLib 2.0: Reinforcement Learning at 1M steps/s</a><br>Joseph Suarez</p><p><em>Theory of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=LZAafvwVMa">Deep Reinforcement Learning with Gradient Eligibility Traces</a><br>Esraa Elelimy, Brett Daley, Andrew Patterson, Marlos C. Machado, Adam White, Martha White</p>]]>
      </content:encoded>
      <pubDate>Fri, 15 Aug 2025 12:13:19 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/4e5cf3a1/144486fa.mp3" length="6548300" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/S105Lg5ztU1fGJcjgqgGMPDNEamspZabYlWfh4tulOU/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9kNDQy/YzNlODEyODZlNTAw/NTg5MGFjZjIwOTlk/NGMyMS53ZWJw.jpg"/>
      <itunes:duration>406</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>We caught up with the <a href="https://rl-conference.cc/RLC2025Awards.html">RLC Outstanding Paper award winners</a> for your listening pleasure.</p><p>Recorded on location at the <a href="https://rl-conference.cc/">Reinforcement Learning Conference 2025</a> at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.</p><p><strong>Featured References<br></strong><br><em>Scientific Understanding in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=jKzQ6af2DU#discussion">How Should We Meta-Learn Reinforcement Learning Algorithms?</a><br>Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson</p><p><em>Tooling, Environments, and Evaluation for Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=0LFJnnMKeT">Syllabus: Portable Curricula for Reinforcement Learning Agents</a><br>Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerson</p><p><em>Resourcefulness in Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=qRyteMTgn0#discussion">PufferLib 2.0: Reinforcement Learning at 1M steps/s</a><br>Joseph Suarez</p><p><em>Theory of Reinforcement Learning<br></em><a href="https://openreview.net/forum?id=LZAafvwVMa">Deep Reinforcement Learning with Gradient Eligibility Traces</a><br>Esraa Elelimy, Brett Daley, Andrew Patterson, Marlos C. Machado, Adam White, Martha White</p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/4e5cf3a1/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/4e5cf3a1/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/4e5cf3a1/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/4e5cf3a1/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/4e5cf3a1/transcription" type="text/html"/>
    </item>
    <item>
      <title>Thomas Akam on Model-based RL in the Brain</title>
      <itunes:episode>68</itunes:episode>
      <podcast:episode>68</podcast:episode>
      <itunes:title>Thomas Akam on Model-based RL in the Brain</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">47c019af-80c9-4c64-b19a-95103a9094c8</guid>
      <link>https://share.transistor.fm/s/e104ec35</link>
      <description>
        <![CDATA[<p>Prof. <a href="https://www.psy.ox.ac.uk/people/thomas-akam">Thomas Akam</a> is a neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the <a href="https://www.psy.ox.ac.uk/research/cognitive-circuits">Cognitive Circuits research group</a>.</p><p><strong>Featured References</strong></p><p><a href="https://github.com/ThomasAkam/talks/blob/main/2025-06-11_RLDM_tutorial.pdf">Brain Architecture for Adaptive Behaviour</a><br>Thomas Akam, RLDM 2025 Tutorial</p><p><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.com/citations?user=b809FRsAAAAJ&amp;hl=en">Thomas Akam on Google Scholar</a></li><li><a href="https://github.com/pyPhotometry">pyPhotometry</a>: open-source, Python-based fiber photometry data acquisition</li><li><a href="https://github.com/pyControl">pyControl</a>: open-source, Python-based behavioural experiment control</li><li><a href="https://pubmed.ncbi.nlm.nih.gov/16286932/">Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control</a>, Nathaniel D Daw, Yael Niv, Peter Dayan, 2005</li><li><a href="https://psycnet.apa.org/record/1969-04876-001">Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H. M.</a>, Milner, B., Corkin, S., &amp; Teuber, H. L., 1968</li><li><a href="https://www.science.org/doi/abs/10.1126/science.1159775">Internally generated cell assembly sequences in the rat hippocampus</a>, Pastalkova E, Itskov V, Amarasingham A, Buzsáki G. Science. 2008</li><li><a href="https://rldm.org/">Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2025</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Prof. <a href="https://www.psy.ox.ac.uk/people/thomas-akam">Thomas Akam</a> is a neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the <a href="https://www.psy.ox.ac.uk/research/cognitive-circuits">Cognitive Circuits research group</a>.</p><p><strong>Featured References</strong></p><p><a href="https://github.com/ThomasAkam/talks/blob/main/2025-06-11_RLDM_tutorial.pdf">Brain Architecture for Adaptive Behaviour</a><br>Thomas Akam, RLDM 2025 Tutorial</p><p><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.com/citations?user=b809FRsAAAAJ&amp;hl=en">Thomas Akam on Google Scholar</a></li><li><a href="https://github.com/pyPhotometry">pyPhotometry</a>: open-source, Python-based fiber photometry data acquisition</li><li><a href="https://github.com/pyControl">pyControl</a>: open-source, Python-based behavioural experiment control</li><li><a href="https://pubmed.ncbi.nlm.nih.gov/16286932/">Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control</a>, Nathaniel D Daw, Yael Niv, Peter Dayan, 2005</li><li><a href="https://psycnet.apa.org/record/1969-04876-001">Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H. M.</a>, Milner, B., Corkin, S., &amp; Teuber, H. L., 1968</li><li><a href="https://www.science.org/doi/abs/10.1126/science.1159775">Internally generated cell assembly sequences in the rat hippocampus</a>, Pastalkova E, Itskov V, Amarasingham A, Buzsáki G. Science. 2008</li><li><a href="https://rldm.org/">Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2025</a></li></ul>]]>
      </content:encoded>
      <pubDate>Sun, 03 Aug 2025 23:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/e104ec35/59430419.mp3" length="50070300" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/WjuOIT7wH7PmqLAjUj8mpVMSVDJ1S-MqG5UwpC95sTw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS85MTg0/OTRiZjY1NTM1MDRk/YzU1MTMwNjdlYzVm/MmI4Mi53ZWJw.jpg"/>
      <itunes:duration>3126</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Prof. <a href="https://www.psy.ox.ac.uk/people/thomas-akam">Thomas Akam</a> is a neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the <a href="https://www.psy.ox.ac.uk/research/cognitive-circuits">Cognitive Circuits research group</a>.</p><p><strong>Featured References</strong></p><p><a href="https://github.com/ThomasAkam/talks/blob/main/2025-06-11_RLDM_tutorial.pdf">Brain Architecture for Adaptive Behaviour</a><br>Thomas Akam, RLDM 2025 Tutorial</p><p><strong>Additional References</strong></p><ul><li><a href="https://scholar.google.com/citations?user=b809FRsAAAAJ&amp;hl=en">Thomas Akam on Google Scholar</a></li><li><a href="https://github.com/pyPhotometry">pyPhotometry</a>: open-source, Python-based fiber photometry data acquisition</li><li><a href="https://github.com/pyControl">pyControl</a>: open-source, Python-based behavioural experiment control</li><li><a href="https://pubmed.ncbi.nlm.nih.gov/16286932/">Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control</a>, Nathaniel D Daw, Yael Niv, Peter Dayan, 2005</li><li><a href="https://psycnet.apa.org/record/1969-04876-001">Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H. M.</a>, Milner, B., Corkin, S., &amp; Teuber, H. L., 1968</li><li><a href="https://www.science.org/doi/abs/10.1126/science.1159775">Internally generated cell assembly sequences in the rat hippocampus</a>, Pastalkova E, Itskov V, Amarasingham A, Buzsáki G. Science. 2008</li><li><a href="https://rldm.org/">Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2025</a></li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/e104ec35/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e104ec35/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e104ec35/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e104ec35/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/e104ec35/transcription" type="text/html"/>
    </item>
    <item>
      <title>Stefano Albrecht on Multi-Agent RL @ RLDM 2025</title>
      <itunes:episode>67</itunes:episode>
      <podcast:episode>67</podcast:episode>
      <itunes:title>Stefano Albrecht on Multi-Agent RL @ RLDM 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cbb7cd5f-6010-45d8-b18d-f0d130ec0c5e</guid>
      <link>https://share.transistor.fm/s/c24331ed</link>
      <description>
        <![CDATA[<p><a href="https://agents-lab.org/stefano-albrecht/">Stefano V. Albrecht</a> was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at the startup <a href="https://www.deepflow.com/">Deepflow</a>. He is a Program Chair of <a href="https://rldm.org/">RLDM 2025</a> and is co-author of the MIT Press textbook "<a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a>".</p><p><strong>Featured References</strong></p><p><a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a><br>Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer<br>MIT Press, 2024</p><p><a href="https://rldm.org/">RLDM 2025: Reinforcement Learning and Decision Making Conference</a><br>Dublin, Ireland</p><p><a href="https://github.com/uoe-agents/epymarl">EPyMARL: Extended Python MARL framework</a></p><p><a href="http://arxiv.org/abs/2006.07869">Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks</a><br>Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://agents-lab.org/stefano-albrecht/">Stefano V. Albrecht</a> was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at the startup <a href="https://www.deepflow.com/">Deepflow</a>. He is a Program Chair of <a href="https://rldm.org/">RLDM 2025</a> and is co-author of the MIT Press textbook "<a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a>".</p><p><strong>Featured References</strong></p><p><a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a><br>Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer<br>MIT Press, 2024</p><p><a href="https://rldm.org/">RLDM 2025: Reinforcement Learning and Decision Making Conference</a><br>Dublin, Ireland</p><p><a href="https://github.com/uoe-agents/epymarl">EPyMARL: Extended Python MARL framework</a></p><p><a href="http://arxiv.org/abs/2006.07869">Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks</a><br>Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht</p>]]>
      </content:encoded>
      <pubDate>Tue, 22 Jul 2025 14:29:54 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/c24331ed/15c80345.mp3" length="30344159" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/bGjH1_L9l2Hyb79R6z-a-SlB121LS20JAtbojb0OmAA/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS84MzY0/NmExNzBiNWQwMmMw/NzU0ZGJhMjAzNjZi/ZDAzZS5qcGc.jpg"/>
      <itunes:duration>1894</itunes:duration>
      <itunes:summary>
        <![CDATA[<p><a href="https://agents-lab.org/stefano-albrecht/">Stefano V. Albrecht</a> was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at the startup <a href="https://www.deepflow.com/">Deepflow</a>. He is a Program Chair of <a href="https://rldm.org/">RLDM 2025</a> and is co-author of the MIT Press textbook "<a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a>".</p><p><strong>Featured References</strong></p><p><a href="https://marl-book.com/">Multi-Agent Reinforcement Learning: Foundations and Modern Approaches</a><br>Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer<br>MIT Press, 2024</p><p><a href="https://rldm.org/">RLDM 2025: Reinforcement Learning and Decision Making Conference</a><br>Dublin, Ireland</p><p><a href="https://github.com/uoe-agents/epymarl">EPyMARL: Extended Python MARL framework</a></p><p><a href="http://arxiv.org/abs/2006.07869">Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks</a><br>Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht</p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/c24331ed/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c24331ed/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c24331ed/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c24331ed/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/c24331ed/transcription" type="text/html"/>
    </item>
    <item>
      <title>Satinder Singh: The Origin Story of RLDM @ RLDM 2025</title>
      <itunes:episode>66</itunes:episode>
      <podcast:episode>66</podcast:episode>
      <itunes:title>Satinder Singh: The Origin Story of RLDM @ RLDM 2025</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">9acfec53-991b-4861-80d3-e652c46f1176</guid>
      <link>https://share.transistor.fm/s/81c19aca</link>
      <description>
<![CDATA[<p>Professor Satinder Singh of Google DeepMind and U of Michigan is a co-founder of RLDM.  Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference).</p><p>Recorded on location at Trinity College Dublin, Ireland, during RLDM 2025.</p><p><strong>Featured References<br></strong><br><a href="https://rldm.org/">RLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)</a><br>June 11-14, 2025 at Trinity College Dublin, Ireland</p><p><a href="https://scholar.google.com/citations?user=8RgDBoEAAAAJ&amp;hl=en">Satinder Singh</a> on Google Scholar</p>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Professor Satinder Singh of Google DeepMind and U of Michigan is a co-founder of RLDM.  Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference).</p><p>Recorded on location at Trinity College Dublin, Ireland, during RLDM 2025.</p><p><strong>Featured References<br></strong><br><a href="https://rldm.org/">RLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)</a><br>June 11-14, 2025 at Trinity College Dublin, Ireland</p><p><a href="https://scholar.google.com/citations?user=8RgDBoEAAAAJ&amp;hl=en">Satinder Singh</a> on Google Scholar</p>]]>
      </content:encoded>
      <pubDate>Wed, 25 Jun 2025 08:48:11 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/81c19aca/2c71c89b.mp3" length="5745881" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/XtLBOGHUAHecc6sfW2tl8BTf2PpJaMG6GFzMc7T01N4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS83MjM2/MmQ3YjcxMWIwOWI5/ODIxOTgwOTM4M2Vj/NGM2Mi5qcGc.jpg"/>
      <itunes:duration>357</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Professor Satinder Singh of Google DeepMind and U of Michigan is a co-founder of RLDM.  Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference).</p><p>Recorded on location at Trinity College Dublin, Ireland, during RLDM 2025.</p><p><strong>Featured References<br></strong><br><a href="https://rldm.org/">RLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)</a><br>June 11-14, 2025 at Trinity College Dublin, Ireland</p><p><a href="https://scholar.google.com/citations?user=8RgDBoEAAAAJ&amp;hl=en">Satinder Singh</a> on Google Scholar</p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/81c19aca/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/81c19aca/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/81c19aca/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/81c19aca/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/81c19aca/transcription" type="text/html"/>
    </item>
    <item>
      <title>NeurIPS 2024 - Posters and Hallways 3</title>
      <itunes:episode>65</itunes:episode>
      <podcast:episode>65</podcast:episode>
      <itunes:title>NeurIPS 2024 - Posters and Hallways 3</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">7c5b05d2-68c5-4393-8620-c891f10a12b9</guid>
      <link>https://share.transistor.fm/s/255161e2</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Claire Bizon Monroc from Inria: <a href="https://papers.nips.cc/paper_files/paper/2024/hash/f0a4a0ecdc29a0087c0848948e2fce81-Abstract-Datasets_and_Benchmarks_Track.html">WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control</a>  </li><li>Andrew Wagenmaker from UC Berkeley: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/8fa068ffe59817175d176bd75641fe16-Abstract-Conference.html">Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL</a>  </li><li>Harley Wiltzer from MILA: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/b76bec34ef5e0c0ceedff6edfbefc9f5-Abstract-Conference.html">Foundations of Multivariate Distributional Reinforcement Learning</a>  </li><li>Vinzenz Thoma from ETH AI Center: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/e66309ead63bc1410d2df261a28f602d-Abstract-Conference.html">Contextual Bilevel Reinforcement Learning for Incentive Alignment</a>  </li><li>Haozhe (Tony) Chen &amp; Ang (Leon) Li from Columbia: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/a7f67788f7b4d77fa7cd6887de3dcbe7-Abstract-Datasets_and_Benchmarks_Track.html">QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers</a>  </li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Claire Bizon Monroc from Inria: <a href="https://papers.nips.cc/paper_files/paper/2024/hash/f0a4a0ecdc29a0087c0848948e2fce81-Abstract-Datasets_and_Benchmarks_Track.html">WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control</a>  </li><li>Andrew Wagenmaker from UC Berkeley: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/8fa068ffe59817175d176bd75641fe16-Abstract-Conference.html">Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL</a>  </li><li>Harley Wiltzer from MILA: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/b76bec34ef5e0c0ceedff6edfbefc9f5-Abstract-Conference.html">Foundations of Multivariate Distributional Reinforcement Learning</a>  </li><li>Vinzenz Thoma from ETH AI Center: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/e66309ead63bc1410d2df261a28f602d-Abstract-Conference.html">Contextual Bilevel Reinforcement Learning for Incentive Alignment</a>  </li><li>Haozhe (Tony) Chen &amp; Ang (Leon) Li from Columbia: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/a7f67788f7b4d77fa7cd6887de3dcbe7-Abstract-Datasets_and_Benchmarks_Track.html">QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers</a>  </li></ul>]]>
      </content:encoded>
      <pubDate>Sun, 09 Mar 2025 14:25:53 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/255161e2/93762dc6.mp3" length="9716741" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/cD7otNzkeH82X-0qnRf2RpQgKZoZajOnpOXOEULqcV4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS85MDQz/MTRmOTVmMDdiMmE4/N2M4NDVjZjhiM2I2/MmIyMi53ZWJw.jpg"/>
      <itunes:duration>601</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Claire Bizon Monroc from Inria: <a href="https://papers.nips.cc/paper_files/paper/2024/hash/f0a4a0ecdc29a0087c0848948e2fce81-Abstract-Datasets_and_Benchmarks_Track.html">WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control</a>  </li><li>Andrew Wagenmaker from UC Berkeley: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/8fa068ffe59817175d176bd75641fe16-Abstract-Conference.html">Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL</a>  </li><li>Harley Wiltzer from MILA: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/b76bec34ef5e0c0ceedff6edfbefc9f5-Abstract-Conference.html">Foundations of Multivariate Distributional Reinforcement Learning</a>  </li><li>Vinzenz Thoma from ETH AI Center: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/e66309ead63bc1410d2df261a28f602d-Abstract-Conference.html">Contextual Bilevel Reinforcement Learning for Incentive Alignment</a>  </li><li>Haozhe (Tony) Chen &amp; Ang (Leon) Li from Columbia: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/hash/a7f67788f7b4d77fa7cd6887de3dcbe7-Abstract-Datasets_and_Benchmarks_Track.html">QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers</a>  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/255161e2/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/255161e2/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/255161e2/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/255161e2/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/255161e2/transcription" type="text/html"/>
    </item>
    <item>
      <title>NeurIPS 2024 - Posters and Hallways 2</title>
      <itunes:episode>64</itunes:episode>
      <podcast:episode>64</podcast:episode>
      <itunes:title>NeurIPS 2024 - Posters and Hallways 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cdb9a0f2-ff33-4f0a-a01e-984809f2f9e5</guid>
      <link>https://share.transistor.fm/s/ce850fa1</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jonathan Cook from University of Oxford: <a href="https://arxiv.org/abs/2406.00392">Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning</a>  </li><li>Yifei Zhou from Berkeley AI Research: <a href="https://arxiv.org/abs/2406.11896">DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning</a>  </li><li>Rory Young from University of Glasgow: <a href="https://arxiv.org/abs/2410.10674">Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach</a>  </li><li>Glen Berseth from MILA: <a href="https://arxiv.org/abs/2409.04792">Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn</a>  </li><li>Alexander Rutherford from University of Oxford: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/file/5aee125f052c90e326dcf6f380df94f6-Paper-Datasets_and_Benchmarks_Track.pdf">JaxMARL: Multi-Agent RL Environments and Algorithms in JAX</a>  </li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jonathan Cook from University of Oxford: <a href="https://arxiv.org/abs/2406.00392">Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning</a>  </li><li>Yifei Zhou from Berkeley AI Research: <a href="https://arxiv.org/abs/2406.11896">DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning</a>  </li><li>Rory Young from University of Glasgow: <a href="https://arxiv.org/abs/2410.10674">Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach</a>  </li><li>Glen Berseth from MILA: <a href="https://arxiv.org/abs/2409.04792">Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn</a>  </li><li>Alexander Rutherford from University of Oxford: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/file/5aee125f052c90e326dcf6f380df94f6-Paper-Datasets_and_Benchmarks_Track.pdf">JaxMARL: Multi-Agent RL Environments and Algorithms in JAX</a>  </li></ul>]]>
      </content:encoded>
      <pubDate>Tue, 04 Mar 2025 16:03:16 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/ce850fa1/38ea6fa4.mp3" length="8573844" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/3L53zQuWmiUPN6cnRlmAUykhcf4njhD4JkgkF9RkY-8/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8wMmE1/ZDEyZDY2NmU3Njc0/NjA2NTMzMDI5NTky/MjY1YS53ZWJw.jpg"/>
      <itunes:duration>528</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jonathan Cook from University of Oxford: <a href="https://arxiv.org/abs/2406.00392">Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning</a>  </li><li>Yifei Zhou from Berkeley AI Research: <a href="https://arxiv.org/abs/2406.11896">DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning</a>  </li><li>Rory Young from University of Glasgow: <a href="https://arxiv.org/abs/2410.10674">Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach</a>  </li><li>Glen Berseth from MILA: <a href="https://arxiv.org/abs/2409.04792">Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn</a>  </li><li>Alexander Rutherford from University of Oxford: <a href="https://proceedings.neurips.cc/paper_files/paper/2024/file/5aee125f052c90e326dcf6f380df94f6-Paper-Datasets_and_Benchmarks_Track.pdf">JaxMARL: Multi-Agent RL Environments and Algorithms in JAX</a>  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/ce850fa1/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/ce850fa1/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/ce850fa1/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/ce850fa1/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/ce850fa1/transcription" type="text/html"/>
    </item>
    <item>
      <title>NeurIPS 2024 - Posters and Hallways 1</title>
      <itunes:episode>63</itunes:episode>
      <podcast:episode>63</podcast:episode>
      <itunes:title>NeurIPS 2024 - Posters and Hallways 1</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">5ac13395-4832-4111-b09e-00f589c21ed9</guid>
      <link>https://share.transistor.fm/s/2ee1f287</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jiaheng Hu of University of Texas: <a href="https://openreview.net/forum?id=ePOBcWfNFC&amp;referrer=%5Bthe%20profile%20of%20Roberto%20Mart%C3%ADn-Mart%C3%ADn%5D(%2Fprofile%3Fid%3D~Roberto_Mart%C3%ADn-Mart%C3%ADn1)">Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning</a>  </li><li>Skander Moalla of EPFL: <a href="https://arxiv.org/abs/2405.00662">No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO</a>  </li><li>Adil Zouitine of IRT Saint Exupery/Hugging Face: <a href="https://arxiv.org/abs/2406.08395">Time-Constrained Robust MDPs</a>  </li><li>Soumyendu Sarkar of HP Labs: <a href="https://arxiv.org/abs/2408.07841">SustainDC: Benchmarking for Sustainable Data Center Control</a>  </li><li>Matteo Bettini of Cambridge University: <a href="https://arxiv.org/abs/2312.01472">BenchMARL: Benchmarking Multi-Agent Reinforcement Learning</a>  </li><li>Michael Bowling of U Alberta: <a href="https://openreview.net/forum?id=k6ZHvF1vkg">Beyond Optimism: Exploration With Partially Observable Rewards</a>  </li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jiaheng Hu of University of Texas: <a href="https://openreview.net/forum?id=ePOBcWfNFC&amp;referrer=%5Bthe%20profile%20of%20Roberto%20Mart%C3%ADn-Mart%C3%ADn%5D(%2Fprofile%3Fid%3D~Roberto_Mart%C3%ADn-Mart%C3%ADn1)">Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning</a>  </li><li>Skander Moalla of EPFL: <a href="https://arxiv.org/abs/2405.00662">No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO</a>  </li><li>Adil Zouitine of IRT Saint Exupery/Hugging Face: <a href="https://arxiv.org/abs/2406.08395">Time-Constrained Robust MDPs</a>  </li><li>Soumyendu Sarkar of HP Labs: <a href="https://arxiv.org/abs/2408.07841">SustainDC: Benchmarking for Sustainable Data Center Control</a>  </li><li>Matteo Bettini of Cambridge University: <a href="https://arxiv.org/abs/2312.01472">BenchMARL: Benchmarking Multi-Agent Reinforcement Learning</a>  </li><li>Michael Bowling of U Alberta: <a href="https://openreview.net/forum?id=k6ZHvF1vkg">Beyond Optimism: Exploration With Partially Observable Rewards</a>  </li></ul>]]>
      </content:encoded>
      <pubDate>Sun, 02 Mar 2025 20:53:38 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/2ee1f287/388cfa92.mp3" length="9282702" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/7Nk3ubNy5AB4HQ9FFfHS2pYrU8QSZcBH_pnOrxo75Ig/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lZWM1/M2RjNjY4NmIwNzA0/OGQ0MDI2NDUwYjc5/NzYwZC53ZWJw.jpg"/>
      <itunes:duration>572</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at NeurIPS 2024 in Vancouver, BC, Canada.   </p><p><strong>Featuring</strong>  </p><ul><li>Jiaheng Hu of University of Texas: <a href="https://openreview.net/forum?id=ePOBcWfNFC&amp;referrer=%5Bthe%20profile%20of%20Roberto%20Mart%C3%ADn-Mart%C3%ADn%5D(%2Fprofile%3Fid%3D~Roberto_Mart%C3%ADn-Mart%C3%ADn1)">Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning</a>  </li><li>Skander Moalla of EPFL: <a href="https://arxiv.org/abs/2405.00662">No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO</a>  </li><li>Adil Zouitine of IRT Saint Exupery/Hugging Face: <a href="https://arxiv.org/abs/2406.08395">Time-Constrained Robust MDPs</a>  </li><li>Soumyendu Sarkar of HP Labs: <a href="https://arxiv.org/abs/2408.07841">SustainDC: Benchmarking for Sustainable Data Center Control</a>  </li><li>Matteo Bettini of Cambridge University: <a href="https://arxiv.org/abs/2312.01472">BenchMARL: Benchmarking Multi-Agent Reinforcement Learning</a>  </li><li>Michael Bowling of U Alberta: <a href="https://openreview.net/forum?id=k6ZHvF1vkg">Beyond Optimism: Exploration With Partially Observable Rewards</a>  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/2ee1f287/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2ee1f287/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2ee1f287/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2ee1f287/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/2ee1f287/transcription" type="text/html"/>
    </item>
    <item>
      <title>Abhishek Naik on Continuing RL &amp; Average Reward</title>
      <itunes:episode>62</itunes:episode>
      <podcast:episode>62</podcast:episode>
      <itunes:title>Abhishek Naik on Continuing RL &amp; Average Reward</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">15cf60ef-085b-4614-aac7-4e9c70572908</guid>
      <link>https://share.transistor.fm/s/778f5a6c</link>
      <description>
        <![CDATA[<p><a href="https://abhisheknaik96.github.io/">Abhishek Naik</a> was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton.  Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications.  </p><p><strong>Featured References  </strong></p><p><a href="https://era.library.ualberta.ca/items/42307739-a774-4d6b-b1a3-de9fbc949575">Reinforcement Learning for Continuing Problems Using Average Reward</a> <br>Abhishek Naik Ph.D. dissertation 2024  </p><p><a href="https://arxiv.org/abs/2405.09999">Reward Centering</a> <br>Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton 2024   </p><p><a href="https://arxiv.org/abs/2006.16318">Learning and Planning in Average-Reward Markov Decision Processes</a> <br>Yi Wan, Abhishek Naik, Richard S. Sutton 2020  </p><p><a href="https://arxiv.org/abs/1910.02140">Discounted Reinforcement Learning Is Not an Optimization Problem</a> <br>Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton 2019  </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.nature.com/articles/s41593-024-01705-4">Explaining dopamine through prediction errors and beyond</a>, Gershman et al 2024 (proposes Differential-TD-like learning mechanism in the brain around Box 4)  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://abhisheknaik96.github.io/">Abhishek Naik</a> was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton.  Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications.  </p><p><strong>Featured References  </strong></p><p><a href="https://era.library.ualberta.ca/items/42307739-a774-4d6b-b1a3-de9fbc949575">Reinforcement Learning for Continuing Problems Using Average Reward</a> <br>Abhishek Naik Ph.D. dissertation 2024  </p><p><a href="https://arxiv.org/abs/2405.09999">Reward Centering</a> <br>Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton 2024   </p><p><a href="https://arxiv.org/abs/2006.16318">Learning and Planning in Average-Reward Markov Decision Processes</a> <br>Yi Wan, Abhishek Naik, Richard S. Sutton 2020  </p><p><a href="https://arxiv.org/abs/1910.02140">Discounted Reinforcement Learning Is Not an Optimization Problem</a> <br>Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton 2019  </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.nature.com/articles/s41593-024-01705-4">Explaining dopamine through prediction errors and beyond</a>, Gershman et al 2024 (proposes Differential-TD-like learning mechanism in the brain around Box 4)  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 09 Feb 2025 20:49:32 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/778f5a6c/24e90c0b.mp3" length="78435185" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CNFLn5sRRPIviBp3ciVwRQeml0_yQ4BiQmukfVtzcFI/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8wNjE1/ZGUwNjgwNWI0ZWE5/OGQyNDQ5MjE4NTU1/MDEzZS5qcGVn.jpg"/>
      <itunes:duration>4900</itunes:duration>
      <itunes:summary>
        <![CDATA[<p><a href="https://abhisheknaik96.github.io/">Abhishek Naik</a> was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton.  Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications.  </p><p><strong>Featured References  </strong></p><p><a href="https://era.library.ualberta.ca/items/42307739-a774-4d6b-b1a3-de9fbc949575">Reinforcement Learning for Continuing Problems Using Average Reward</a> <br>Abhishek Naik Ph.D. dissertation 2024  </p><p><a href="https://arxiv.org/abs/2405.09999">Reward Centering</a> <br>Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton 2024   </p><p><a href="https://arxiv.org/abs/2006.16318">Learning and Planning in Average-Reward Markov Decision Processes</a> <br>Yi Wan, Abhishek Naik, Richard S. Sutton 2020  </p><p><a href="https://arxiv.org/abs/1910.02140">Discounted Reinforcement Learning Is Not an Optimization Problem</a> <br>Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton 2019  </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.nature.com/articles/s41593-024-01705-4">Explaining dopamine through prediction errors and beyond</a>, Gershman et al 2024 (proposes Differential-TD-like learning mechanism in the brain around Box 4)  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/778f5a6c/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/778f5a6c/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/778f5a6c/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/778f5a6c/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/778f5a6c/transcription" type="text/html"/>
    </item>
    <item>
<title>NeurIPS 2024 RL meetup Hot takes: What sucks about RL?</title>
      <itunes:episode>61</itunes:episode>
      <podcast:episode>61</podcast:episode>
<itunes:title>NeurIPS 2024 RL meetup Hot takes: What sucks about RL?</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">5a92049a-a1bb-4b94-b164-f73625edfbc2</guid>
      <link>https://share.transistor.fm/s/80102b36</link>
      <description>
<![CDATA[<p>What do RL researchers complain about after hours at the bar?  In this "Hot takes" episode, we find out!  </p><p>Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.  </p><p>Special thanks to "David Beckham" for the inspiration :)  </p>]]>
      </description>
      <content:encoded>
<![CDATA[<p>What do RL researchers complain about after hours at the bar?  In this "Hot takes" episode, we find out!  </p><p>Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.  </p><p>Special thanks to "David Beckham" for the inspiration :)  </p>]]>
      </content:encoded>
      <pubDate>Mon, 23 Dec 2024 00:12:15 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/80102b36/400aa2f9.mp3" length="17109628" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/JwzcFHXUjUkPZV9x-Qn1b42EpDIzqbnGKtm7DUapCgo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8wNjI1/OWUzZmU5MWE5YzJl/M2Q3Njc2YzRmMDdk/OTUyYS5qcGc.jpg"/>
      <itunes:duration>1065</itunes:duration>
      <itunes:summary>
<![CDATA[<p>What do RL researchers complain about after hours at the bar?  In this "Hot takes" episode, we find out!  </p><p>Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.  </p><p>Special thanks to "David Beckham" for the inspiration :)  </p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/80102b36/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/80102b36/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/80102b36/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/80102b36/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/80102b36/transcription" type="text/html"/>
    </item>
    <item>
      <title>RLC 2024 - Posters and Hallways 5</title>
      <itunes:episode>60</itunes:episode>
      <podcast:episode>60</podcast:episode>
      <itunes:title>RLC 2024 - Posters and Hallways 5</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">9d9cc967-6e0e-41dc-bc50-7c4e0bb749e6</guid>
      <link>https://share.transistor.fm/s/c50a9c08</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.com/citations?user=LFNtNMgAAAAJ&amp;hl=en">David Radke</a> of the Chicago Blackhawks (NHL) on RL for professional sports  </li><li>0:56 <a href="https://scholar.google.com/citations?hl=en&amp;user=9gV4DRsAAAAJ">Abhishek Naik</a> from the National Research Council on Continuing RL and Average Reward  </li><li>2:42 <a href="https://scholar.google.co.id/citations?user=ScR5fBYAAAAJ">Daphne Cornelisse</a> from NYU on Autonomous Driving and Multi-Agent RL  </li><li>8:58 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=e-GEqxoAAAAJ">Shray Bansal</a> from Georgia Tech on Cognitive Bias for Human-AI Ad Hoc Teamwork  </li><li>10:21 <a href="https://scholar.google.co.id/citations?user=e-GEqxoAAAAJ">Claas Voelcker</a> from University of Toronto on "Can we hop in general?"  </li><li>11:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=iqkTonoAAAAJ">Brent Venable</a> from the Institute for Human &amp; Machine Cognition on Cooperative information dissemination  <p><br></p></li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.com/citations?user=LFNtNMgAAAAJ&amp;hl=en">David Radke</a> of the Chicago Blackhawks (NHL) on RL for professional sports  </li><li>0:56 <a href="https://scholar.google.com/citations?hl=en&amp;user=9gV4DRsAAAAJ">Abhishek Naik</a> from the National Research Council on Continuing RL and Average Reward  </li><li>2:42 <a href="https://scholar.google.co.id/citations?user=ScR5fBYAAAAJ">Daphne Cornelisse</a> from NYU on Autonomous Driving and Multi-Agent RL  </li><li>8:58 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=e-GEqxoAAAAJ">Shray Bansal</a> from Georgia Tech on Cognitive Bias for Human-AI Ad Hoc Teamwork  </li><li>10:21 <a href="https://scholar.google.co.id/citations?user=e-GEqxoAAAAJ">Claas Voelcker</a> from University of Toronto on "Can we hop in general?"  </li><li>11:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=iqkTonoAAAAJ">Brent Venable</a> from the Institute for Human &amp; Machine Cognition on Cooperative information dissemination  <p><br></p></li></ul>]]>
      </content:encoded>
      <pubDate>Fri, 20 Sep 2024 07:40:04 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/c50a9c08/5b83ed71.mp3" length="19225338" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/NY8XeR88y5FWmQQMGgOX1Bx-QnTTQz1cCepnJX81hC4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS83OGUz/OTU4NmQ4MGMzYzZk/MmMyZDA4NjNkNGUy/ZjdiNy5wbmc.jpg"/>
      <itunes:duration>797</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.com/citations?user=LFNtNMgAAAAJ&amp;hl=en">David Radke</a> of the Chicago Blackhawks (NHL) on RL for professional sports  </li><li>0:56 <a href="https://scholar.google.com/citations?hl=en&amp;user=9gV4DRsAAAAJ">Abhishek Naik</a> from the National Research Council on Continuing RL and Average Reward  </li><li>2:42 <a href="https://scholar.google.co.id/citations?user=ScR5fBYAAAAJ">Daphne Cornelisse</a> from NYU on Autonomous Driving and Multi-Agent RL  </li><li>8:58 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=e-GEqxoAAAAJ">Shray Bansal</a> from Georgia Tech on Cognitive Bias for Human-AI Ad Hoc Teamwork  </li><li>10:21 <a href="https://scholar.google.co.id/citations?user=e-GEqxoAAAAJ">Claas Voelcker</a> from University of Toronto on "Can we hop in general?"  </li><li>11:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=iqkTonoAAAAJ">Brent Venable</a> from the Institute for Human &amp; Machine Cognition on Cooperative information dissemination  <p><br></p></li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/c50a9c08/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c50a9c08/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c50a9c08/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c50a9c08/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/c50a9c08/transcription" type="text/html"/>
    </item>
    <item>
      <title>RLC 2024 - Posters and Hallways 4</title>
      <itunes:episode>59</itunes:episode>
      <podcast:episode>59</podcast:episode>
      <itunes:title>RLC 2024 - Posters and Hallways 4</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">00f93e18-1666-401c-a0f0-c4f000fb228d</guid>
      <link>https://share.transistor.fm/s/2d4d125a</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.ca/citations?user=lvBJlmwAAAAJ&amp;hl=en&amp;oi=ao">David Abel</a> from DeepMind on 3 Dogmas of RL  </li><li>0:55 <a href="https://scholar.google.com/citations?user=Q06Rh6oAAAAJ&amp;hl=en">Kevin Wang</a> from Brown on learning variable depth search for MCTS  </li><li>2:17 <a href="https://scholar.google.com/citations?user=HZ2INsEAAAAJ&amp;hl=en">Ashwin Kumar</a> from Washington University in St. Louis on fairness in resource allocation  </li><li>3:36 <a href="https://scholar.google.com/citations?user=Gjjj8IQAAAAJ&amp;hl=en">Prabhat Nagarajan</a> from U of Alberta on value overestimation  </li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.ca/citations?user=lvBJlmwAAAAJ&amp;hl=en&amp;oi=ao">David Abel</a> from DeepMind on 3 Dogmas of RL  </li><li>0:55 <a href="https://scholar.google.com/citations?user=Q06Rh6oAAAAJ&amp;hl=en">Kevin Wang</a> from Brown on learning variable depth search for MCTS  </li><li>2:17 <a href="https://scholar.google.com/citations?user=HZ2INsEAAAAJ&amp;hl=en">Ashwin Kumar</a> from Washington University in St. Louis on fairness in resource allocation  </li><li>3:36 <a href="https://scholar.google.com/citations?user=Gjjj8IQAAAAJ&amp;hl=en">Prabhat Nagarajan</a> from U of Alberta on value overestimation  </li></ul>]]>
      </content:encoded>
      <pubDate>Wed, 18 Sep 2024 17:54:24 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/2d4d125a/cb809679.mp3" length="7103563" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/prcq9bik2ZDajWI86URVBHdrEXgjF_qznsXpGveWgMY/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9hZDBm/NmU1NGQ0YWM4NTM1/MmRkNTRkMzQxZDE5/M2JhYi5wbmc.jpg"/>
      <itunes:duration>292</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at <a href="https://rl-conference.cc/">RLC 2024</a> in Amherst, MA.   </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.ca/citations?user=lvBJlmwAAAAJ&amp;hl=en&amp;oi=ao">David Abel</a> from DeepMind on 3 Dogmas of RL  </li><li>0:55 <a href="https://scholar.google.com/citations?user=Q06Rh6oAAAAJ&amp;hl=en">Kevin Wang</a> from Brown on learning variable depth search for MCTS  </li><li>2:17 <a href="https://scholar.google.com/citations?user=HZ2INsEAAAAJ&amp;hl=en">Ashwin Kumar</a> from Washington University in St. Louis on fairness in resource allocation  </li><li>3:36 <a href="https://scholar.google.com/citations?user=Gjjj8IQAAAAJ&amp;hl=en">Prabhat Nagarajan</a> from U of Alberta on value overestimation  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/2d4d125a/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2d4d125a/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2d4d125a/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2d4d125a/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/2d4d125a/transcription" type="text/html"/>
    </item>
    <item>
      <title>RLC 2024 - Posters and Hallways 3</title>
      <itunes:episode>58</itunes:episode>
      <podcast:episode>58</podcast:episode>
      <itunes:title>RLC 2024 - Posters and Hallways 3</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">c3d58621-ef7d-4bed-a793-89f87fe4c021</guid>
      <link>https://share.transistor.fm/s/f4e655c3</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=NCPKYKUAAAAJ">Kris De Asis</a> from Openmind on Time Discretization  </li><li>2:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=55WuTVQAAAAJ">Anna Hakhverdyan</a> from U of Alberta on Online Hyperparameters  </li><li>3:59 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=gzHbYVQAAAAJ">Dilip Arumugam</a> from Princeton on Information Theory and Exploration  </li><li>5:04 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=MeNbzgIAAAAJ">Micah Carroll</a> from UC Berkeley on Changing preferences and AI alignment  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=NCPKYKUAAAAJ">Kris De Asis</a> from Openmind on Time Discretization  </li><li>2:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=55WuTVQAAAAJ">Anna Hakhverdyan</a> from U of Alberta on Online Hyperparameters  </li><li>3:59 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=gzHbYVQAAAAJ">Dilip Arumugam</a> from Princeton on Information Theory and Exploration  </li><li>5:04 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=MeNbzgIAAAAJ">Micah Carroll</a> from UC Berkeley on Changing preferences and AI alignment  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Wed, 18 Sep 2024 07:16:11 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/f4e655c3/094ab9b0.mp3" length="9733292" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/JsNmzkrCcSe8GSeQ5Fgw_JGRVcF0-Rnt2rex2P3VIv4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9mNWE0/MDdhN2NiNjg3OGNm/OWQxZGNhM2I1M2Fh/ZGViNC5wbmc.jpg"/>
      <itunes:duration>403</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=NCPKYKUAAAAJ">Kris De Asis</a> from Openmind on Time Discretization  </li><li>2:23 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=55WuTVQAAAAJ">Anna Hakhverdyan</a> from U of Alberta on Online Hyperparameters  </li><li>3:59 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=gzHbYVQAAAAJ">Dilip Arumugam</a> from Princeton on Information Theory and Exploration  </li><li>5:04 <a href="https://scholar.google.co.id/citations?hl=en&amp;user=MeNbzgIAAAAJ">Micah Carroll</a> from UC Berkeley on Changing preferences and AI alignment  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/f4e655c3/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4e655c3/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4e655c3/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4e655c3/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4e655c3/transcription" type="text/html"/>
    </item>
    <item>
      <title>RLC 2024 - Posters and Hallways 2</title>
      <itunes:episode>57</itunes:episode>
      <podcast:episode>57</podcast:episode>
      <itunes:title>RLC 2024 - Posters and Hallways 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">f08e8c7c-1174-43f8-9b3d-340fe243d19d</guid>
      <link>https://share.transistor.fm/s/d257bea6</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 Hector Kohler from Centre Inria de l'Université de Lille with "<a href="https://openreview.net/forum?id=zafp5CwoTq">Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning</a>"  </li><li>2:29 Quentin Delfosse from TU Darmstadt on "<a href="https://openreview.net/forum?id=t4BjjTfxFa">Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents</a>"  </li><li>4:15 Sonja Johnson-Yu from Harvard on "<a href="https://openreview.net/forum?id=FX7YtfEYj8">Understanding biological active sensing behaviors by interpreting learned artificial agent policies</a>"  </li><li>6:42 Jannis Blüml from TU Darmstadt on "<a href="https://openreview.net/forum?id=HVd5e0OS9R">OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments</a>"  </li><li>8:20 Cameron Allen from UC Berkeley on "<a href="https://openreview.net/forum?id=lkIRFglmTp">Resolving Partial Observability in Decision Processes via the Lambda Discrepancy</a>"  </li><li>9:48 James Staley from Tufts on "<a href="https://rlj.cs.umass.edu/2024/papers/Paper236.html">Agent-Centric Human Demonstrations Train World Models</a>"  </li><li>14:54 Jonathan Li from Rensselaer Polytechnic Institute  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 Hector Kohler from Centre Inria de l'Université de Lille with "<a href="https://openreview.net/forum?id=zafp5CwoTq">Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning</a>"  </li><li>2:29 Quentin Delfosse from TU Darmstadt on "<a href="https://openreview.net/forum?id=t4BjjTfxFa">Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents</a>"  </li><li>4:15 Sonja Johnson-Yu from Harvard on "<a href="https://openreview.net/forum?id=FX7YtfEYj8">Understanding biological active sensing behaviors by interpreting learned artificial agent policies</a>"  </li><li>6:42 Jannis Blüml from TU Darmstadt on "<a href="https://openreview.net/forum?id=HVd5e0OS9R">OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments</a>"  </li><li>8:20 Cameron Allen from UC Berkeley on "<a href="https://openreview.net/forum?id=lkIRFglmTp">Resolving Partial Observability in Decision Processes via the Lambda Discrepancy</a>"  </li><li>9:48 James Staley from Tufts on "<a href="https://rlj.cs.umass.edu/2024/papers/Paper236.html">Agent-Centric Human Demonstrations Train World Models</a>"  </li><li>14:54 Jonathan Li from Rensselaer Polytechnic Institute  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 15 Sep 2024 19:34:51 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/d257bea6/ca653dab.mp3" length="22960843" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/yzYFaOeuqRWEWwovivYrrDRnhp1fzbfTz0YHP7auk-0/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS83Y2Qz/YmE2ZmQ3MGJlOGQ4/MDQ2NzI3NDU2NDQz/ODA3MS5wbmc.jpg"/>
      <itunes:duration>952</itunes:duration>
      <itunes:summary>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 Hector Kohler from Centre Inria de l'Université de Lille with "<a href="https://openreview.net/forum?id=zafp5CwoTq">Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning</a>"  </li><li>2:29 Quentin Delfosse from TU Darmstadt on "<a href="https://openreview.net/forum?id=t4BjjTfxFa">Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents</a>"  </li><li>4:15 Sonja Johnson-Yu from Harvard on "<a href="https://openreview.net/forum?id=FX7YtfEYj8">Understanding biological active sensing behaviors by interpreting learned artificial agent policies</a>"  </li><li>6:42 Jannis Blüml from TU Darmstadt on "<a href="https://openreview.net/forum?id=HVd5e0OS9R">OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments</a>"  </li><li>8:20 Cameron Allen from UC Berkeley on "<a href="https://openreview.net/forum?id=lkIRFglmTp">Resolving Partial Observability in Decision Processes via the Lambda Discrepancy</a>"  </li><li>9:48 James Staley from Tufts on "<a href="https://rlj.cs.umass.edu/2024/papers/Paper236.html">Agent-Centric Human Demonstrations Train World Models</a>"  </li><li>14:54 Jonathan Li from Rensselaer Polytechnic Institute  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/d257bea6/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d257bea6/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d257bea6/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/d257bea6/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/d257bea6/transcription" type="text/html"/>
    </item>
    <item>
      <title>RLC 2024 - Posters and Hallways 1</title>
      <itunes:episode>56</itunes:episode>
      <podcast:episode>56</podcast:episode>
      <itunes:title>RLC 2024 - Posters and Hallways 1</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">3fd09f34-54c2-49b5-9431-fba1170d7ca4</guid>
      <link>https://share.transistor.fm/s/2601fd96</link>
      <description>
<![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst, MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?user=zPJUEzsAAAAJ&amp;hl=tr">Ann Huang</a> from Harvard on <a href="https://openreview.net/forum?id=SbbpTtB6B4">Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers</a>  </li><li>1:37 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=RN7YYrEAAAAJ">Jannis Blüml</a> from TU Darmstadt on <a href="https://openreview.net/forum?id=Th5OOmiHVo">HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning</a>  </li><li>3:13 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=BIp5PGEAAAAJ">Benjamin Fuhrer</a> from NVIDIA on <a href="https://openreview.net/forum?id=MIJrYoharR">Gradient Boosting Reinforcement Learning</a>  </li><li>3:54 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=3hpaqHAAAAAJ">Paul Festor</a> from Imperial College London on <a href="https://openreview.net/forum?id=QnzNm9feL4&amp;referrer=%5Bthe%20profile%20of%20Paul%20Festor%5D(%2Fprofile%3Fid%3D~Paul_Festor1)">Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics</a>  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?user=zPJUEzsAAAAJ&amp;hl=tr">Ann Huang</a> from Harvard on <a href="https://openreview.net/forum?id=SbbpTtB6B4">Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers</a>  </li><li>1:37 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=RN7YYrEAAAAJ">Jannis Blüml</a> from TU Darmstadt on <a href="https://openreview.net/forum?id=Th5OOmiHVo">HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning</a>  </li><li>3:13 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=BIp5PGEAAAAJ">Benjamin Fuhrer</a> from NVIDIA on <a href="https://openreview.net/forum?id=MIJrYoharR">Gradient Boosting Reinforcement Learning</a>  </li><li>3:54 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=3hpaqHAAAAAJ">Paul Festor</a> from Imperial College London on <a href="https://openreview.net/forum?id=QnzNm9feL4&amp;referrer=%5Bthe%20profile%20of%20Paul%20Festor%5D(%2Fprofile%3Fid%3D~Paul_Festor1)">Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics</a>  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 10 Sep 2024 15:35:56 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/2601fd96/b4194b35.mp3" length="8355450" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/VT_EoafE8EmOHUK_qkKKK6wy4axxHXYWzH4PJ2uhgKk/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8yYmQ5/MDY0NjZiYTRiZDFi/YmIyM2VjY2ViMDVl/NjAwZS5wbmc.jpg"/>
      <itunes:duration>346</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Posters and Hallway episodes are short interviews and poster summaries.  Recorded at RLC 2024 in Amherst MA.  </p><p>Featuring:  </p><ul><li>0:01 <a href="https://scholar.google.co.id/citations?user=zPJUEzsAAAAJ&amp;hl=tr">Ann Huang</a> from Harvard on <a href="https://openreview.net/forum?id=SbbpTtB6B4">Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers</a>  </li><li>1:37 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=RN7YYrEAAAAJ">Jannis Blüml</a> from TU Darmstadt on <a href="https://openreview.net/forum?id=Th5OOmiHVo">HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning</a>  </li><li>3:13 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=BIp5PGEAAAAJ">Benjamin Fuhrer</a> from NVIDIA on <a href="https://openreview.net/forum?id=MIJrYoharR">Gradient Boosting Reinforcement Learning</a>  </li><li>3:54 <a href="https://scholar.google.co.id/citations?hl=tr&amp;user=3hpaqHAAAAAJ">Paul Festor</a> from Imperial College London on <a href="https://openreview.net/forum?id=QnzNm9feL4&amp;referrer=%5Bthe%20profile%20of%20Paul%20Festor%5D(%2Fprofile%3Fid%3D~Paul_Festor1)">Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics</a>  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/2601fd96/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2601fd96/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2601fd96/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2601fd96/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/2601fd96/transcription" type="text/html"/>
    </item>
    <item>
      <title>Finale Doshi-Velez on RL for Healthcare @ RLC 2024</title>
      <itunes:episode>55</itunes:episode>
      <podcast:episode>55</podcast:episode>
      <itunes:title>Finale Doshi-Velez on RL for Healthcare @ RLC 2024</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">3d4ce51a-d427-44d5-a007-25e960c0f0a3</guid>
      <link>https://share.transistor.fm/s/440e0810</link>
      <description>
        <![CDATA[<p><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> is a Professor at the Harvard Paulson School of Engineering and Applied Sciences.  </p><p>This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024.   </p><p><em>Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh.  Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in depth interview. </em></p><p><strong>References  </strong></p><ul><li><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> Homepage @ Harvard  </li><li><a href="https://scholar.google.ca/citations?user=hwQtFB0AAAAJ&amp;hl=en">Finale Doshi-Velez</a> on Google Scholar  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> is a Professor at the Harvard Paulson School of Engineering and Applied Sciences.  </p><p>This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024.   </p><p><em>Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh.  Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in depth interview. </em></p><p><strong>References  </strong></p><ul><li><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> Homepage @ Harvard  </li><li><a href="https://scholar.google.ca/citations?user=hwQtFB0AAAAJ&amp;hl=en">Finale Doshi-Velez</a> on Google Scholar  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 02 Sep 2024 01:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/440e0810/839d5a22.mp3" length="10948299" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/DmHhXI5jC02P34s7U6wymdgCEFijC3KBZdrxEBsXx1k/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8zYmQw/YzU4MDMxOWE1MjY3/MGYyYjMyZTE4Njcx/NzM3Zi5qcGc.jpg"/>
      <itunes:duration>455</itunes:duration>
      <itunes:summary>
        <![CDATA[<p><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> is a Professor at the Harvard Paulson School of Engineering and Applied Sciences.  </p><p>This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024.   </p><p><em>Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh.  Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in depth interview. </em></p><p><strong>References  </strong></p><ul><li><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> Homepage @ Harvard  </li><li><a href="https://scholar.google.ca/citations?user=hwQtFB0AAAAJ&amp;hl=en">Finale Doshi-Velez</a> on Google Scholar  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/440e0810/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/440e0810/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/440e0810/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/440e0810/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/440e0810/transcription" type="text/html"/>
    </item>
    <item>
      <title>David Silver 2 - Discussion after Keynote @ RLC 2024</title>
      <itunes:episode>54</itunes:episode>
      <podcast:episode>54</podcast:episode>
      <itunes:title>David Silver 2 - Discussion after Keynote @ RLC 2024</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6e34de39-a005-42ec-93b9-9567389ea8f7</guid>
      <link>https://share.transistor.fm/s/28980ea4</link>
      <description>
        <![CDATA[<p>Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture.   </p><p>Recorded at UMass Amherst during RLC 2024.</p><p><em>Due to the live recording environment, audio quality varies.  We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion.   <br></em><br><strong>References  </strong></p><ul><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> announcement on DeepMind's blog</li><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture.   </p><p>Recorded at UMass Amherst during RLC 2024.</p><p><em>Due to the live recording environment, audio quality varies.  We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion.   <br></em><br><strong>References  </strong></p><ul><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> announcement on DeepMind's blog</li><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </content:encoded>
      <pubDate>Wed, 28 Aug 2024 01:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/28980ea4/8fc4c0b3.mp3" length="23483271" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/m4JFezdkLAhOFHG9TMm2RjBkqVlrip2cqtO29V6m1XI/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8yODc4/NTliY2FiOWQyMzE3/YWNlMDMwNGI5MTVj/ODEyMi5qcGc.jpg"/>
      <itunes:duration>977</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture.   </p><p>Recorded at UMass Amherst during RLC 2024.</p><p><em>Due to the live recording environment, audio quality varies.  We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion.   <br></em><br><strong>References  </strong></p><ul><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> announcement on DeepMind's blog</li><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/28980ea4/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/28980ea4/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/28980ea4/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/28980ea4/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/28980ea4/transcription" type="text/html"/>
    </item>
    <item>
      <title>David Silver @ RLC 2024</title>
      <itunes:episode>53</itunes:episode>
      <podcast:episode>53</podcast:episode>
      <itunes:title>David Silver @ RLC 2024</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">a56bfa89-6e2b-45a9-bb39-90191d0ba22a</guid>
      <link>https://share.transistor.fm/s/2453a9a3</link>
      <description>
        <![CDATA[<p>David Silver is a principal research scientist at DeepMind and a professor at University College London.  </p><p>This interview was recorded at UMass Amherst during RLC 2024.   </p><p><strong>References  </strong></p><ul><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a>, Silver et al 2017 -- the AlphaZero algorithm was used in his recent work on AlphaProof  </li><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> on the DeepMind blog </li><li><a href="https://deepmind.google/technologies/alphafold/">AlphaFold</a> on the DeepMind blog </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>David Silver is a principal research scientist at DeepMind and a professor at University College London.  </p><p>This interview was recorded at UMass Amherst during RLC 2024.   </p><p><strong>References  </strong></p><ul><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a>, Silver et al 2017 -- the AlphaZero algorithm was used in his recent work on AlphaProof  </li><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> on the DeepMind blog </li><li><a href="https://deepmind.google/technologies/alphafold/">AlphaFold</a> on the DeepMind blog </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 26 Aug 2024 07:56:06 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/2453a9a3/5ff4ce24.mp3" length="16531047" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/yT7vrsyizUewOkqbKnj3r4qDs5Q3-tOx2OJ5Ra-Qt6M/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8wNmYz/YWRjZDRkZTE1ZTA5/NjdjODQ3OWQ2YzUx/MjI0Mi5qcGc.jpg"/>
      <itunes:duration>687</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>David Silver is a principal research scientist at DeepMind and a professor at University College London.  </p><p>This interview was recorded at UMass Amherst during RLC 2024.   </p><p><strong>References  </strong></p><ul><li><a href="https://arxiv.org/abs/2007.08794">Discovering Reinforcement Learning Algorithms</a>, Oh et al -- his keynote at RLC 2024 referred to a more recent, yet-to-be-published update to this work  </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a>, Silver et al 2017 -- the AlphaZero algorithm was used in his recent work on AlphaProof  </li><li><a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/">AlphaProof</a> on the DeepMind blog </li><li><a href="https://deepmind.google/technologies/alphafold/">AlphaFold</a> on the DeepMind blog </li><li><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a> 2024  </li><li><a href="https://scholar.google.ca/citations?user=-8DNE4UAAAAJ&amp;hl=en">David Silver</a> on Google Scholar  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/2453a9a3/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2453a9a3/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2453a9a3/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/2453a9a3/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/2453a9a3/transcription" type="text/html"/>
    </item>
    <item>
      <title>Vincent Moens on TorchRL</title>
      <itunes:episode>52</itunes:episode>
      <podcast:episode>52</podcast:episode>
      <itunes:title>Vincent Moens on TorchRL</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">864eb2b3-a75a-4914-b93f-f325253f2df5</guid>
      <link>https://share.transistor.fm/s/f305f206</link>
      <description>
        <![CDATA[<p>Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict for PyTorch.  </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2306.00577">TorchRL: A data-driven decision-making library for PyTorch</a> <br>Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://pytorch.org/rl/">TorchRL Documentation</a>  </li><li><a href="https://pytorch.org/tensordict/">TensorDict Documentation</a>  </li></ul><p><br></p>]]>
      </description>
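      <!-- A minimal sketch of TensorDict, the companion library discussed in this
           episode, assuming illustrative field names and shapes (not taken from the
           paper or the interview):

           import torch
           from tensordict import TensorDict

           # A TensorDict groups tensors that share leading batch dimensions.
           td = TensorDict({"obs": torch.zeros(32, 4)}, batch_size=[32])
           td["action"] = torch.randn(32, 2)  # new entries are checked against batch_size
           minibatch = td[:8]                 # indexing slices every entry at once
           print(minibatch.batch_size)        # torch.Size([8])

           Indexing and reshaping apply to the whole dictionary at once, which is why
           TorchRL passes TensorDicts between collectors, replay buffers, and losses. -->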
      <content:encoded>
        <![CDATA[<p>Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict for PyTorch.  </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2306.00577">TorchRL: A data-driven decision-making library for PyTorch</a> <br>Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://pytorch.org/rl/">TorchRL Documentation</a>  </li><li><a href="https://pytorch.org/tensordict/">TensorDict Documentation</a>  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 08 Apr 2024 12:45:12 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/f305f206/6e54a0bf.mp3" length="38661336" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/83OMBFTdbvFnoZ0fY1tWfsgZMeVESxfR-I5y7l0_MTI/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS80ZDQ5/MjllNzU4MzFlMGU2/NmRlMzE0ZjJhYzhm/NWU1ZS5qcGVn.jpg"/>
      <itunes:duration>2414</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict for PyTorch.  </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2306.00577">TorchRL: A data-driven decision-making library for PyTorch</a> <br>Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://pytorch.org/rl/">TorchRL Documentation</a>  </li><li><a href="https://pytorch.org/tensordict/">TensorDict Documentation</a>  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/f305f206/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f305f206/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f305f206/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f305f206/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/f305f206/transcription" type="text/html"/>
    </item>
    <item>
      <title>Arash Ahmadian on Rethinking RLHF</title>
      <itunes:episode>51</itunes:episode>
      <podcast:episode>51</podcast:episode>
      <itunes:title>Arash Ahmadian on Rethinking RLHF</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">439bda84-fc79-478d-af12-ff8a5c770b7a</guid>
      <link>https://share.transistor.fm/s/e54fabe1</link>
      <description>
        <![CDATA[<p>Arash Ahmadian is a researcher at Cohere and Cohere For AI focused on preference training of large language models. He’s also a researcher at the Vector Institute.</p><p><strong>Featured Reference</strong></p><p><a href="https://arxiv.org/abs/2402.14740">Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs</a></p><p>Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker</p><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2401.10020">Self-Rewarding Language Models</a>, Yuan et al 2024 </li><li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning: An Introduction</a>, Sutton and Barto 2018</li><li><a href="https://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf">Learning from Delayed Rewards</a>, Chris Watkins 1989</li><li><a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;type=pdf&amp;doi=e526a65b9ef5afb6639fd3a062f4045d24448232">Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning</a>, Williams 1992</li></ul>]]>
      </description>
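      <!-- A short sketch of the REINFORCE estimator (Williams 1992) that the featured
           paper revisits for RLHF; the tensor names and the constant baseline are
           illustrative assumptions, not the paper's exact setup:

           import torch

           def reinforce_loss(log_probs: torch.Tensor, returns: torch.Tensor,
                              baseline: float = 0.0) -> torch.Tensor:
               # Policy-gradient estimator: grad E[R] = E[(R - b) * grad log pi(a|s)].
               # Advantages are detached so gradients flow only through log_probs.
               advantages = (returns - baseline).detach()
               return -(advantages * log_probs).mean()

           Minimizing this loss over sampled log-probabilities and returns follows the
           REINFORCE gradient; RLHF variants swap returns for reward-model scores. -->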
      <content:encoded>
        <![CDATA[<p>Arash Ahmadian is a researcher at Cohere and Cohere For AI focused on preference training of large language models. He’s also a researcher at the Vector Institute.</p><p><strong>Featured Reference</strong></p><p><a href="https://arxiv.org/abs/2402.14740">Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs</a></p><p>Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker</p><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2401.10020">Self-Rewarding Language Models</a>, Yuan et al 2024 </li><li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning: An Introduction</a>, Sutton and Barto 2018</li><li><a href="https://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf">Learning from Delayed Rewards</a>, Chris Watkins 1989</li><li><a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;type=pdf&amp;doi=e526a65b9ef5afb6639fd3a062f4045d24448232">Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning</a>, Williams 1992</li></ul>]]>
      </content:encoded>
      <pubDate>Sun, 24 Mar 2024 23:46:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/e54fabe1/a1990c97.mp3" length="32190827" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/wWSRsWBqq29mfXzQXC65wTVCeuDrbdqQkah23bM8pbs/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE4MDY5NzMv/MTcxMTIyNjA1Mi1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>2010</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Arash Ahmadian is a researcher at Cohere and Cohere For AI focused on preference training of large language models. He’s also a researcher at the Vector Institute.</p><p><strong>Featured Reference</strong></p><p><a href="https://arxiv.org/abs/2402.14740">Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs</a></p><p>Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker</p><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2401.10020">Self-Rewarding Language Models</a>, Yuan et al 2024 </li><li><a href="http://incompleteideas.net/book/the-book-2nd.html">Reinforcement Learning: An Introduction</a>, Sutton and Barto 2018</li><li><a href="https://www.cs.rhul.ac.uk/~chrisw/new_thesis.pdf">Learning from Delayed Rewards</a>, Chris Watkins 1989</li><li><a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;type=pdf&amp;doi=e526a65b9ef5afb6639fd3a062f4045d24448232">Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning</a>, Williams 1992</li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/e54fabe1/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e54fabe1/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e54fabe1/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/e54fabe1/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/e54fabe1/transcription" type="text/html"/>
    </item>
    <item>
      <title>Glen Berseth on RL Conference</title>
      <itunes:episode>50</itunes:episode>
      <podcast:episode>50</podcast:episode>
      <itunes:title>Glen Berseth on RL Conference</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">90c4f549-2551-4ec9-bd1a-353e70234b36</guid>
      <link>https://share.transistor.fm/s/6be8918e</link>
      <description>
        <![CDATA[<p>Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).  </p><p><strong>Featured Links  </strong></p><p><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a>  </p><p><a href="https://arxiv.org/html/2401.11237v1">Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View</a> <br>Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach<br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).  </p><p><strong>Featured Links  </strong></p><p><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a>  </p><p><a href="https://arxiv.org/html/2401.11237v1">Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View</a> <br>Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach<br></p>]]>
      </content:encoded>
      <pubDate>Mon, 11 Mar 2024 09:00:33 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/6be8918e/6e963f3c.mp3" length="20830138" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/q62pg3WEQTJBl6YWnzhOKuapNrLHhYO3noyoJ6VwHpw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE3ODA1NTIv/MTcwOTkzODUzMS1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>1298</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).  </p><p><strong>Featured Links  </strong></p><p><a href="https://rl-conference.cc/">Reinforcement Learning Conference</a>  </p><p><a href="https://arxiv.org/html/2401.11237v1">Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View</a> <br>Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach<br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/6be8918e/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/6be8918e/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/6be8918e/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/6be8918e/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/6be8918e/transcription" type="text/html"/>
    </item>
    <item>
      <title>Ian Osband</title>
      <itunes:episode>49</itunes:episode>
      <podcast:episode>49</podcast:episode>
      <itunes:title>Ian Osband</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">ed7f7018-fe41-4fb1-a70b-0173d802c0f7</guid>
      <link>https://share.transistor.fm/s/f818c3cb</link>
      <description>
        <![CDATA[<p>Ian Osband is a research scientist at OpenAI (previously DeepMind and Stanford) working on decision-making under uncertainty.  </p><p>We spoke about: </p><ul><li>Information theory and RL </li><li>Exploration, epistemic uncertainty, and joint predictions </li><li>Epistemic Neural Networks and scaling to LLMs </li></ul><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2103.04047">Reinforcement Learning, Bit by Bit</a>  <br>Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen  </p><p><a href="https://arxiv.org/abs/2107.09224">From Predictions to Decisions: The Importance of Joint Predictive Distributions</a> </p><p>Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2107.08924">Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2302.09205">Approximate Thompson Sampling via Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/watch?v=ck4GixLs4ZQ">Thesis defence</a>, Ian Osband </li><li><a href="https://iosband.github.io/research.html">Homepage</a>, Ian Osband </li><li><a href="https://www.youtube.com/watch?v=j8an0dKcX4A">Epistemic Neural Networks</a> at Stanford RL Forum </li><li><a href="https://arxiv.org/abs/1908.03568">Behaviour Suite for Reinforcement Learning</a>, Osband et al 2019 </li><li><a href="https://arxiv.org/abs/2402.00396">Efficient Exploration for LLMs</a>, Dwaracherla et al 2024 </li></ul>]]>
      </description>
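      <!-- A toy exact-Thompson-sampling step for a Bernoulli bandit, to ground the
           episode's themes; the featured papers approximate this kind of posterior
           sampling at scale with epistemic neural networks. Names are illustrative:

           import numpy as np

           def thompson_step(successes: np.ndarray, failures: np.ndarray,
                             rng: np.random.Generator) -> int:
               # Sample one plausible mean reward per arm from its Beta posterior,
               # then act greedily with respect to the sampled means.
               sampled_means = rng.beta(successes + 1, failures + 1)
               return int(np.argmax(sampled_means))

           arm = thompson_step(np.zeros(5), np.zeros(5), np.random.default_rng(0)) -->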
      <content:encoded>
        <![CDATA[<p>Ian Osband is a research scientist at OpenAI (previously DeepMind and Stanford) working on decision-making under uncertainty.  </p><p>We spoke about: </p><ul><li>Information theory and RL </li><li>Exploration, epistemic uncertainty, and joint predictions </li><li>Epistemic Neural Networks and scaling to LLMs </li></ul><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2103.04047">Reinforcement Learning, Bit by Bit</a>  <br>Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen  </p><p><a href="https://arxiv.org/abs/2107.09224">From Predictions to Decisions: The Importance of Joint Predictive Distributions</a> </p><p>Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2107.08924">Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2302.09205">Approximate Thompson Sampling via Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/watch?v=ck4GixLs4ZQ">Thesis defence</a>, Ian Osband </li><li><a href="https://iosband.github.io/research.html">Homepage</a>, Ian Osband </li><li><a href="https://www.youtube.com/watch?v=j8an0dKcX4A">Epistemic Neural Networks</a> at Stanford RL Forum </li><li><a href="https://arxiv.org/abs/1908.03568">Behaviour Suite for Reinforcement Learning</a>, Osband et al 2019 </li><li><a href="https://arxiv.org/abs/2402.00396">Efficient Exploration for LLMs</a>, Dwaracherla et al 2024 </li></ul>]]>
      </content:encoded>
      <pubDate>Thu, 07 Mar 2024 11:24:48 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/f818c3cb/b64a588e.mp3" length="65745777" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/dJIAcNBLu8rZLLSma1b9AzY9Gvo1-90eWinnt_dT58w/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE3NDk0MTgv/MTcwODY0MTUyNi1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>4106</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Ian Osband is a research scientist at OpenAI (previously DeepMind and Stanford) working on decision-making under uncertainty.  </p><p>We spoke about: </p><ul><li>Information theory and RL </li><li>Exploration, epistemic uncertainty, and joint predictions </li><li>Epistemic Neural Networks and scaling to LLMs </li></ul><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2103.04047">Reinforcement Learning, Bit by Bit</a>  <br>Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen  </p><p><a href="https://arxiv.org/abs/2107.09224">From Predictions to Decisions: The Importance of Joint Predictive Distributions</a> </p><p>Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2107.08924">Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy  </p><p><a href="https://arxiv.org/abs/2302.09205">Approximate Thompson Sampling via Epistemic Neural Networks</a> </p><p>Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/watch?v=ck4GixLs4ZQ">Thesis defence</a>, Ian Osband </li><li><a href="https://iosband.github.io/research.html">Homepage</a>, Ian Osband </li><li><a href="https://www.youtube.com/watch?v=j8an0dKcX4A">Epistemic Neural Networks</a> at Stanford RL Forum </li><li><a href="https://arxiv.org/abs/1908.03568">Behaviour Suite for Reinforcement Learning</a>, Osband et al 2019 </li><li><a href="https://arxiv.org/abs/2402.00396">Efficient Exploration for LLMs</a>, Dwaracherla et al 2024 </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/f818c3cb/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f818c3cb/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f818c3cb/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f818c3cb/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/f818c3cb/transcription" type="text/html"/>
    </item>
    <item>
      <title>Sharath Chandra Raparthy</title>
      <itunes:episode>48</itunes:episode>
      <podcast:episode>48</podcast:episode>
      <itunes:title>Sharath Chandra Raparthy</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">a573770f-3180-4a02-a34d-45a1ecf838f0</guid>
      <link>https://share.transistor.fm/s/f4b1c7d2</link>
      <description>
        <![CDATA[<p>Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!  </p><p>Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.  </p><p><br><strong>Featured Reference  <br></strong><br><a href="https://arxiv.org/abs/2312.03801">Generalization to New Sequential Decision Making Tasks with In-Context Learning   <br></a>Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu  <br> <br><strong>Additional References  </strong></p><ul><li><a href="https://sharathraparthy.github.io/">Sharath Chandra Raparthy</a> Homepage  </li><li><a href="https://arxiv.org/abs/2301.07608">Human-Timescale Adaptation in an Open-Ended Task Space</a>, Adaptive Agent Team 2023</li><li><a href="https://arxiv.org/abs/2205.05055">Data Distributional Properties Drive Emergent In-Context Learning in Transformers</a>, Chan et al 2022  </li><li><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a>, Chen et al 2021</li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!  </p><p>Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.  </p><p><br><strong>Featured Reference  <br></strong><br><a href="https://arxiv.org/abs/2312.03801">Generalization to New Sequential Decision Making Tasks with In-Context Learning   <br></a>Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu  <br> <br><strong>Additional References  </strong></p><ul><li><a href="https://sharathraparthy.github.io/">Sharath Chandra Raparthy</a> Homepage  </li><li><a href="https://arxiv.org/abs/2301.07608">Human-Timescale Adaptation in an Open-Ended Task Space</a>, Adaptive Agent Team 2023</li><li><a href="https://arxiv.org/abs/2205.05055">Data Distributional Properties Drive Emergent In-Context Learning in Transformers</a>, Chan et al 2022  </li><li><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a>, Chen et al 2021</li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 11 Feb 2024 17:43:56 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/f4b1c7d2/002fe73e.mp3" length="39071725" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/5ZSTnzTODpVw-nIW60UJKkZGNqVjv6dv41WXx1tnaNw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE2OTcyMDkv/MTcwNTc5NTUwNi1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>2441</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!  </p><p>Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.  </p><p><br><strong>Featured Reference  <br></strong><br><a href="https://arxiv.org/abs/2312.03801">Generalization to New Sequential Decision Making Tasks with In-Context Learning   <br></a>Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu  <br> <br><strong>Additional References  </strong></p><ul><li><a href="https://sharathraparthy.github.io/">Sharath Chandra Raparthy</a> Homepage  </li><li><a href="https://arxiv.org/abs/2301.07608">Human-Timescale Adaptation in an Open-Ended Task Space</a>, Adaptive Agent Team 2023</li><li><a href="https://arxiv.org/abs/2205.05055">Data Distributional Properties Drive Emergent In-Context Learning in Transformers</a>, Chan et al 2022  </li><li><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a>, Chen et al 2021</li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/f4b1c7d2/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4b1c7d2/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4b1c7d2/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4b1c7d2/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/f4b1c7d2/transcription" type="text/html"/>
    </item>
    <item>
      <title>Pierluca D'Oro and Martin Klissarov</title>
      <itunes:episode>47</itunes:episode>
      <podcast:episode>47</podcast:episode>
      <itunes:title>Pierluca D'Oro and Martin Klissarov</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">067040b6-a420-47b8-8a6f-b21c8e1e4c1b</guid>
      <link>https://share.transistor.fm/s/bc10889c</link>
      <description>
        <![CDATA[<p>Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!  </p><p>Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.</p><p><br>Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2310.00166"><strong>Motif: Intrinsic Motivation from Artificial Intelligence Feedback  <br></strong></a>Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff  <br><br><a href="https://arxiv.org/abs/2309.14597"><strong>Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control  <br></strong></a>Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare  </p><p><a href="https://www.scienceofaiagents.com/p/to-keep-doing-rl-research-stop-calling"><strong>To keep doing RL research, stop calling yourself an RL researcher</strong></a> <br>Pierluca D'Oro </p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!  </p><p>Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.</p><p><br>Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2310.00166"><strong>Motif: Intrinsic Motivation from Artificial Intelligence Feedback  <br></strong></a>Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff  <br><br><a href="https://arxiv.org/abs/2309.14597"><strong>Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control  <br></strong></a>Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare  </p><p><a href="https://www.scienceofaiagents.com/p/to-keep-doing-rl-research-stop-calling"><strong>To keep doing RL research, stop calling yourself an RL researcher</strong></a> <br>Pierluca D'Oro </p>]]>
      </content:encoded>
      <pubDate>Mon, 13 Nov 2023 09:32:13 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/bc10889c/c788513e.mp3" length="82713777" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/fFIztxVGViLUAlIbFIj-IOT1rYcdRArSeIOjxXjw7hM/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE1OTY3NDAv/MTY5OTgzNTAzNy1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>3444</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!  </p><p>Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.</p><p><br>Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2310.00166"><strong>Motif: Intrinsic Motivation from Artificial Intelligence Feedback  <br></strong></a>Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff  <br><br><a href="https://arxiv.org/abs/2309.14597"><strong>Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control  <br></strong></a>Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare  </p><p><a href="https://www.scienceofaiagents.com/p/to-keep-doing-rl-research-stop-calling"><strong>To keep doing RL research, stop calling yourself an RL researcher</strong></a> <br>Pierluca D'Oro </p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/bc10889c/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bc10889c/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bc10889c/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/bc10889c/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/bc10889c/transcription" type="text/html"/>
    </item>
    <item>
      <title>Martin Riedmiller</title>
      <itunes:episode>46</itunes:episode>
      <podcast:episode>46</podcast:episode>
      <itunes:title>Martin Riedmiller</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">dc2566ce-6b2e-42ef-a01c-b769ad7c7268</guid>
      <link>https://share.transistor.fm/s/c2d12a9b</link>
      <description>
        <![CDATA[<p>Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!  </p><p><br><a href="https://sites.google.com/view/riedmiller/home">Martin Riedmiller</a> is a research scientist and team lead at DeepMind.   </p><p><br><strong>Featured References   </strong></p><p><br><a href="https://www.nature.com/articles/s41586-021-04301-9">Magnetic control of tokamak plasmas through deep reinforcement learning  <br></a>Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis &amp; Martin Riedmiller </p><p><br><a href="https://www.nature.com/articles/nature14236">Human-level control through deep reinforcement learning <br></a>Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis  <br> </p><p><a href="https://link.springer.com/content/pdf/10.1007/11564096_32.pdf">Neural fitted Q iteration–first experiences with a data efficient neural reinforcement learning method</a> <br>Martin Riedmiller  </p>]]>
      </description>
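      <!-- A sketch of the regression targets at the heart of Neural Fitted Q-Iteration,
           one of the episode topics; the batch layout and network interface here are
           assumptions for illustration, not the paper's code:

           import torch

           def nfq_targets(q_net: torch.nn.Module, r: torch.Tensor, s_next: torch.Tensor,
                           done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
               # NFQ alternates between computing r + gamma * max_a' Q(s', a') over a
               # fixed batch of stored transitions and fitting q_net to those targets
               # by supervised regression (the "collect and infer" flavour of RL).
               with torch.no_grad():
                   next_value = q_net(s_next).max(dim=1).values
               return r + gamma * (1.0 - done) * next_value -->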
      <content:encoded>
        <![CDATA[<p>Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!  </p><p><br><a href="https://sites.google.com/view/riedmiller/home">Martin Riedmiller</a> is a research scientist and team lead at DeepMind.   </p><p><br><strong>Featured References   </strong></p><p><br><a href="https://www.nature.com/articles/s41586-021-04301-9">Magnetic control of tokamak plasmas through deep reinforcement learning  <br></a>Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis &amp; Martin Riedmiller </p><p><br><a href="https://www.nature.com/articles/nature14236">Human-level control through deep reinforcement learning <br></a>Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis  <br> </p><p><a href="https://link.springer.com/content/pdf/10.1007/11564096_32.pdf">Neural fitted Q iteration–first experiences with a data efficient neural reinforcement learning method</a> <br>Martin Riedmiller  </p>]]>
      </content:encoded>
      <pubDate>Tue, 22 Aug 2023 09:18:44 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/c2d12a9b/60e5851a.mp3" length="106510693" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/Lik7s95kxE-p_zLchXWCfGOpVscNkzL-8fTS3qHIcSc/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE0Njk3NTgv/MTY5MjcyMDk5OC1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>4436</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!  </p><p><br><a href="https://sites.google.com/view/riedmiller/home">Martin Riedmiller</a> is a research scientist and team lead at DeepMind.   </p><p><br><strong>Featured References   </strong></p><p><br><a href="https://www.nature.com/articles/s41586-021-04301-9">Magnetic control of tokamak plasmas through deep reinforcement learning  <br></a>Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis &amp; Martin Riedmiller </p><p><br><a href="https://www.nature.com/articles/nature14236">Human-level control through deep reinforcement learning <br></a>Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis  <br> </p><p><a href="https://link.springer.com/content/pdf/10.1007/11564096_32.pdf">Neural fitted Q iteration–first experiences with a data efficient neural reinforcement learning method</a> <br>Martin Riedmiller  </p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/c2d12a9b/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c2d12a9b/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c2d12a9b/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c2d12a9b/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/c2d12a9b/transcription" type="text/html"/>
    </item>
    <item>
      <title>Max Schwarzer</title>
      <itunes:episode>45</itunes:episode>
      <podcast:episode>45</podcast:episode>
      <itunes:title>Max Schwarzer</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">3f19a6e0-bdf4-4d21-add5-45d5eef48592</guid>
      <link>https://share.transistor.fm/s/7c6ce232</link>
      <description>
        <![CDATA[<p>Max Schwarzer is a PhD student at Mila, advised by Aaron Courville and Marc Bellemare, and interested in RL scaling, representation learning for RL, and RL for science.  Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.   </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/2305.19452">Bigger, Better, Faster: Human-level Atari with human-level efficiency  <br></a>Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro  <br><strong><br></strong><a href="https://openreview.net/forum?id=OpC-9aBBVJe">Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier <br></a>Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville  <br> <br><a href="https://arxiv.org/abs/2205.07802">The Primacy Bias in Deep Reinforcement Learning <br></a>Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville  <br><strong><br></strong><br><strong>Additional References   </strong></p><ul><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning</a>, Hessel et al 2017  </li><li><a href="https://arxiv.org/abs/1906.05243">When to use parametric models in reinforcement learning?</a> van Hasselt et al 2019 </li><li><a href="https://arxiv.org/abs/2007.05929">Data-Efficient Reinforcement Learning with Self-Predictive Representations</a>, Schwarzer et al 2020  </li><li><a href="https://arxiv.org/abs/2106.04799">Pretraining Representations for Data-Efficient Reinforcement Learning</a>, Schwarzer et al 2021  </li></ul><p><strong><br></strong><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Max Schwarzer is a PhD student at Mila, advised by Aaron Courville and Marc Bellemare, and interested in RL scaling, representation learning for RL, and RL for science.  Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.   </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/2305.19452">Bigger, Better, Faster: Human-level Atari with human-level efficiency  <br></a>Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro  <br><strong><br></strong><a href="https://openreview.net/forum?id=OpC-9aBBVJe">Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier <br></a>Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville  <br> <br><a href="https://arxiv.org/abs/2205.07802">The Primacy Bias in Deep Reinforcement Learning <br></a>Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville  <br><strong><br></strong><br><strong>Additional References   </strong></p><ul><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning</a>, Hessel et al 2017  </li><li><a href="https://arxiv.org/abs/1906.05243">When to use parametric models in reinforcement learning?</a> van Hasselt et al 2019 </li><li><a href="https://arxiv.org/abs/2007.05929">Data-Efficient Reinforcement Learning with Self-Predictive Representations</a>, Schwarzer et al 2020  </li><li><a href="https://arxiv.org/abs/2106.04799">Pretraining Representations for Data-Efficient Reinforcement Learning</a>, Schwarzer et al 2021  </li></ul><p><strong><br></strong><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 08 Aug 2023 13:22:18 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/7c6ce232/628dcf91.mp3" length="101251660" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/OZRm_AWzXgPOgxsCsBmIGLeJ3jmIWEO_TcQoBUQi5IE/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE0MzEwMTUv/MTY5MTUxNjU4Mi1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>4218</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Max Schwarzer is a PhD student at Mila, advised by Aaron Courville and Marc Bellemare, and interested in RL scaling, representation learning for RL, and RL for science.  Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.   </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/2305.19452">Bigger, Better, Faster: Human-level Atari with human-level efficiency  <br></a>Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro  <br><strong><br></strong><a href="https://openreview.net/forum?id=OpC-9aBBVJe">Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier <br></a>Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville  <br> <br><a href="https://arxiv.org/abs/2205.07802">The Primacy Bias in Deep Reinforcement Learning <br></a>Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville  <br><strong><br></strong><br><strong>Additional References   </strong></p><ul><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning</a>, Hessel et al 2017  </li><li><a href="https://arxiv.org/abs/1906.05243">When to use parametric models in reinforcement learning?</a> van Hasselt et al 2019 </li><li><a href="https://arxiv.org/abs/2007.05929">Data-Efficient Reinforcement Learning with Self-Predictive Representations</a>, Schwarzer et al 2020  </li><li><a href="https://arxiv.org/abs/2106.04799">Pretraining Representations for Data-Efficient Reinforcement Learning</a>, Schwarzer et al 2021  </li></ul><p><strong><br></strong><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/7c6ce232/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/7c6ce232/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/7c6ce232/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/7c6ce232/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/7c6ce232/transcription" type="text/html"/>
    </item>
    <item>
      <title>Julian Togelius</title>
      <itunes:episode>44</itunes:episode>
      <podcast:episode>44</podcast:episode>
      <itunes:title>Julian Togelius</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d3d0285e-a825-4c16-85b1-555e6b80f356</guid>
      <link>https://share.transistor.fm/s/c3536a57</link>
      <description>
        <![CDATA[<p><a href="http://julian.togelius.com/">Julian Togelius</a> is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at <a href="https://modl.ai/">modl.ai</a></p><p><br>  </p><p><strong>Featured References  </strong><br><a href="https://arxiv.org/abs/2304.06035">Choose Your Weapon: Survival Strategies for Depressed AI Academics</a></p><p>Julian Togelius, Georgios N. Yannakakis</p><p><br></p><p><a href="https://arxiv.org/abs/2206.13623%20">Learning Controllable 3D Level Generators</a></p><p>Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/2001.09212">PCGRL: Procedural Content Generation via Reinforcement Learning</a></p><p>Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/1806.10729%20">Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation</a></p><p>Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi</p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://julian.togelius.com/">Julian Togelius</a> is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at <a href="https://modl.ai/">modl.ai</a></p><p><br>  </p><p><strong>Featured References  </strong><br><a href="https://arxiv.org/abs/2304.06035">Choose Your Weapon: Survival Strategies for Depressed AI Academics</a></p><p>Julian Togelius, Georgios N. Yannakakis</p><p><br></p><p><a href="https://arxiv.org/abs/2206.13623%20">Learning Controllable 3D Level Generators</a></p><p>Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/2001.09212">PCGRL: Procedural Content Generation via Reinforcement Learning</a></p><p>Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/1806.10729%20">Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation</a></p><p>Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi</p><p><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 25 Jul 2023 01:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/c3536a57/f8f803b5.mp3" length="57759553" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/eXBGKhY2kOW0tx6xWiKCmt2LiefkJ-VDMdFgUTA8dzQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEzOTc0ODIv/MTY4NzczOTkwMC1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>2404</itunes:duration>
      <itunes:summary>
        <![CDATA[<p><a href="http://julian.togelius.com/">Julian Togelius</a> is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at <a href="https://modl.ai/">modl.ai</a></p><p><br>  </p><p><strong>Featured References  </strong><br><a href="https://arxiv.org/abs/2304.06035">Choose Your Weapon: Survival Strategies for Depressed AI Academics</a></p><p>Julian Togelius, Georgios N. Yannakakis</p><p><br></p><p><a href="https://arxiv.org/abs/2206.13623%20">Learning Controllable 3D Level Generators</a></p><p>Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/2001.09212">PCGRL: Procedural Content Generation via Reinforcement Learning</a></p><p>Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius</p><p><br></p><p><a href="https://arxiv.org/abs/1806.10729%20">Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation</a></p><p>Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi</p><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/c3536a57/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c3536a57/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c3536a57/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/c3536a57/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/c3536a57/transcription" type="text/html"/>
    </item>
    <item>
      <title>Jakob Foerster</title>
      <itunes:episode>43</itunes:episode>
      <podcast:episode>43</podcast:episode>
      <itunes:title>Jakob Foerster</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">62f855ed-e9c3-4454-96ac-c5ea15d7a2b7</guid>
      <link>https://share.transistor.fm/s/11ed7595</link>
      <description>
        <![CDATA[<p>Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.  </p><p><a href="https://www.jakobfoerster.com/">Jakob Foerster</a> is an Associate Professor at the University of Oxford.  </p><p><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/1709.04326">Learning with Opponent-Learning Awareness <br></a>Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch  </p><p><a href="https://arxiv.org/abs/2205.01447">Model-Free Opponent Shaping</a> <br>Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster  </p><p><a href="https://arxiv.org/abs/2103.04000">Off-Belief Learning <br></a>Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/1605.06676">Learning to Communicate with Deep Multi-Agent Reinforcement Learning <br></a>Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson  </p><p><a href="https://arxiv.org/abs/2211.11030">Adversarial Cheap Talk <br></a>Chris Lu, Timon Willi, Alistair Letcher, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/2303.10733">Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning <br></a>Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/playlist?list=PLruBu5BI5n4ZMvdYXXZLl6w2_D6S3EU-j">Lectures by Jakob on YouTube</a> </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.  </p><p><a href="https://www.jakobfoerster.com/">Jakob Foerster</a> is an Associate Professor at the University of Oxford.  </p><p><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/1709.04326">Learning with Opponent-Learning Awareness <br></a>Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch  </p><p><a href="https://arxiv.org/abs/2205.01447">Model-Free Opponent Shaping</a> <br>Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster  </p><p><a href="https://arxiv.org/abs/2103.04000">Off-Belief Learning <br></a>Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/1605.06676">Learning to Communicate with Deep Multi-Agent Reinforcement Learning <br></a>Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson  </p><p><a href="https://arxiv.org/abs/2211.11030">Adversarial Cheap Talk <br></a>Chris Lu, Timon Willi, Alistair Letcher, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/2303.10733">Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning <br></a>Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/playlist?list=PLruBu5BI5n4ZMvdYXXZLl6w2_D6S3EU-j">Lectures by Jakob on YouTube</a> </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 07 May 2023 23:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/11ed7595/0a33428b.mp3" length="45977185" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/8OvAx5veAQCvzSPHHRWXWoStIcL1jQPeSt2CHKVVojY/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEyOTUzMTIv/MTY4MzQ5MTk4Ny1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>3825</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.  </p><p><a href="https://www.jakobfoerster.com/">Jakob Foerster</a> is an Associate Professor at the University of Oxford.  </p><p><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/1709.04326">Learning with Opponent-Learning Awareness <br></a>Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch  </p><p><a href="https://arxiv.org/abs/2205.01447">Model-Free Opponent Shaping</a> <br>Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster  </p><p><a href="https://arxiv.org/abs/2103.04000">Off-Belief Learning <br></a>Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/1605.06676">Learning to Communicate with Deep Multi-Agent Reinforcement Learning <br></a>Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson  </p><p><a href="https://arxiv.org/abs/2211.11030">Adversarial Cheap Talk <br></a>Chris Lu, Timon Willi, Alistair Letcher, Jakob Foerster  <br><strong><br></strong><a href="https://arxiv.org/abs/2303.10733">Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning <br></a>Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://www.youtube.com/playlist?list=PLruBu5BI5n4ZMvdYXXZLl6w2_D6S3EU-j">Lectures by Jakob on YouTube</a> </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/11ed7595/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/11ed7595/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/11ed7595/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/11ed7595/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/11ed7595/transcription" type="text/html"/>
    </item>
    <item>
      <title>Danijar Hafner 2</title>
      <itunes:episode>42</itunes:episode>
      <podcast:episode>42</podcast:episode>
      <itunes:title>Danijar Hafner 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">816cb689-de85-40c3-a898-0c0365430da9</guid>
      <link>https://share.transistor.fm/s/aa2b14d9</link>
      <description>
        <![CDATA[<p>Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, real-time RL on robots with DayDreamer, and his framework for unsupervised agent design! </p><p><a href="https://danijar.com/">Danijar Hafner</a> is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind.  He was previously our guest on episode 11.  </p><p><br><strong>Featured References   <br></strong><br><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3 </p><p>Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap  </p><p><br><a href="https://arxiv.org/abs/2206.14176">DayDreamer: World Models for Physical Robot Learning</a> [ <a href="https://danijar.com/project/daydreamer/">blog</a> ]  <br>Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel  </p><p><a href="https://arxiv.org/abs/2206.04114">Deep Hierarchical Planning from Pixels</a> [ <a href="https://danijar.com/project/director/">blog</a> ]  <br>Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel   </p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]  <br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2 ; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba  </li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer ; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi  </li><li><a href="https://arxiv.org/abs/2005.05960">Planning to Explore via Self-Supervised World Models</a> ; Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, real-time RL on robots with DayDreamer, and his framework for unsupervised agent design! </p><p><a href="https://danijar.com/">Danijar Hafner</a> is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind.  He was previously our guest on episode 11.  </p><p><br><strong>Featured References   <br></strong><br><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3 </p><p>Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap  </p><p><br><a href="https://arxiv.org/abs/2206.14176">DayDreamer: World Models for Physical Robot Learning</a> [ <a href="https://danijar.com/project/daydreamer/">blog</a> ]  <br>Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel  </p><p><a href="https://arxiv.org/abs/2206.04114">Deep Hierarchical Planning from Pixels</a> [ <a href="https://danijar.com/project/director/">blog</a> ]  <br>Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel   </p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]  <br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2 ; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba  </li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer ; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi  </li><li><a href="https://arxiv.org/abs/2005.05960">Planning to Explore via Self-Supervised World Models</a> ; Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Wed, 12 Apr 2023 01:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/aa2b14d9/fb56410f.mp3" length="43491890" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/W-LI9ACRw3oGu5wfAS_Vdnm4Y85BXLC-8_OPklI1UIQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xMjFi/ZjYxY2YwODg4ODdm/OTIyZjFiOWJhYjEx/Y2UyMy53ZWJw.jpg"/>
      <itunes:duration>2715</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, real-time RL on robots with DayDreamer, and his framework for unsupervised agent design! </p><p><a href="https://danijar.com/">Danijar Hafner</a> is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind.  He was previously our guest on episode 11.  </p><p><br><strong>Featured References   <br></strong><br><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> [ <a href="https://danijar.com/project/dreamerv3/">blog</a> ] DreamerV3 </p><p>Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap  </p><p><br><a href="https://arxiv.org/abs/2206.14176">DayDreamer: World Models for Physical Robot Learning</a> [ <a href="https://danijar.com/project/daydreamer/">blog</a> ]  <br>Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel  </p><p><a href="https://arxiv.org/abs/2206.04114">Deep Hierarchical Planning from Pixels</a> [ <a href="https://danijar.com/project/director/">blog</a> ]  <br>Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel   </p><p><a href="https://arxiv.org/abs/2009.01791">Action and Perception as Divergence Minimization</a> [ <a href="https://danijar.com/project/apd/">blog</a> ]  <br>Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/2010.02193">Mastering Atari with Discrete World Models</a> [ <a href="https://danijar.com/project/dreamerv2/">blog</a> ] DreamerV2 ; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba  </li><li><a href="https://arxiv.org/abs/1912.01603">Dream to Control: Learning Behaviors by Latent Imagination</a> [ <a href="https://danijar.com/project/dreamer/">blog</a> ] Dreamer ; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi  </li><li><a href="https://arxiv.org/abs/2005.05960">Planning to Explore via Self-Supervised World Models</a> ; Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/aa2b14d9/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>Jeff Clune</title>
      <itunes:episode>41</itunes:episode>
      <podcast:episode>41</podcast:episode>
      <itunes:title>Jeff Clune</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6ab7ea99-e3cd-4736-9613-2a2cae98a233</guid>
      <link>https://share.transistor.fm/s/7690a12f</link>
      <description>
        <![CDATA[<p>AI-Generating Algorithms, learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and open-endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!  </p><p>Jeff Clune is an Associate Professor of Computer Science at the University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at the Vector Institute, and a Senior Research Advisor at DeepMind.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ] <br>Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune  </p><p><a href="https://www.nature.com/articles/nature14422">Robots that can adapt like animals</a> <br>Antoine Cully, Jeff Clune, Danesh Tarapore, Jean-Baptiste Mouret  <strong></strong></p><p><a href="https://arxiv.org/abs/1504.04909">Illuminating search spaces by mapping elites</a> <br>Jean-Baptiste Mouret, Jeff Clune  <br><strong><br></strong><a href="https://arxiv.org/abs/2003.08536">Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions <br></a>Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a> <br>Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://www.nature.com/articles/s41586-020-03157-9">First return, then explore</a> <br>Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune <strong><br></strong><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>AI-Generating Algorithms, learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and open-endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!  </p><p>Jeff Clune is an Associate Professor of Computer Science at the University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at the Vector Institute, and a Senior Research Advisor at DeepMind.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ] <br>Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune  </p><p><a href="https://www.nature.com/articles/nature14422">Robots that can adapt like animals</a> <br>Antoine Cully, Jeff Clune, Danesh Tarapore, Jean-Baptiste Mouret  <strong></strong></p><p><a href="https://arxiv.org/abs/1504.04909">Illuminating search spaces by mapping elites</a> <br>Jean-Baptiste Mouret, Jeff Clune  <br><strong><br></strong><a href="https://arxiv.org/abs/2003.08536">Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions <br></a>Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a> <br>Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://www.nature.com/articles/s41586-020-03157-9">First return, then explore</a> <br>Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune <strong><br></strong><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 27 Mar 2023 07:32:53 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/7690a12f/c23e8b03.mp3" length="51310017" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/nAYouzo0oE7eKoTH2NbvX8eG7UWCRtkWKmkoFZIcDiE/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEyNTIxMDkv/MTY3OTEwMjg4Mi1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>4271</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>AI-Generating Algorithms, learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and open-endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!  </p><p>Jeff Clune is an Associate Professor of Computer Science at the University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at the Vector Institute, and a Senior Research Advisor at DeepMind.  </p><p><br><strong>Featured References  </strong></p><p><a href="https://arxiv.org/abs/2206.11795">Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos</a> [ <a href="https://openai.com/research/vpt">Blog Post</a> ] <br>Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune  </p><p><a href="https://www.nature.com/articles/nature14422">Robots that can adapt like animals</a> <br>Antoine Cully, Jeff Clune, Danesh Tarapore, Jean-Baptiste Mouret  <strong></strong></p><p><a href="https://arxiv.org/abs/1504.04909">Illuminating search spaces by mapping elites</a> <br>Jean-Baptiste Mouret, Jeff Clune  <br><strong><br></strong><a href="https://arxiv.org/abs/2003.08536">Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions <br></a>Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a> <br>Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley  </p><p><a href="https://www.nature.com/articles/s41586-020-03157-9">First return, then explore</a> <br>Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune <strong><br></strong><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/7690a12f/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>Natasha Jaques 2</title>
      <itunes:episode>40</itunes:episode>
      <podcast:episode>40</podcast:episode>
      <itunes:title>Natasha Jaques 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">98a5077c-e977-426a-a508-3a99115d8ec8</guid>
      <link>https://share.transistor.fm/s/a18817da</link>
      <description>
        <![CDATA[<p>Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!  </p><p>Dr Natasha Jaques is a Senior Research Scientist at Google Brain. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1907.00456">Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog <br></a>Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard  <br><strong><br></strong><a href="https://arxiv.org/abs/1611.02796">Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control <br></a>Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck  <br><strong><br></strong><a href="https://arxiv.org/abs/2102.12">PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning <br></a>Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar  <strong></strong></p><p><a href="https://arxiv.org/abs/2208.04919">Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience <br></a>Marwa Abdulhai, Natasha Jaques, Sergey Levine  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/1909.08593">Fine-Tuning Language Models from Human Preferences</a>, Daniel M. Ziegler et al 2019  </li><li><a href="https://arxiv.org/abs/2009.01325">Learning to summarize from human feedback</a>, Nisan Stiennon et al 2020  </li><li><a href="https://arxiv.org/abs/2203.02155">Training language models to follow instructions with human feedback</a>, Long Ouyang et al 2022  </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!  </p><p>Dr Natasha Jaques is a Senior Research Scientist at Google Brain. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1907.00456">Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog <br></a>Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard  <br><strong><br></strong><a href="https://arxiv.org/abs/1611.02796">Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control <br></a>Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck  <br><strong><br></strong><a href="https://arxiv.org/abs/2102.12">PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning <br></a>Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar  <strong></strong></p><p><a href="https://arxiv.org/abs/2208.04919">Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience <br></a>Marwa Abdulhai, Natasha Jaques, Sergey Levine  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/1909.08593">Fine-Tuning Language Models from Human Preferences</a>, Daniel M. Ziegler et al 2019  </li><li><a href="https://arxiv.org/abs/2009.01325">Learning to summarize from human feedback</a>, Nisan Stiennon et al 2020  </li><li><a href="https://arxiv.org/abs/2203.02155">Training language models to follow instructions with human feedback</a>, Long Ouyang et al 2022  </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 13 Mar 2023 23:34:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/a18817da/6d4200e9.mp3" length="33169094" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/HxKPDB95beUYXQ2lORiH85GoIEDbiZOsrG2YgokCzmo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEyNDMxNjkv/MTY3ODY4MTk4NC1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>2762</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!  </p><p>Dr Natasha Jaques is a Senior Research Scientist at Google Brain. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1907.00456">Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog <br></a>Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard  <br><strong><br></strong><a href="https://arxiv.org/abs/1611.02796">Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control <br></a>Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck  <br><strong><br></strong><a href="https://arxiv.org/abs/2102.12">PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning <br></a>Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar  <strong></strong></p><p><a href="https://arxiv.org/abs/2208.04919">Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience <br></a>Marwa Abdulhai, Natasha Jaques, Sergey Levine  </p><p><br><strong>Additional References  </strong></p><ul><li><a href="https://arxiv.org/abs/1909.08593">Fine-Tuning Language Models from Human Preferences</a>, Daniel M. Ziegler et al 2019  </li><li><a href="https://arxiv.org/abs/2009.01325">Learning to summarize from human feedback</a>, Nisan Stiennon et al 2020  </li><li><a href="https://arxiv.org/abs/2203.02155">Training language models to follow instructions with human feedback</a>, Long Ouyang et al 2022  </li></ul><p><br></p>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/a18817da/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>Jacob Beck and Risto Vuorio</title>
      <itunes:episode>39</itunes:episode>
      <podcast:episode>39</podcast:episode>
      <itunes:title>Jacob Beck and Risto Vuorio</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">a1467229-3ab7-449f-804d-5a263ea4e033</guid>
      <link>https://share.transistor.fm/s/764dcaaa</link>
      <description>
        <![CDATA[<p>Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning.  Jacob and Risto are PhD students in the Whiteson Research Lab at the University of Oxford.    </p><p><br><strong>Featured Reference   </strong></p><p><br><a href="https://arxiv.org/abs/2301.08028"><strong>A Survey of Meta-Reinforcement Learning<br></strong></a>Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson   </p><p><br><strong>Additional References  <br></strong><br></p><ul><li><a href="https://arxiv.org/abs/1910.08348">VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning</a>, Luisa Zintgraf et al  </li><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> (DreamerV3), Hafner et al    </li><li><a href="https://arxiv.org/abs/1806.04640">Unsupervised Meta-Learning for Reinforcement Learning</a> (MAML), Gupta et al  </li><li><a href="https://arxiv.org/abs/2008.02790">Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices</a> (DREAM), Liu et al  </li><li><a href="https://arxiv.org/abs/1611.02779">RL2: Fast Reinforcement Learning via Slow Reinforcement Learning</a>, Duan et al  </li><li><a href="https://arxiv.org/abs/1611.05763">Learning to reinforcement learn</a>, Wang et al  </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning.  Jacob and Risto are PhD students in the Whiteson Research Lab at the University of Oxford.    </p><p><br><strong>Featured Reference   </strong></p><p><br><a href="https://arxiv.org/abs/2301.08028"><strong>A Survey of Meta-Reinforcement Learning<br></strong></a>Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson   </p><p><br><strong>Additional References  <br></strong><br></p><ul><li><a href="https://arxiv.org/abs/1910.08348">VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning</a>, Luisa Zintgraf et al  </li><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> (DreamerV3), Hafner et al    </li><li><a href="https://arxiv.org/abs/1806.04640">Unsupervised Meta-Learning for Reinforcement Learning</a> (MAML), Gupta et al  </li><li><a href="https://arxiv.org/abs/2008.02790">Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices</a> (DREAM), Liu et al  </li><li><a href="https://arxiv.org/abs/1611.02779">RL2: Fast Reinforcement Learning via Slow Reinforcement Learning</a>, Duan et al  </li><li><a href="https://arxiv.org/abs/1611.05763">Learning to reinforcement learn</a>, Wang et al  </li></ul>]]>
      </content:encoded>
      <pubDate>Tue, 07 Mar 2023 08:19:22 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/764dcaaa/00fe7176.mp3" length="48370569" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/ZXHqYUi1_4nqL4m2yudIAA-2qymWtP4jVkpLJOtJepQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEyMzY1MDEv/MTY3ODIxODMzMS1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>4025</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning.  Jacob and Risto are PhD students in the Whiteson Research Lab at the University of Oxford.    </p><p><br><strong>Featured Reference   </strong></p><p><br><a href="https://arxiv.org/abs/2301.08028"><strong>A Survey of Meta-Reinforcement Learning<br></strong></a>Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson   </p><p><br><strong>Additional References  <br></strong><br></p><ul><li><a href="https://arxiv.org/abs/1910.08348">VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning</a>, Luisa Zintgraf et al  </li><li><a href="https://arxiv.org/abs/2301.04104v1">Mastering Diverse Domains through World Models</a> (DreamerV3), Hafner et al    </li><li><a href="https://arxiv.org/abs/1806.04640">Unsupervised Meta-Learning for Reinforcement Learning</a> (MAML), Gupta et al  </li><li><a href="https://arxiv.org/abs/2008.02790">Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices</a> (DREAM), Liu et al  </li><li><a href="https://arxiv.org/abs/1611.02779">RL2: Fast Reinforcement Learning via Slow Reinforcement Learning</a>, Duan et al  </li><li><a href="https://arxiv.org/abs/1611.05763">Learning to reinforcement learn</a>, Wang et al  </li></ul>]]>
      </itunes:summary>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/764dcaaa/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>John Schulman</title>
      <itunes:episode>38</itunes:episode>
      <podcast:episode>38</podcast:episode>
      <itunes:title>John Schulman</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">8084b1ea-bc8a-4f8c-b81e-bdee03ded672</guid>
      <link>https://share.transistor.fm/s/2bfa4dc4</link>
      <description>
        <![CDATA[<p><a href="http://joschu.net/">John Schulman</a> is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.</p><p><br><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2112.09332">WebGPT: Browser-assisted question-answering with human feedback</a><br>Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman</p><p><a href="https://arxiv.org/abs/2203.02155">Training language models to follow instructions with human feedback<br></a>Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe</p><p><strong>Additional References</strong></p><ul><li><a href="https://openai.com/blog/our-approach-to-alignment-research/">Our approach to alignment research</a>, OpenAI 2022</li><li><a href="https://arxiv.org/abs/2110.14168">Training Verifiers to Solve Math Word Problems</a>, Cobbe et al 2021</li><li><a href="https://www.youtube.com/watch?v=8EcdaCk9KaQ">UC Berkeley Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation</a>, John Schulman 2017</li><li><a href="https://arxiv.org/abs/1707.06347">Proximal Policy Optimization Algorithms</a>, Schulman 2017</li><li><a href="https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-217.html">Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs</a>, Schulman 2016</li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://joschu.net/">John Schulman</a> is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.</p><p><br><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2112.09332">WebGPT: Browser-assisted question-answering with human feedback</a><br>Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman</p><p><a href="https://arxiv.org/abs/2203.02155">Training language models to follow instructions with human feedback<br></a>Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe</p><p><strong>Additional References</strong></p><ul><li><a href="https://openai.com/blog/our-approach-to-alignment-research/">Our approach to alignment research</a>, OpenAI 2022</li><li><a href="https://arxiv.org/abs/2110.14168">Training Verifiers to Solve Math Word Problems</a>, Cobbe et al 2021</li><li><a href="https://www.youtube.com/watch?v=8EcdaCk9KaQ">UC Berkeley Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation</a>, John Schulman 2017</li><li><a href="https://arxiv.org/abs/1707.06347">Proximal Policy Optimization Algorithms</a>, Schulman 2017</li><li><a href="https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-217.html">Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs</a>, Schulman 2016</li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 18 Oct 2022 01:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/2bfa4dc4/b29b0c00.mp3" length="32004375" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/2zpjk6F5CKd3wY3XtNHEXRXvM8MMzqqvYQr7-tl1rzY/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEwNTc0MjYv/MTY2NTk3OTY5Ny1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>2661</itunes:duration>
      <itunes:summary>John Schulman, OpenAI cofounder and researcher, inventor of PPO/TRPO talks RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!</itunes:summary>
      <itunes:subtitle>John Schulman, OpenAI cofounder and researcher, inventor of PPO/TRPO talks RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/2bfa4dc4/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>Sven Mika</title>
      <itunes:episode>37</itunes:episode>
      <podcast:episode>37</podcast:episode>
      <itunes:title>Sven Mika</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">516fe124-63cb-4547-b918-731b708ccd16</guid>
      <link>https://share.transistor.fm/s/367e37c3</link>
      <description>
        <![CDATA[<p>Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University. </p><p><br><strong>Featured References</strong></p><p><a href="https://docs.ray.io/en/latest/rllib/index.html">RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning<br></a><br><a href="https://docs.ray.io/en/latest/">Ray: Documentation</a></p><p><a href="https://arxiv.org/abs/1712.09381">RLlib: Abstractions for Distributed Reinforcement Learning</a><br>Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica</p><p><br><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org/">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University. </p><p><br><strong>Featured References</strong></p><p><a href="https://docs.ray.io/en/latest/rllib/index.html">RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning<br></a><br><a href="https://docs.ray.io/en/latest/">Ray: Documentation</a></p><p><a href="https://arxiv.org/abs/1712.09381">RLlib: Abstractions for Distributed Reinforcement Learning</a><br>Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica</p><p><br><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org/">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </content:encoded>
      <pubDate>Thu, 18 Aug 2022 22:11:33 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/367e37c3/0b59214b.mp3" length="29432860" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CP2JpZVBKxEa6PMeYc0_dt7NwsdZySjrOkgyO399rzo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzk5MzU0OC8x/NjYwODYxODgzLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2096</itunes:duration>
      <itunes:summary>Sven Mika of Anyscale on RLlib present and future, Ray and Ray Summit 2022, applied RL in Games / Finance / RecSys, and more!</itunes:summary>
      <itunes:subtitle>Sven Mika of Anyscale on RLlib present and future, Ray and Ray Summit 2022, applied RL in Games / Finance / RecSys, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/367e37c3/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/367e37c3/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/367e37c3/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/367e37c3/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/367e37c3/transcription" type="text/html"/>
    </item>
    <item>
      <title>Karol Hausman and Fei Xia</title>
      <itunes:episode>36</itunes:episode>
      <podcast:episode>36</podcast:episode>
      <itunes:title>Karol Hausman and Fei Xia</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">07343fc5-6d05-4100-9b58-1452141064f4</guid>
      <link>https://share.transistor.fm/s/a911824e</link>
      <description>
        <![CDATA[<p>Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford, working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.</p><p>Fei Xia is a Research Scientist at Google Research. He is mostly interested in robot learning in complex and unstructured environments. He has previously approached this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring the use of foundation models for these challenges.</p><p><strong>Featured References<br></strong><br><a href="https://arxiv.org/abs/2204.01691">Do As I Can, Not As I Say: Grounding Language in Robotic Affordances</a> [ <a href="https://say-can.github.io/">website</a> ] <br> Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan</p><p><a href="https://arxiv.org/abs/2207.05608">Inner Monologue: Embodied Reasoning through Planning with Language Models</a><br>Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter</p><p><strong>Additional References</strong></p><ul><li><a href="https://searchworks.stanford.edu/view/13972041">Large-scale simulation for embodied perception and robot learning</a>, Xia 2021</li><li><a href="https://arxiv.org/abs/1806.10293">QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation</a>, Kalashnikov et al 2018</li><li><a href="https://arxiv.org/abs/2104.08212">MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale</a>, Kalashnikov et al 2021</li><li><a href="https://arxiv.org/abs/2008.07792">ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation</a>, Xia et al 2020</li><li><a href="https://arxiv.org/abs/2104.07749">Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills</a>, Chebotar et al 2021</li><li><a href="https://arxiv.org/abs/2204.00598">Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language</a>, Zeng et al 2022</li></ul><p><br><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org/">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford, working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.</p><p>Fei Xia is a Research Scientist at Google Research. He is mostly interested in robot learning in complex and unstructured environments. He has previously approached this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring the use of foundation models for these challenges.</p><p><strong>Featured References<br></strong><br><a href="https://arxiv.org/abs/2204.01691">Do As I Can, Not As I Say: Grounding Language in Robotic Affordances</a> [ <a href="https://say-can.github.io/">website</a> ] <br> Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan</p><p><a href="https://arxiv.org/abs/2207.05608">Inner Monologue: Embodied Reasoning through Planning with Language Models</a><br>Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter</p><p><strong>Additional References</strong></p><ul><li><a href="https://searchworks.stanford.edu/view/13972041">Large-scale simulation for embodied perception and robot learning</a>, Xia 2021</li><li><a href="https://arxiv.org/abs/1806.10293">QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation</a>, Kalashnikov et al 2018</li><li><a href="https://arxiv.org/abs/2104.08212">MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale</a>, Kalashnikov et al 2021</li><li><a href="https://arxiv.org/abs/2008.07792">ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation</a>, Xia et al 2020</li><li><a href="https://arxiv.org/abs/2104.07749">Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills</a>, Chebotar et al 2021</li><li><a href="https://arxiv.org/abs/2204.00598">Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language</a>, Zeng et al 2022</li></ul><p><br><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org/">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </content:encoded>
      <pubDate>Tue, 16 Aug 2022 12:05:30 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/a911824e/a83f417f.mp3" length="53113518" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/cTV3neF5HHZijEhqtxjw_bdldBzYqAyuLMsr6iPM19U/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzk5MDgxNy8x/NjYwNjY1MDUxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3789</itunes:duration>
      <itunes:summary>Karol Hausman and Fei Xia of Google Research on newly updated (PaLM-)SayCan, Inner Monologue, robot learning, combining robotics with language models, and more!</itunes:summary>
      <itunes:subtitle>Karol Hausman and Fei Xia of Google Research on newly updated (PaLM-)SayCan, Inner Monologue, robot learning, combining robotics with language models, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/a911824e/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/a911824e/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/a911824e/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/a911824e/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/a911824e/transcription" type="text/html"/>
    </item>
    <item>
      <title>Sai Krishna Gottipati</title>
      <itunes:episode>35</itunes:episode>
      <podcast:episode>35</podcast:episode>
      <itunes:title>Sai Krishna Gottipati</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d15f27de-681e-4030-81b7-737d57b85228</guid>
      <link>https://share.transistor.fm/s/b803a301</link>
      <description>
        <![CDATA[<p>Sai Krishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, and human-in-the-loop learning.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2106.11345">Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment &amp; Operations<strong><br></strong></a>AI Redefined, Sai Krishna Gottipati, Sagar Kurandwad, Clodéric Mars, Gregory Szriftgiser, François Chabot</p><p><strong>Do As You Teach: A Multi-Teacher Approach to Self-Play in Deep Reinforcement Learning<br></strong>Currently under review</p><p><a href="http://proceedings.mlr.press/v119/gottipati20a/gottipati20a.pdf">Learning to navigate the synthetically accessible chemical space using reinforcement learning<br></a>Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Yashaswi Pathak, Haoran Wei, Shengchao Liu, Karam J. Thomas, Simon Blackburn, Connor W. Coley, Jian Tang, Sarath Chandar, Yoshua Bengio</p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2101.04882">Asymmetric self-play for automatic goal discovery in robotic manipulation</a>, OpenAI et al 2021</li><li><a href="https://arxiv.org/abs/2103.03216">Continuous Coordination As a Realistic Scenario for Lifelong Learning</a>, Nekoei et al 2021</li></ul><p><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Sai Krishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, and human-in-the-loop learning.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2106.11345">Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment &amp; Operations<strong><br></strong></a>AI Redefined, Sai Krishna Gottipati, Sagar Kurandwad, Clodéric Mars, Gregory Szriftgiser, François Chabot</p><p><strong>Do As You Teach: A Multi-Teacher Approach to Self-Play in Deep Reinforcement Learning<br></strong>Currently under review</p><p><a href="http://proceedings.mlr.press/v119/gottipati20a/gottipati20a.pdf">Learning to navigate the synthetically accessible chemical space using reinforcement learning<br></a>Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Yashaswi Pathak, Haoran Wei, Shengchao Liu, Karam J. Thomas, Simon Blackburn, Connor W. Coley, Jian Tang, Sarath Chandar, Yoshua Bengio</p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2101.04882">Asymmetric self-play for automatic goal discovery in robotic manipulation</a>, OpenAI et al 2021</li><li><a href="https://arxiv.org/abs/2103.03216">Continuous Coordination As a Realistic Scenario for Lifelong Learning</a>, Nekoei et al 2021</li></ul><p><strong>Episode sponsor: </strong><a href="https://anyscale.com/"><strong>Anyscale</strong></a><strong><br></strong><br><a href="http://raysummit.org">Ray Summit 2022</a> is coming to San Francisco on August 23-24.<br>Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.</p><p>Register at <a href="http://raysummit.org/">raysummit.org</a> and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.</p>]]>
      </content:encoded>
      <pubDate>Sun, 31 Jul 2022 19:41:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/b803a301/db38180a.mp3" length="49122326" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/_bIAc3G1dCWZbxaij83djobwCfW5hNtonuo9qmizGww/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzk2OTQxNC8x/NjU5MzIxNjc5LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4091</itunes:duration>
      <itunes:summary>Sai Krishna Gottipati of AI Redefined on RL for synthesizable drug discovery, Multi-Teacher Self-Play, Cogment framework for realtime multi-actor RL, AI + Chess, and more!</itunes:summary>
      <itunes:subtitle>Sai Krishna Gottipati of AI Redefined on RL for synthesizable drug discovery, Multi-Teacher Self-Play, Cogment framework for realtime multi-actor RL, AI + Chess, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/b803a301/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b803a301/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b803a301/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b803a301/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/b803a301/transcription" type="text/html"/>
    </item>
    <item>
      <title>Aravind Srinivas 2</title>
      <itunes:episode>34</itunes:episode>
      <podcast:episode>34</podcast:episode>
      <itunes:title>Aravind Srinivas 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d8efecad-d868-497c-bfd2-a14a02f5bcdd</guid>
      <link>https://share.transistor.fm/s/cb13a30d</link>
      <description>
        <![CDATA[<p>Aravind Srinivas is back! He is now a Research Scientist at OpenAI.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a><br>Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch</p><p><a href="https://arxiv.org/abs/2104.10157">VideoGPT: Video Generation using VQ-VAE and Transformers</a><br>Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Aravind Srinivas is back! He is now a Research Scientist at OpenAI.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a><br>Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch</p><p><a href="https://arxiv.org/abs/2104.10157">VideoGPT: Video Generation using VQ-VAE and Transformers</a><br>Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas</p>]]>
      </content:encoded>
      <pubDate>Sun, 08 May 2022 21:41:04 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/cb13a30d/98e58583.mp3" length="42244463" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CKVpGkAHOncL8Geldtww-TATXjcsoCjIBtXvaUFqBds/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzg3OTg5Mi8x/NjUyMTEwNDU0LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3513</itunes:duration>
      <itunes:summary>Aravind Srinivas, Research Scientist at OpenAI, returns to talk Decision Transformer, VideoGPT, choosing problems, and explore vs exploit in research careers</itunes:summary>
      <itunes:subtitle>Aravind Srinivas, Research Scientist at OpenAI, returns to talk Decision Transformer, VideoGPT, choosing problems, and explore vs exploit in research careers</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Rohin Shah</title>
      <itunes:episode>33</itunes:episode>
      <podcast:episode>33</podcast:episode>
      <itunes:title>Rohin Shah</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">4a9ae02a-1ee4-45ec-9ebe-b50011c2d3a8</guid>
      <link>https://share.transistor.fm/s/5ba5d6af</link>
      <description>
        <![CDATA[<p>Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2107.01969">The MineRL BASALT Competition on Learning from Human Feedback</a><br>Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan</p><p><a href="https://arxiv.org/abs/1902.04198">Preferences Implicit in the State of the World</a><br>Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan</p><p><a href="https://openreview.net/forum?id=DFIoGDZejIB">Benefits of Assistance over Reward Learning</a> <br>Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell</p><p><a href="https://arxiv.org/abs/1910.05789">On the Utility of Learning about Humans for Human-AI Coordination<br></a>Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan</p><p><a href="https://arxiv.org/abs/2101.05507">Evaluating the Robustness of Collaborative Agents<br></a>Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://www.eacambridge.org/technical-alignment-curriculum">AGI Safety Fundamentals</a>, EA Cambridge</li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2107.01969">The MineRL BASALT Competition on Learning from Human Feedback</a><br>Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan</p><p><a href="https://arxiv.org/abs/1902.04198">Preferences Implicit in the State of the World</a><br>Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan</p><p><a href="https://openreview.net/forum?id=DFIoGDZejIB">Benefits of Assistance over Reward Learning</a> <br>Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell</p><p><a href="https://arxiv.org/abs/1910.05789">On the Utility of Learning about Humans for Human-AI Coordination<br></a>Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan</p><p><a href="https://arxiv.org/abs/2101.05507">Evaluating the Robustness of Collaborative Agents<br></a>Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah</p><p><br><strong>Additional References</strong></p><ul><li><a href="https://www.eacambridge.org/technical-alignment-curriculum">AGI Safety Fundamentals</a>, EA Cambridge</li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 11 Apr 2022 19:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/5ba5d6af/41bb0ae3.mp3" length="81641978" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CXFzvE3_-lBMeUAb_mgJN28gy8lgczaz-Qh_-4y7L9M/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzg0MjgzNS8x/NjQ4MzU3Njg1LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>5824</itunes:duration>
      <itunes:summary>DeepMind Research Scientist Dr. Rohin Shah on Value Alignment, Learning from Human feedback, Assistance paradigm, the BASALT MineRL competition, his Alignment Newsletter, and more!</itunes:summary>
      <itunes:subtitle>DeepMind Research Scientist Dr. Rohin Shah on Value Alignment, Learning from Human feedback, Assistance paradigm, the BASALT MineRL competition, his Alignment Newsletter, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/5ba5d6af/transcript.vtt" type="text/vtt" rel="captions"/>
    </item>
    <item>
      <title>Robert Lange</title>
      <itunes:episode>31</itunes:episode>
      <podcast:episode>31</podcast:episode>
      <itunes:title>Robert Lange</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">1411e95d-30fa-41ba-928c-d87162090ac9</guid>
      <link>https://share.transistor.fm/s/935a12e6</link>
      <description>
        <![CDATA[<p><a href="https://roberttlange.github.io/">Robert Tjarko Lange</a> is a PhD student working at the Technical University Berlin.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2010.04466">Learning not to learn: Nature versus nurture in silico</a><br>Lange, R. T., &amp; Sprekeler, H. (2020)</p><p><a href="https://arxiv.org/abs/2105.01648">On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning<br></a>Vischer, M. A., Lange, R. T., &amp; Sprekeler, H. (2021). </p><p><a href="https://arxiv.org/abs/1907.12477">Semantic RL with Action Grammars: Data-Efficient Learning of Hierarchical Task Abstractions<br></a>Lange, R. T., &amp; Faisal, A. (2019).</p><p><a href="https://github.com/mle-infrastructure">MLE-Infrastructure on Github</a><br><strong><br></strong><br><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/1611.02779">RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning</a>, Duan et al 2016</li><li><a href="https://arxiv.org/abs/1611.05763">Learning to reinforcement learn</a>, Wang et al 2016</li><li><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a>, Chen et al 2021</li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://roberttlange.github.io/">Robert Tjarko Lange</a> is a PhD student working at the Technical University Berlin.</p><p><strong>Featured References</strong></p><p><a href="https://arxiv.org/abs/2010.04466">Learning not to learn: Nature versus nurture in silico</a><br>Lange, R. T., &amp; Sprekeler, H. (2020)</p><p><a href="https://arxiv.org/abs/2105.01648">On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning<br></a>Vischer, M. A., Lange, R. T., &amp; Sprekeler, H. (2021). </p><p><a href="https://arxiv.org/abs/1907.12477">Semantic RL with Action Grammars: Data-Efficient Learning of Hierarchical Task Abstractions<br></a>Lange, R. T., &amp; Faisal, A. (2019).</p><p><a href="https://github.com/mle-infrastructure">MLE-Infrastructure on Github</a><br><strong><br></strong><br><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/1611.02779">RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning</a>, Duan et al 2016</li><li><a href="https://arxiv.org/abs/1611.05763">Learning to reinforcement learn</a>, Wang et al 2016</li><li><a href="https://arxiv.org/abs/2106.01345">Decision Transformer: Reinforcement Learning via Sequence Modeling</a>, Chen et al 2021</li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 20 Dec 2021 01:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/935a12e6/8a440dc1.mp3" length="51111130" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/o6vEYiEWJJtdrLvYg6u6XQG76SQ_DrLkGfNP6VceQa8/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzc1NDIwMS8x/NjM5NzIyNzc5LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4257</itunes:duration>
      <itunes:summary>Robert Lange on learning vs hard-coding, meta-RL, Lottery Tickets and Minimal Task Representations, Action Grammars and more!</itunes:summary>
      <itunes:subtitle>Robert Lange on learning vs hard-coding, meta-RL, Lottery Tickets and Minimal Task Representations, Action Grammars and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>NeurIPS 2021 Political Economy of Reinforcement Learning Systems (PERLS) Workshop</title>
      <itunes:episode>30</itunes:episode>
      <podcast:episode>30</podcast:episode>
      <itunes:title>NeurIPS 2021 Political Economy of Reinforcement Learning Systems (PERLS) Workshop</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">54a58df0-1bbb-4841-9b1d-f335eb565ee2</guid>
      <link>https://share.transistor.fm/s/3d58a0b7</link>
      <description>
        <![CDATA[<p>We hear about the idea behind PERLS and why it's an important topic to discuss.</p><ul><li><a href="https://perls-workshop.github.io/">Political Economy of Reinforcement Learning (PERLS) Workshop at NeurIPS 2021</a> on Tues Dec 14th</li><li><a href="https://neurips.cc/">NeurIPS 2021</a></li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>We hear about the idea behind PERLS and why it's an important topic to discuss.</p><ul><li><a href="https://perls-workshop.github.io/">Political Economy of Reinforcement Learning (PERLS) Workshop at NeurIPS 2021</a> on Tues Dec 14th</li><li><a href="https://neurips.cc/">NeurIPS 2021</a></li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Thu, 18 Nov 2021 15:53:22 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/3d58a0b7/dc66961c.mp3" length="17447343" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/keXi0yQqgzprTPNOhXUQCh1276PQ0F4XxqUEfJg0Cjs/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzcyNzU4MS8x/NjM3NjA2MTMyLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>1447</itunes:duration>
      <itunes:summary>Dr. Thomas Gilbert and Dr. Mark Nitzberg on the upcoming PERLS Workshop @ NeurIPS 2021</itunes:summary>
      <itunes:subtitle>Dr. Thomas Gilbert and Dr. Mark Nitzberg on the upcoming PERLS Workshop @ NeurIPS 2021</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Amy Zhang</title>
      <itunes:episode>29</itunes:episode>
      <podcast:episode>29</podcast:episode>
      <itunes:title>Amy Zhang</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">1017dad7-fe28-4d9f-a9e0-78aaf1335e13</guid>
      <link>https://share.transistor.fm/s/069ca161</link>
      <description>
        <![CDATA[<p><a href="https://amyzhang.github.io/">Amy Zhang</a> is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2003.06016">Invariant Causal Prediction for Block MDPs</a> <br>Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup </p><p><a href="http://proceedings.mlr.press/v139/sodhani21a/sodhani21a.pdf">Multi-Task Reinforcement Learning with Context-based Representations</a> <br>Shagun Sodhani, Amy Zhang, Joelle Pineau </p><p><a href="https://arxiv.org/abs/2104.10159">MBRL-Lib: A Modular Library for Model-based Reinforcement Learning</a> <br>Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra <strong></strong></p><p><br>Additional References </p><ul><li><a href="https://www.youtube.com/watch?v=akeUVn6WQoU%20%20">Amy Zhang - Exploring Context for Better Generalization in Reinforcement Learning @ UCL DARK</a> </li><li><a href="https://icml.cc/virtual/2020/poster/6475">ICML 2020 Poster session: Invariant Causal Prediction for Block MDPs</a> </li><li><a href="https://www.youtube.com/watch?v=FvQbrE3tyoE">Clare Lyle - Invariant Prediction for Generalization in Reinforcement Learning @ Simons Institute</a> </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://amyzhang.github.io/">Amy Zhang</a> is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2003.06016">Invariant Causal Prediction for Block MDPs</a> <br>Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup </p><p><a href="http://proceedings.mlr.press/v139/sodhani21a/sodhani21a.pdf">Multi-Task Reinforcement Learning with Context-based Representations</a> <br>Shagun Sodhani, Amy Zhang, Joelle Pineau </p><p><a href="https://arxiv.org/abs/2104.10159">MBRL-Lib: A Modular Library for Model-based Reinforcement Learning</a> <br>Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra <strong></strong></p><p><br>Additional References </p><ul><li><a href="https://www.youtube.com/watch?v=akeUVn6WQoU%20%20">Amy Zhang - Exploring Context for Better Generalization in Reinforcement Learning @ UCL DARK</a> </li><li><a href="https://icml.cc/virtual/2020/poster/6475">ICML 2020 Poster session: Invariant Causal Prediction for Block MDPs</a> </li><li><a href="https://www.youtube.com/watch?v=FvQbrE3tyoE">Clare Lyle - Invariant Prediction for Generalization in Reinforcement Learning @ Simons Institute</a> </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 27 Sep 2021 10:27:12 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/069ca161/19719c6c.mp3" length="58534392" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/j6dEpjvJFO-xRRsG5TD8Zb6ptHnz46Ir0g-ho6gurEQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzY0NjkxNy8x/NjMyNzY0MTg2LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4175</itunes:duration>
      <itunes:summary>Amy Zhang shares her work on Invariant Causal Prediction for Block MDPs, Multi-Task Reinforcement Learning with Context-based Representations, and MBRL-Lib, offers insights on generalization in RL, and more!</itunes:summary>
      <itunes:subtitle>Amy Zhang shares her work on Invariant Causal Prediction for Block MDPs, Multi-Task Reinforcement Learning with Context-based Representations, and MBRL-Lib, offers insights on generalization in RL, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Xianyuan Zhan</title>
      <itunes:episode>28</itunes:episode>
      <podcast:episode>28</podcast:episode>
      <itunes:title>Xianyuan Zhan</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">e9f1b4f5-5faf-421a-92f8-c2f258b57b83</guid>
      <link>https://share.transistor.fm/s/69d3bac0</link>
      <description>
        <![CDATA[<p><a href="http://zhanxianyuan.xyz/">Xianyuan Zhan</a> is currently a research assistant professor at the Institute for AI Industry Research (AIR), Tsinghua University.  He received his Ph.D. degree at Purdue University. Before joining Tsinghua University, Dr. Zhan worked as a researcher at Microsoft Research Asia (MSRA) and a data scientist at JD Technology.  At JD Technology, he led the research that uses offline RL to optimize real-world industrial systems. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/2102.11492%20">DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning</a><br>Xianyuan Zhan, Haoran Xu, Yue Zhang, Yusen Huo, Xiangyu Zhu, Honglei Yin, Yu Zheng <br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://zhanxianyuan.xyz/">Xianyuan Zhan</a> is currently a research assistant professor at the Institute for AI Industry Research (AIR), Tsinghua University.  He received his Ph.D. degree at Purdue University. Before joining Tsinghua University, Dr. Zhan worked as a researcher at Microsoft Research Asia (MSRA) and a data scientist at JD Technology.  At JD Technology, he led the research that uses offline RL to optimize real-world industrial systems. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/2102.11492%20">DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning</a><br>Xianyuan Zhan, Haoran Xu, Yue Zhang, Yusen Huo, Xiangyu Zhu, Honglei Yin, Yu Zheng <br></p>]]>
      </content:encoded>
      <pubDate>Mon, 30 Aug 2021 13:31:25 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/69d3bac0/83392bc4.mp3" length="34947172" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/ylXDop4-C8UFvlh9b30d_ukUIhzZsv71Sh3k5XEwB3w/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzYzNjEzMi8x/NjMyNzk4NTIzLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2490</itunes:duration>
      <itunes:summary>Xianyuan Zhan on DeepThermal for controlling thermal power plants, the MORE algorithm for Model-based Offline RL, comparing AI in China and the US, and more! </itunes:summary>
      <itunes:subtitle>Xianyuan Zhan on DeepThermal for controlling thermal power plants, the MORE algorithm for Model-based Offline RL, comparing AI in China and the US, and more! </itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Eugene Vinitsky</title>
      <itunes:episode>27</itunes:episode>
      <podcast:episode>27</podcast:episode>
      <itunes:title>Eugene Vinitsky</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6b1a596a-ec14-4ab3-8f25-4b7a597c404f</guid>
      <link>https://share.transistor.fm/s/1098a9d3</link>
      <description>
        <![CDATA[<p><a href="https://eugenevinitsky.github.io/">Eugene Vinitsky</a> is a PhD student at UC Berkeley advised by Alexandre Bayen. He has interned at Tesla and Deepmind.  </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2106.09012">A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings <br></a>Eugene Vinitsky, Raphael Köster, John P. Agapiou, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, Joel Z. Leibo <strong></strong></p><p><a href="https://arxiv.org/abs/2011.00120">Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL <br></a>Eugene Vinitsky, Nathan Lichtle, Kanaad Parvate, Alexandre Bayen </p><p><a href="https://ieeexplore.ieee.org/abstract/document/8569615">Lagrangian Control through Deep-RL: Applications to Bottleneck Decongestion</a> <br>Eugene Vinitsky; Kanaad Parvate; Aboudy Kreidieh; Cathy Wu; Alexandre Bayen 2018 </p><p><a href="https://arxiv.org/abs/2103.01955">The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games <br></a>Chao Yu, Akash Velu, Eugene Vinitsky, Yu Wang, Alexandre Bayen, Yi Wu </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.eclipse.org/sumo/">SUMO: Simulation of Urban MObility</a> </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://eugenevinitsky.github.io/">Eugene Vinitsky</a> is a PhD student at UC Berkeley advised by Alexandre Bayen. He has interned at Tesla and Deepmind.  </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2106.09012">A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings <br></a>Eugene Vinitsky, Raphael Köster, John P. Agapiou, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, Joel Z. Leibo <strong></strong></p><p><a href="https://arxiv.org/abs/2011.00120">Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL <br></a>Eugene Vinitsky, Nathan Lichtle, Kanaad Parvate, Alexandre Bayen </p><p><a href="https://ieeexplore.ieee.org/abstract/document/8569615">Lagrangian Control through Deep-RL: Applications to Bottleneck Decongestion</a> <br>Eugene Vinitsky; Kanaad Parvate; Aboudy Kreidieh; Cathy Wu; Alexandre Bayen 2018 </p><p><a href="https://arxiv.org/abs/2103.01955">The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games <br></a>Chao Yu, Akash Velu, Eugene Vinitsky, Yu Wang, Alexandre Bayen, Yi Wu </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.eclipse.org/sumo/">SUMO: Simulation of Urban MObility</a> </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Wed, 18 Aug 2021 08:22:13 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/1098a9d3/e5e88137.mp3" length="55554362" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>3962</itunes:duration>
      <itunes:summary>Eugene Vinitsky of UC Berkeley on social norms and sanctions, traffic simulation, mixed-autonomy traffic, and more!</itunes:summary>
      <itunes:subtitle>Eugene Vinitsky of UC Berkeley on social norms and sanctions, traffic simulation, mixed-autonomy traffic, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Jess Whittlestone</title>
      <itunes:episode>26</itunes:episode>
      <podcast:episode>26</podcast:episode>
      <itunes:title>Jess Whittlestone</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">c0d33c35-c239-43ce-bdad-0be057a6bc2f</guid>
      <link>https://share.transistor.fm/s/6721bf11</link>
      <description>
        <![CDATA[<p><a href="https://jesswhittlestone.com/">Dr. Jess Whittlestone</a> is a Senior Research Fellow at the Centre for the Study of Existential Risk and the Leverhulme Centre for the Future of Intelligence, both at the University of Cambridge. </p><p><br><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/12360/26667">The Societal Implications of Deep Reinforcement Learning</a> <br>Jess Whittlestone, Kai Arulkumaran, Matthew Crosby </p><p><a href="https://www.ijimai.org/journal/bibcite/reference/2905">Artificial Canaries: Early Warning Signs for Anticipatory and Democratic Governance of AI</a> <br>Carla Zoe Cremer, Jess Whittlestone <br><strong><br></strong><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=PO8-fegV4X0">CogX: Cutting Edge: Understanding AI systems for a better AI policy</a>, featuring Jack Clark and Jess Whittlestone </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://jesswhittlestone.com/">Dr. Jess Whittlestone</a> is a Senior Research Fellow at the Centre for the Study of Existential Risk and the Leverhulme Centre for the Future of Intelligence, both at the University of Cambridge. </p><p><br><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/12360/26667">The Societal Implications of Deep Reinforcement Learning</a> <br>Jess Whittlestone, Kai Arulkumaran, Matthew Crosby </p><p><a href="https://www.ijimai.org/journal/bibcite/reference/2905">Artificial Canaries: Early Warning Signs for Anticipatory and Democratic Governance of AI</a> <br>Carla Zoe Cremer, Jess Whittlestone <br><strong><br></strong><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=PO8-fegV4X0">CogX: Cutting Edge: Understanding AI systems for a better AI policy</a>, featuring Jack Clark and Jess Whittlestone </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 20 Jul 2021 10:59:35 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/6721bf11/eec9b5fa.mp3" length="77026215" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>5496</itunes:duration>
      <itunes:summary>Jess Whittlestone on societal implications of deep reinforcement Learning, AI policy, warning signs of transformative progress in AI, and more!</itunes:summary>
      <itunes:subtitle>Jess Whittlestone on societal implications of deep reinforcement Learning, AI policy, warning signs of transformative progress in AI, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Aleksandra Faust</title>
      <itunes:episode>25</itunes:episode>
      <podcast:episode>25</podcast:episode>
      <itunes:title>Aleksandra Faust</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">a2e4bf42-e057-4b86-8db8-277a65535a8f</guid>
      <link>https://share.transistor.fm/s/65d745d4</link>
      <description>
        <![CDATA[<p>Dr. Aleksandra Faust is a Staff Research Scientist and Reinforcement Learning research team co-founder at Google Brain Research.</p><p><strong>Featured References <br></strong><br><a href="https://www.cs.unm.edu/amprg/People/afaust/afaustThesis.pdf">Reinforcement Learning and Planning for Preference Balancing Tasks</a> <br>Faust 2014</p><p><a href="https://arxiv.org/abs/1809.10124">Learning Navigation Behaviors End-to-End with AutoRL</a> <br>Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis</p><p><a href="https://arxiv.org/abs/1905.07628">Evolving Rewards to Automate Reinforcement Learning</a> <br>Aleksandra Faust, Anthony Francis, Dar Mehta</p><p><a href="https://openreview.net/forum?id=0XXpJ4OtjW">Evolving Reinforcement Learning Algorithms</a> <br>John D Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Quoc V Le, Sergey Levine, Honglak Lee, Aleksandra Faust</p><p><br><a href="https://arxiv.org/abs/2103.01991">Adversarial Environment Generation for Learning to Navigate the Web</a> <br>Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust</p><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2003.03384">AutoML-Zero: Evolving Machine Learning Algorithms From Scratch</a>, Esteban Real, Chen Liang, David R. So, Quoc V. Le</li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Dr. Aleksandra Faust is a Staff Research Scientist and Reinforcement Learning research team co-founder at Google Brain Research.</p><p><strong>Featured References <br></strong><br><a href="https://www.cs.unm.edu/amprg/People/afaust/afaustThesis.pdf">Reinforcement Learning and Planning for Preference Balancing Tasks</a> <br>Faust 2014</p><p><a href="https://arxiv.org/abs/1809.10124">Learning Navigation Behaviors End-to-End with AutoRL</a> <br>Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis</p><p><a href="https://arxiv.org/abs/1905.07628">Evolving Rewards to Automate Reinforcement Learning</a> <br>Aleksandra Faust, Anthony Francis, Dar Mehta</p><p><a href="https://openreview.net/forum?id=0XXpJ4OtjW">Evolving Reinforcement Learning Algorithms</a> <br>John D Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Quoc V Le, Sergey Levine, Honglak Lee, Aleksandra Faust</p><p><br><a href="https://arxiv.org/abs/2103.01991">Adversarial Environment Generation for Learning to Navigate the Web</a> <br>Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust</p><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/2003.03384">AutoML-Zero: Evolving Machine Learning Algorithms From Scratch</a>, Esteban Real, Chen Liang, David R. So, Quoc V. Le</li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Tue, 06 Jul 2021 03:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/65d745d4/dff855ef.mp3" length="45865626" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>3270</itunes:duration>
      <itunes:summary>Aleksandra Faust of Google Brain Research on AutoRL, meta-RL, learning to learn &amp; learning to teach, curriculum learning, collaborations between senior and junior researchers, and more!</itunes:summary>
      <itunes:subtitle>Aleksandra Faust of Google Brain Research on AutoRL, meta-RL, learning to learn &amp; learning to teach, curriculum learning, collaborations between senior and junior researchers, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Sam Ritter</title>
      <itunes:episode>24</itunes:episode>
      <podcast:episode>24</podcast:episode>
      <itunes:title>Sam Ritter</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">f0cebf08-24de-4cae-9613-c5eddba85e2e</guid>
      <link>https://share.transistor.fm/s/89ba97c3</link>
      <description>
        <![CDATA[<p><a href="https://scholar.google.com/citations?user=dg7wnfAAAAAJ&amp;hl=fr">Sam Ritter</a> is a Research Scientist on the neuroscience team at DeepMind. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1803.10760">Unsupervised Predictive Memory in a Goal-Directed Agent</a> (MERLIN) <br>Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap </p><p><a href="https://arxiv.org/abs/1805.09692%20">Meta-RL without forgetting:  Been There, Done That: Meta-Learning with Episodic Recall</a> <br>Samuel Ritter, Jane X. Wang, Zeb Kurth-Nelson, Siddhant M. Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick </p><p><a href="https://dataspace.princeton.edu/handle/88435/dsp01dz010s84f">Meta-Reinforcement Learning with Episodic Recall: An Integrative Theory of Reward-Driven Learning</a> <br>Samuel Ritter 2019 </p><p><a href="https://arxiv.org/abs/2006.03662%20">Meta-RL exploration and planning: Rapid Task-Solving in Novel Environments <br></a>Sam Ritter, Ryan Faulkner, Laurent Sartran, Adam Santoro, Matt Botvinick, David Raposo </p><p><a href="https://arxiv.org/abs/2102.12425">Synthetic Returns for Long-Term Credit Assignment</a> <br>David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song <br> </p><p><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=_qpHcmhX9HM">Sam Ritter: Meta-Learning to Make Smart Inferences from Small Data</a> , North Star AI 2019 </li><li><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a>, Rich Sutton 2019 </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://scholar.google.com/citations?user=dg7wnfAAAAAJ&amp;hl=fr">Sam Ritter</a> is a Research Scientist on the neuroscience team at DeepMind. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1803.10760">Unsupervised Predictive Memory in a Goal-Directed Agent</a> (MERLIN) <br>Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap </p><p><a href="https://arxiv.org/abs/1805.09692%20">Meta-RL without forgetting:  Been There, Done That: Meta-Learning with Episodic Recall</a> <br>Samuel Ritter, Jane X. Wang, Zeb Kurth-Nelson, Siddhant M. Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick </p><p><a href="https://dataspace.princeton.edu/handle/88435/dsp01dz010s84f">Meta-Reinforcement Learning with Episodic Recall: An Integrative Theory of Reward-Driven Learning</a> <br>Samuel Ritter 2019 </p><p><a href="https://arxiv.org/abs/2006.03662%20">Meta-RL exploration and planning: Rapid Task-Solving in Novel Environments <br></a>Sam Ritter, Ryan Faulkner, Laurent Sartran, Adam Santoro, Matt Botvinick, David Raposo </p><p><a href="https://arxiv.org/abs/2102.12425">Synthetic Returns for Long-Term Credit Assignment</a> <br>David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song <br> </p><p><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=_qpHcmhX9HM">Sam Ritter: Meta-Learning to Make Smart Inferences from Small Data</a> , North Star AI 2019 </li><li><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a>, Rich Sutton 2019 </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 21 Jun 2021 03:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/89ba97c3/58aa34a6.mp3" length="84579328" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/YmV6R9kjs1sjfhetkOCoSfLv7IMdZZximru2yS-T2WA/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzUxNDAwMS8x/NjMzMTkxMDA3LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>6035</itunes:duration>
      <itunes:summary>Sam Ritter of DeepMind on Neuroscience and RL, Episodic Memory, Meta-RL, Synthetic Returns, the MERLIN agent, decoding brain activation, and more!</itunes:summary>
      <itunes:subtitle>Sam Ritter of DeepMind on Neuroscience and RL, Episodic Memory, Meta-RL, Synthetic Returns, the MERLIN agent, decoding brain activation, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Thomas Krendl Gilbert</title>
      <itunes:episode>23</itunes:episode>
      <podcast:episode>23</podcast:episode>
      <itunes:title>Thomas Krendl Gilbert</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">929c3dbc-984a-489b-8937-109f7ff85dad</guid>
      <link>https://share.transistor.fm/s/218b3d07</link>
      <description>
        <![CDATA[<p><a href="https://www.thomaskrendlgilbert.com/">Thomas Krendl Gilbert</a> is a PhD student at UC Berkeley’s <a href="https://humancompatible.ai/">Center for Human-Compatible AI</a>, specializing in Machine Ethics and Epistemology. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1911.09005">Hard Choices in Artificial Intelligence: Addressing Normative Uncertainty through Sociotechnical Commitments <br></a>Roel Dobbe, Thomas Krendl Gilbert, Yonatan Mintz </p><p><a href="https://simons.berkeley.edu/news/mapping-political-economy-reinforcement-learning-systems-case-autonomous-vehicles%20">Mapping the Political Economy of Reinforcement Learning Systems: The Case of Autonomous Vehicles</a> <br>Thomas Krendl Gilbert </p><p><a href="https://arxiv.org/abs/2102.04255">AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks <br></a>McKane Andrus, Sarah Dean, Thomas Krendl Gilbert, Nathan Lambert and Tom Zick </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://geesegraduates.org/2020/10/26/political-economy-of-reinforcement-learning/">Political Economy of Reinforcement Learning Systems (PERLS)</a> </li><li><a href="https://lpeproject.org/">The Law and Political Economy (LPE) Project</a> </li><li><a href="https://www.jair.org/index.php/jair/article/view/12360">The Societal Implications of Deep Reinforcement Learning</a>, Jess Whittlestone, Kai Arulkumaran, Matthew Crosby </li><li><a href="https://shows.acast.com/the-robot-brains/episodes/yann-lecun-on-how-he-brought-ai-to-facebook">Robot Brains Podcast: Yann LeCun explains why Facebook would crumble without AI</a> </li></ul><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://www.thomaskrendlgilbert.com/">Thomas Krendl Gilbert</a> is a PhD student at UC Berkeley’s <a href="https://humancompatible.ai/">Center for Human-Compatible AI</a>, specializing in Machine Ethics and Epistemology. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1911.09005">Hard Choices in Artificial Intelligence: Addressing Normative Uncertainty through Sociotechnical Commitments <br></a>Roel Dobbe, Thomas Krendl Gilbert, Yonatan Mintz </p><p><a href="https://simons.berkeley.edu/news/mapping-political-economy-reinforcement-learning-systems-case-autonomous-vehicles%20">Mapping the Political Economy of Reinforcement Learning Systems: The Case of Autonomous Vehicles</a> <br>Thomas Krendl Gilbert </p><p><a href="https://arxiv.org/abs/2102.04255">AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks <br></a>McKane Andrus, Sarah Dean, Thomas Krendl Gilbert, Nathan Lambert and Tom Zick </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://geesegraduates.org/2020/10/26/political-economy-of-reinforcement-learning/">Political Economy of Reinforcement Learning Systems (PERLS)</a> </li><li><a href="https://lpeproject.org/">The Law and Political Economy (LPE) Project</a> </li><li><a href="https://www.jair.org/index.php/jair/article/view/12360">The Societal Implications of Deep Reinforcement Learning</a>, Jess Whittlestone, Kai Arulkumaran, Matthew Crosby </li><li><a href="https://shows.acast.com/the-robot-brains/episodes/yann-lecun-on-how-he-brought-ai-to-facebook">Robot Brains Podcast: Yann LeCun explains why Facebook would crumble without AI</a> </li></ul><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 17 May 2021 04:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/218b3d07/27c40412.mp3" length="60759356" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/kyHhsPgQmALusWW-eAzUGGC3v9sm7qsnp_NI2pMkozY/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzUxMzk5OC8x/NjMyNzk5MTE0LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4334</itunes:duration>
      <itunes:summary>Thomas Krendl Gilbert on the Political Economy of Reinforcement Learning Systems &amp; Autonomous Vehicles, Sociotechnical Commitments, AI Development for the Public Interest, and more!</itunes:summary>
      <itunes:subtitle>Thomas Krendl Gilbert on the Political Economy of Reinforcement Learning Systems &amp; Autonomous Vehicles, Sociotechnical Commitments, AI Development for the Public Interest, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Marc G. Bellemare</title>
      <itunes:episode>22</itunes:episode>
      <podcast:episode>22</podcast:episode>
      <itunes:title>Marc G. Bellemare</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">a373bda6-f952-4b79-8200-c1f8863cf411</guid>
      <link>https://share.transistor.fm/s/b3dfcd7d</link>
      <description>
        <![CDATA[<p><a href="http://www.marcgbellemare.info/">Professor Marc G. Bellemare</a> is a Research Scientist at Google Research (Brain team), An Adjunct Professor at McGill University, and a Canada CIFAR AI Chair. </p><p><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/10819">The Arcade Learning Environment: An Evaluation Platform for General Agents</a> <br>Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling <br><strong><br></strong><a href="https://www.nature.com/articles/nature14236">Human-level control through deep reinforcement learning</a> <br>Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg &amp; Demis Hassabis <br><strong><br></strong><a href="https://www.nature.com/articles/s41586-020-2939-8">Autonomous navigation of stratospheric balloons using reinforcement learning</a> <br>Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda &amp; Ziyu Wang </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=5nFtBFD2Od8">CAIDA Talk: A tour of distributional reinforcement learning November 18, 2020 - Marc G. Bellemare</a> </li><li><a href="https://www.youtube.com/watch?v=PBdGge2ipCg">Amii AI Seminar Series:  Autonomous nav of stratospheric balloons using RL</a>, Marlos C. Machado </li><li><a href="https://www.youtube.com/watch?v=F-6sc88xPuA">UMD RLSS | Marc Bellemare | A History of Reinforcement Learning: Atari to Stratospheric Balloons</a> </li><li><a href="https://www.talkrl.com/episodes/marlos-machado">TalkRL: Marlos C. Machado</a>, Dr. Machado also spoke to us about various aspects of ALE and Project Loon in depth </li><li><a href="https://arxiv.org/abs/1902.06865">Hyperbolic discounting and learning over multiple horizons</a>, Fedus et al 2019 </li><li><a href="https://twitter.com/marcgbellemare">Marc G. Bellemare</a> on Twitter </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://www.marcgbellemare.info/">Professor Marc G. Bellemare</a> is a Research Scientist at Google Research (Brain team), An Adjunct Professor at McGill University, and a Canada CIFAR AI Chair. </p><p><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/10819">The Arcade Learning Environment: An Evaluation Platform for General Agents</a> <br>Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling <br><strong><br></strong><a href="https://www.nature.com/articles/nature14236">Human-level control through deep reinforcement learning</a> <br>Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg &amp; Demis Hassabis <br><strong><br></strong><a href="https://www.nature.com/articles/s41586-020-2939-8">Autonomous navigation of stratospheric balloons using reinforcement learning</a> <br>Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda &amp; Ziyu Wang </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=5nFtBFD2Od8">CAIDA Talk: A tour of distributional reinforcement learning November 18, 2020 - Marc G. Bellemare</a> </li><li><a href="https://www.youtube.com/watch?v=PBdGge2ipCg">Amii AI Seminar Series:  Autonomous nav of stratospheric balloons using RL</a>, Marlos C. Machado </li><li><a href="https://www.youtube.com/watch?v=F-6sc88xPuA">UMD RLSS | Marc Bellemare | A History of Reinforcement Learning: Atari to Stratospheric Balloons</a> </li><li><a href="https://www.talkrl.com/episodes/marlos-machado">TalkRL: Marlos C. Machado</a>, Dr. Machado also spoke to us about various aspects of ALE and Project Loon in depth </li><li><a href="https://arxiv.org/abs/1902.06865">Hyperbolic discounting and learning over multiple horizons</a>, Fedus et al 2019 </li><li><a href="https://twitter.com/marcgbellemare">Marc G. Bellemare</a> on Twitter </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Wed, 12 May 2021 17:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/b3dfcd7d/c5dfe49e.mp3" length="48521177" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/lzuvuM4kTGKSmCB7eRBseITso-UGU_x26Qair30eVH4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzUxMzk5OS8x/NjMzNTM2ODUxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3460</itunes:duration>
      <itunes:summary>Marc G. Bellemare shares insight on his work including Deep Q-Networks, Distributional RL, Project Loon and RL in the Stratosphere, the origins of the Arcade Learning Environment, the future of Benchmarking in RL -- and more!</itunes:summary>
      <itunes:subtitle>Marc G. Bellemare shares insight on his work including Deep Q-Networks, Distributional RL, Project Loon and RL in the Stratosphere, the origins of the Arcade Learning Environment, the future of Benchmarking in RL -- and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Robert Osazuwa Ness</title>
      <itunes:episode>21</itunes:episode>
      <podcast:episode>21</podcast:episode>
      <itunes:title>Robert Osazuwa Ness</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">c4c3e0f8-e8a5-4860-96bd-a39cfb9dbed5</guid>
      <link>https://share.transistor.fm/s/20d8c879</link>
      <description>
        <![CDATA[<p><a href="https://twitter.com/osazuwa">Robert Osazuwa Ness</a> is an adjunct professor of computer science at Northeastern University, an ML Research Engineer at <a href="https://gamalon.com/">Gamalon</a>, and the founder of <a href="https://www.altdeep.ai/">AltDeep School of AI</a>.  He holds a PhD in statistics.  He studied at Johns Hopkins SAIS and then Purdue University. </p><p><br><strong>References </strong></p><ul><li><a href="https://www.altdeep.ai/">Altdeep School of AI</a>, Altdeep on <a href="https://www.twitch.tv/altdeepai">Twitch</a>, <a href="https://altdeep.substack.com/">Substack</a>, Robert Ness </li><li><a href="https://altdeep.ai/p/causal-ml-minicourse">Altdeep Causal Generative Machine Learning Minicourse</a>, Free course </li><li><a href="https://scholar.google.com/citations?user=8gWTOBAAAAAJ&amp;hl=en">Robert Osazuwa Ness on Google Scholar</a> </li><li><a href="https://gamalon.com/">Gamalon Inc</a> </li><li><a href="https://crl.causalai.net/">Causal Reinforcement Learning</a> talks, Elias Bareinboim </li><li><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a>, Rich Sutton 2019 </li><li><a href="https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5466">The Need for Biases in Learning Generalizations</a>, Tom Mitchell 1980 </li><li><a href="https://arxiv.org/abs/1706.04317">Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics</a>, Kansky et al 2017 </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://twitter.com/osazuwa">Robert Osazuwa Ness</a> is an adjunct professor of computer science at Northeastern University, an ML Research Engineer at <a href="https://gamalon.com/">Gamalon</a>, and the founder of <a href="https://www.altdeep.ai/">AltDeep School of AI</a>.  He holds a PhD in statistics.  He studied at Johns Hopkins SAIS and then Purdue University. </p><p><br><strong>References </strong></p><ul><li><a href="https://www.altdeep.ai/">Altdeep School of AI</a>, Altdeep on <a href="https://www.twitch.tv/altdeepai">Twitch</a>, <a href="https://altdeep.substack.com/">Substack</a>, Robert Ness </li><li><a href="https://altdeep.ai/p/causal-ml-minicourse">Altdeep Causal Generative Machine Learning Minicourse</a>, Free course </li><li><a href="https://scholar.google.com/citations?user=8gWTOBAAAAAJ&amp;hl=en">Robert Osazuwa Ness on Google Scholar</a> </li><li><a href="https://gamalon.com/">Gamalon Inc</a> </li><li><a href="https://crl.causalai.net/">Causal Reinforcement Learning</a> talks, Elias Bareinboim </li><li><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a>, Rich Sutton 2019 </li><li><a href="https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.5466">The Need for Biases in Learning Generalizations</a>, Tom Mitchell 1980 </li><li><a href="https://arxiv.org/abs/1706.04317">Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics</a>, Kansky et al 2017 </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sat, 08 May 2021 14:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/20d8c879/ed15d35b.mp3" length="66211304" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/Gz0oEV_8mpt48jKKS02Ou1LltCf0xmn0jmWWX9aAUUo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzQ4MzQwNi8x/NjMyODYxMjg1LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4723</itunes:duration>
      <itunes:summary>Dr. Robert Osazuwa Ness on Causal Inference, Probabilistic and Generative Models, Causality and RL, AltDeep School of AI, Pyro, and more!</itunes:summary>
      <itunes:subtitle>Dr. Robert Osazuwa Ness on Causal Inference, Probabilistic and Generative Models, Causality and RL, AltDeep School of AI, Pyro, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Marlos C. Machado</title>
      <itunes:episode>20</itunes:episode>
      <podcast:episode>20</podcast:episode>
      <itunes:title>Marlos C. Machado</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">8c928000-5072-4b30-82f1-ec69fdcfe452</guid>
      <link>https://share.transistor.fm/s/34025ece</link>
      <description>
        <![CDATA[<p>Dr. Marlos C. Machado is a research scientist at DeepMind and an adjunct professor at the University of Alberta. He holds a PhD from the University of Alberta, and an MSc and BSc from UFMG in Brazil. </p><p><br><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/11182">Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents <br></a>Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael Bowling </p><p><a href="https://openreview.net/pdf?id=qda7-sVg84">Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning</a> [ <a href="https://slideslive.com/38942373/contrastive-behavioral-similarity-embeddings-for-generalization-in-reinforcement-learning">video</a> ] <br>Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare </p><p><a href="https://era.library.ualberta.ca/items/581b87e0-a777-40a1-9776-f85a85864d6c/view/76e2f662-ec7f-4553-9392-3ba0dca44dc6/Marlos%20Cholodovskis%20Machado_Thesis.pdf">Efficient Exploration in Reinforcement Learning through Time-Based Representations</a> <br>Marlos C. Machado </p><p><a href="http://proceedings.mlr.press/v70/machado17a/machado17a.pdf">A Laplacian Framework for Option Discovery in Reinforcement Learning </a>[ <a href="https://vimeo.com/237274347">video</a> ] <br>Marlos C. Machado, Marc G. Bellemare, Michael H. Bowling </p><p><a href="https://openreview.net/forum?id=Bk8ZcAxR-">Eigenoption Discovery through the Deep Successor Representation</a> <br>Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell </p><p><a href="https://openreview.net/forum?id=SkeIyaVtwB">Exploration in Reinforcement Learning with Deep Covering Options</a> <br>Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Dimitri Konidaris </p><p><a href="https://www.nature.com/articles/s41586-020-2939-8">Autonomous navigation of stratospheric balloons using reinforcement learning</a> <br>Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda &amp; Ziyu Wang </p><p><a href="https://openreview.net/forum?id=HkGmDsR9YQ">Generalization and Regularization in DQN</a> <br>Jesse Farebrother, Marlos C. Machado, Michael Bowling </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=PBdGge2ipCg">Amii AI Seminar Series: Marlos C. Machado - Autonomous navigation of stratospheric balloons using RL</a> </li><li><a href="http://www.ifaamas.org/Proceedings/aamas2016/pdfs/p485.pdf">State of the Art Control of Atari Games Using Shallow Reinforcement Learning</a>, Liang et al </li><li><a href="https://arxiv.org/abs/1606.05593">Introspective Agents: Confidence Measures for General Value Functions</a>, Sherstan et al </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Dr. Marlos C. Machado is a research scientist at DeepMind and an adjunct professor at the University of Alberta. He holds a PhD from the University of Alberta, and an MSc and BSc from UFMG in Brazil. </p><p><br><strong>Featured References </strong></p><p><a href="https://jair.org/index.php/jair/article/view/11182">Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents <br></a>Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael Bowling </p><p><a href="https://openreview.net/pdf?id=qda7-sVg84">Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning</a> [ <a href="https://slideslive.com/38942373/contrastive-behavioral-similarity-embeddings-for-generalization-in-reinforcement-learning">video</a> ] <br>Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare </p><p><a href="https://era.library.ualberta.ca/items/581b87e0-a777-40a1-9776-f85a85864d6c/view/76e2f662-ec7f-4553-9392-3ba0dca44dc6/Marlos%20Cholodovskis%20Machado_Thesis.pdf">Efficient Exploration in Reinforcement Learning through Time-Based Representations</a> <br>Marlos C. Machado </p><p><a href="http://proceedings.mlr.press/v70/machado17a/machado17a.pdf">A Laplacian Framework for Option Discovery in Reinforcement Learning </a>[ <a href="https://vimeo.com/237274347">video</a> ] <br>Marlos C. Machado, Marc G. Bellemare, Michael H. Bowling </p><p><a href="https://openreview.net/forum?id=Bk8ZcAxR-">Eigenoption Discovery through the Deep Successor Representation</a> <br>Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell </p><p><a href="https://openreview.net/forum?id=SkeIyaVtwB">Exploration in Reinforcement Learning with Deep Covering Options</a> <br>Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Dimitri Konidaris </p><p><a href="https://www.nature.com/articles/s41586-020-2939-8">Autonomous navigation of stratospheric balloons using reinforcement learning</a> <br>Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda &amp; Ziyu Wang </p><p><a href="https://openreview.net/forum?id=HkGmDsR9YQ">Generalization and Regularization in DQN</a> <br>Jesse Farebrother, Marlos C. Machado, Michael Bowling </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=PBdGge2ipCg">Amii AI Seminar Series: Marlos C. Machado - Autonomous navigation of stratospheric balloons using RL</a> </li><li><a href="http://www.ifaamas.org/Proceedings/aamas2016/pdfs/p485.pdf">State of the Art Control of Atari Games Using Shallow Reinforcement Learning</a>, Liang et al </li><li><a href="https://arxiv.org/abs/1606.05593">Introspective Agents: Confidence Measures for General Value Functions</a>, Sherstan et al </li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 12 Apr 2021 07:50:51 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/34025ece/f14b3e7d.mp3" length="76961208" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/tDkZuBvc6jZn-6HuS4kC9x834BPb-4w-2Ha1xMp45rk/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzUxMzg5NC8x/NjMyNzc5NjU1LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>5491</itunes:duration>
      <itunes:summary>Marlos C. Machado on Arcade Learning Environment Evaluation, Generalization and Exploration in RL, Eigenoptions, Autonomous navigation of stratospheric balloons with RL, and more!</itunes:summary>
      <itunes:subtitle>Marlos C. Machado on Arcade Learning Environment Evaluation, Generalization and Exploration in RL, Eigenoptions, Autonomous navigation of stratospheric balloons with RL, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Nathan Lambert</title>
      <itunes:episode>19</itunes:episode>
      <podcast:episode>19</podcast:episode>
      <itunes:title>Nathan Lambert</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">fb59de70-033f-4518-b7a8-bac42890f06a</guid>
      <link>https://share.transistor.fm/s/ed96159d</link>
      <description>
        <![CDATA[<p><a href="https://www.natolambert.com/">Nathan Lambert</a> is a PhD Candidate at UC Berkeley. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2012.09156">Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning <br></a>Nathan O. Lambert, Albert Wilcox, Howard Zhang, Kristofer S. J. Pister, Roberto Calandra </p><p><a href="https://arxiv.org/abs/2002.04523">Objective Mismatch in Model-based Reinforcement Learning <br></a>Nathan Lambert, Brandon Amos, Omry Yadan, Roberto Calandra </p><p><a href="https://arxiv.org/abs/1901.03737">Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning <br></a>Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S.J. Pister </p><p><a href="https://arxiv.org/abs/2102.13651">On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning <br></a>Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://robotic.substack.com/">Nathan Lambert's blog</a> </li><li><a href="https://scholar.google.com/citations?user=O4jW7BsAAAAJ&amp;hl">Nathan Lambert</a> on Google scholar </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://www.natolambert.com/">Nathan Lambert</a> is a PhD Candidate at UC Berkeley. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2012.09156">Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning <br></a>Nathan O. Lambert, Albert Wilcox, Howard Zhang, Kristofer S. J. Pister, Roberto Calandra </p><p><a href="https://arxiv.org/abs/2002.04523">Objective Mismatch in Model-based Reinforcement Learning <br></a>Nathan Lambert, Brandon Amos, Omry Yadan, Roberto Calandra </p><p><a href="https://arxiv.org/abs/1901.03737">Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning <br></a>Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S.J. Pister </p><p><a href="https://arxiv.org/abs/2102.13651">On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning <br></a>Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://robotic.substack.com/">Nathan Lambert's blog</a> </li><li><a href="https://scholar.google.com/citations?user=O4jW7BsAAAAJ&amp;hl">Nathan Lambert</a> on Google scholar </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 22 Mar 2021 15:21:27 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/ed96159d/1693fc7e.mp3" length="42577778" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/xOj0j1AhOUDMOpiE3KZxmFHB0H7yhF-U1t4ozLeZmZc/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzQ5MDMyMi8x/NjMyODY5NzM2LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3035</itunes:duration>
      <itunes:summary>Nathan Lambert on Model-based RL, Trajectory-based models, Quadrotor control, Hyperparameter Optimization for MBRL, RL vs PID control, and more!</itunes:summary>
      <itunes:subtitle>Nathan Lambert on Model-based RL, Trajectory-based models, Quadrotor control, Hyperparameter Optimization for MBRL, RL vs PID control, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Kai Arulkumaran</title>
      <itunes:episode>18</itunes:episode>
      <podcast:episode>18</podcast:episode>
      <itunes:title>Kai Arulkumaran</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">c2266c9e-301f-4f8d-9c1c-4753af257827</guid>
      <link>https://share.transistor.fm/s/422c77b9</link>
      <description>
        <![CDATA[<p>Kai Arulkumaran is a researcher at Araya in Tokyo. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1902.01724">AlphaStar: An Evolutionary Computation Perspective</a> <br>Kai Arulkumaran, Antoine Cully, Julian Togelius </p><p><a href="https://arxiv.org/abs/1912.08324">Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation</a> <br>Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath </p><p><a href="https://arxiv.org/abs/1912.02877">Training Agents using Upside-Down Reinforcement Learning</a> <br>Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber </p><p><strong><br>Additional References </strong></p><ul><li><a href="https://www.araya.org/en/">Araya</a> </li><li><a href="https://nnaisense.com/">NNAISENSE</a> </li><li><a href="https://scholar.google.com/citations?user=QKCypSoAAAAJ">Kai Arulkumaran</a> on Google Scholar </li><li><a href="https://github.com/Kaixhin/rlenvs">https://github.com/Kaixhin/rlenvs</a> </li><li><a href="https://github.com/Kaixhin/Atari">https://github.com/Kaixhin/Atari</a> </li><li><a href="https://github.com/Kaixhin/Rainbow">https://github.com/Kaixhin/Rainbow</a> </li><li>Tschiatschek, S., Arulkumaran, K., Stühmer, J. &amp; Hofmann, K. (2018). <a href="https://arxiv.org/abs/1805.09281">Variational Inference for Data-Efficient Model Learning in POMDPs</a>. arXiv:1805.09281. </li><li>Arulkumaran, K., Dilokthanakul, N., Shanahan, M. &amp; Bharath, A. A. (2016). <a href="https://arxiv.org/abs/1604.08153">Classifying Options for Deep Reinforcement Learning</a>. International Joint Conference on Artificial Intelligence, Deep Reinforcement Learning Workshop. </li><li>Garnelo, M., Arulkumaran, K. &amp; Shanahan, M. (2016). <a href="https://arxiv.org/abs/1609.05518">Towards Deep Symbolic Reinforcement Learning</a>. Annual Conference on Neural Information Processing Systems, Deep Reinforcement Learning Workshop. </li><li>Arulkumaran, K., Deisenroth, M. P., Brundage, M. &amp; Bharath, A. A. (2017). <a href="https://ieeexplore.ieee.org/abstract/document/8103164">Deep reinforcement learning: A brief survey</a>. IEEE Signal Processing Magazine. </li><li>Agostinelli, A., Arulkumaran, K., Sarrico, M., Richemond, P. &amp; Bharath, A. A. (2019). <a href="https://arxiv.org/abs/1911.09560">Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means</a>. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning. </li><li>Sarrico, M., Arulkumaran, K., Agostinelli, A., Richemond, P. &amp; Bharath, A. A. (2019). <a href="https://arxiv.org/abs/1911.09615">Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control</a>. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning. </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Kai Arulkumaran is a researcher at Araya in Tokyo. </p><p><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1902.01724">AlphaStar: An Evolutionary Computation Perspective</a> <br>Kai Arulkumaran, Antoine Cully, Julian Togelius </p><p><a href="https://arxiv.org/abs/1912.08324">Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation</a> <br>Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath </p><p><a href="https://arxiv.org/abs/1912.02877">Training Agents using Upside-Down Reinforcement Learning</a> <br>Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber </p><p><strong><br>Additional References </strong></p><ul><li><a href="https://www.araya.org/en/">Araya</a> </li><li><a href="https://nnaisense.com/">NNAISENSE</a> </li><li><a href="https://scholar.google.com/citations?user=QKCypSoAAAAJ">Kai Arulkumaran</a> on Google Scholar </li><li><a href="https://github.com/Kaixhin/rlenvs">https://github.com/Kaixhin/rlenvs</a> </li><li><a href="https://github.com/Kaixhin/Atari">https://github.com/Kaixhin/Atari</a> </li><li><a href="https://github.com/Kaixhin/Rainbow">https://github.com/Kaixhin/Rainbow</a> </li><li>Tschiatschek, S., Arulkumaran, K., Stühmer, J. &amp; Hofmann, K. (2018). <a href="https://arxiv.org/abs/1805.09281">Variational Inference for Data-Efficient Model Learning in POMDPs</a>. arXiv:1805.09281. </li><li>Arulkumaran, K., Dilokthanakul, N., Shanahan, M. &amp; Bharath, A. A. (2016). <a href="https://arxiv.org/abs/1604.08153">Classifying Options for Deep Reinforcement Learning</a>. International Joint Conference on Artificial Intelligence, Deep Reinforcement Learning Workshop. </li><li>Garnelo, M., Arulkumaran, K. &amp; Shanahan, M. (2016). <a href="https://arxiv.org/abs/1609.05518">Towards Deep Symbolic Reinforcement Learning</a>. Annual Conference on Neural Information Processing Systems, Deep Reinforcement Learning Workshop. </li><li>Arulkumaran, K., Deisenroth, M. P., Brundage, M. &amp; Bharath, A. A. (2017). <a href="https://ieeexplore.ieee.org/abstract/document/8103164">Deep reinforcement learning: A brief survey</a>. IEEE Signal Processing Magazine. </li><li>Agostinelli, A., Arulkumaran, K., Sarrico, M., Richemond, P. &amp; Bharath, A. A. (2019). <a href="https://arxiv.org/abs/1911.09560">Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means</a>. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning. </li><li>Sarrico, M., Arulkumaran, K., Agostinelli, A., Richemond, P. &amp; Bharath, A. A. (2019). <a href="https://arxiv.org/abs/1911.09615">Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control</a>. Annual Conference on Neural Information Processing Systems, Workshop on Biological and Artificial Reinforcement Learning. </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 15 Mar 2021 21:44:05 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/422c77b9/664e66c7.mp3" length="53760183" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/um1fxRPL6-IuhgIBxKOmCFiqQF35MVU-bNoLwzgiH2w/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzQ5MDMzMC8x/NjMyNzk4NjEyLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2786</itunes:duration>
      <itunes:summary>Kai Arulkumaran on AlphaStar and Evolutionary Computation, Domain Randomisation, Upside-Down Reinforcement Learning, Araya, NNAISENSE, and more!</itunes:summary>
      <itunes:subtitle>Kai Arulkumaran on AlphaStar and Evolutionary Computation, Domain Randomisation, Upside-Down Reinforcement Learning, Araya, NNAISENSE, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Michael Dennis</title>
      <itunes:episode>17</itunes:episode>
      <podcast:episode>17</podcast:episode>
      <itunes:title>Michael Dennis</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">5d4ca8bd-b9c6-4e2f-962e-7dd3bd940a6d</guid>
      <link>https://share.transistor.fm/s/ec36bb69</link>
      <description>
        <![CDATA[<p><a href="https://twitter.com/michaeld1729">Michael Dennis</a> is a PhD student at the <a href="https://humancompatible.ai/">Center for Human-Compatible AI</a> at UC Berkeley, supervised by <a href="http://people.eecs.berkeley.edu/~russell/">Professor Stuart Russell</a>. </p>I'm interested in robustness in RL and multi-agent RL, specifically as it applies to making the interaction between AI systems and society at large to be more beneficial.   <p><em>--Michael Dennis </em></p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2012.02096"><strong>Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design</strong></a><strong> </strong>[PAIRED] <strong><br></strong>Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine <br><a href="https://www.youtube.com/channel/UCI6dkF8eNrCz6XiBJlV9fmw/videos">Videos</a> <br> <strong><br></strong><a href="https://arxiv.org/abs/1905.10615"><strong>Adversarial Policies: Attacking Deep Reinforcement Learning</strong></a><strong> </strong></p><p>Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell <br><a href="https://adversarialpolicies.github.io/">Homepage and Videos</a> </p><p><a href="https://arxiv.org/abs/2101.10305"><strong>Accumulating Risk Capital Through Investing in Cooperation</strong></a><strong> <br></strong>Charlotte Roman, Michael Dennis, Andrew Critch, Stuart Russell </p><p><br><a href="https://arxiv.org/abs/2006.13900"><strong>Quantifying Differences in Reward Functions</strong></a><strong> </strong>[EPIC] <strong><br></strong>Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://dl.acm.org/doi/10.1145/2716322">Safe Opponent Exploitation</a>, Sam Ganzfried And Tuomas Sandholm 2015 </li><li><a href="https://arxiv.org/abs/1810.08647">Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</a>, Natasha Jaques et al 2019 </li><li><a href="https://arxiv.org/abs/1903.00742">Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research</a>, Leibo et al 2019 </li><li><a href="https://arxiv.org/abs/1912.01588">Leveraging Procedural Generation to Benchmark Reinforcement Learning</a>, Karl Cobbe et al 2019 </li><li><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a>, Wang et al 2019 </li><li><a href="https://proceedings.neurips.cc/paper/2020/hash/b607ba543ad05417b8507ee86c54fcb7-Abstract.html">Consequences of Misaligned AI</a>, Zhuang et al 2020 </li><li><a href="https://arxiv.org/abs/1902.09725">Conservative Agency via Attainable Utility Preservation</a>, Turner et al 2019 </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://twitter.com/michaeld1729">Michael Dennis</a> is a PhD student at the <a href="https://humancompatible.ai/">Center for Human-Compatible AI</a> at UC Berkeley, supervised by <a href="http://people.eecs.berkeley.edu/~russell/">Professor Stuart Russell</a>. </p>I'm interested in robustness in RL and multi-agent RL, specifically as it applies to making the interaction between AI systems and society at large to be more beneficial.   <p><em>--Michael Dennis </em></p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/2012.02096"><strong>Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design</strong></a><strong> </strong>[PAIRED] <strong><br></strong>Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine <br><a href="https://www.youtube.com/channel/UCI6dkF8eNrCz6XiBJlV9fmw/videos">Videos</a> <br> <strong><br></strong><a href="https://arxiv.org/abs/1905.10615"><strong>Adversarial Policies: Attacking Deep Reinforcement Learning</strong></a><strong> </strong></p><p>Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell <br><a href="https://adversarialpolicies.github.io/">Homepage and Videos</a> </p><p><a href="https://arxiv.org/abs/2101.10305"><strong>Accumulating Risk Capital Through Investing in Cooperation</strong></a><strong> <br></strong>Charlotte Roman, Michael Dennis, Andrew Critch, Stuart Russell </p><p><br><a href="https://arxiv.org/abs/2006.13900"><strong>Quantifying Differences in Reward Functions</strong></a><strong> </strong>[EPIC] <strong><br></strong>Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://dl.acm.org/doi/10.1145/2716322">Safe Opponent Exploitation</a>, Sam Ganzfried And Tuomas Sandholm 2015 </li><li><a href="https://arxiv.org/abs/1810.08647">Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</a>, Natasha Jaques et al 2019 </li><li><a href="https://arxiv.org/abs/1903.00742">Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research</a>, Leibo et al 2019 </li><li><a href="https://arxiv.org/abs/1912.01588">Leveraging Procedural Generation to Benchmark Reinforcement Learning</a>, Karl Cobbe et al 2019 </li><li><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a>, Wang et al 2019 </li><li><a href="https://proceedings.neurips.cc/paper/2020/hash/b607ba543ad05417b8507ee86c54fcb7-Abstract.html">Consequences of Misaligned AI</a>, Zhuang et al 2020 </li><li><a href="https://arxiv.org/abs/1902.09725">Conservative Agency via Attainable Utility Preservation</a>, Turner et al 2019 </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 25 Jan 2021 21:27:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/ec36bb69/362214a8.mp3" length="51193523" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/BWUlB6CSClt2WtSNOOR9SMO36sJh8WV-U_nCduKyXiQ/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzQ0MTk4My8x/NjMzOTYwMjYxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3650</itunes:duration>
      <itunes:summary>Michael Dennis on Human-Compatible AI, Game Theory, PAIRED, ARCTIC, EPIC, and lots more!</itunes:summary>
      <itunes:subtitle>Michael Dennis on Human-Compatible AI, Game Theory, PAIRED, ARCTIC, EPIC, and lots more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Roman Ring</title>
      <itunes:episode>16</itunes:episode>
      <podcast:episode>16</podcast:episode>
      <itunes:title>Roman Ring</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">04ff14c7-f393-4cb5-8b4b-b6ec41b21d5b</guid>
      <link>https://share.transistor.fm/s/4e4a1b63</link>
      <description>
        <![CDATA[<p><a href="http://inoryy.com/">Roman Ring</a> is a Research Engineer at DeepMind. </p><p><strong>Featured References </strong></p><p><a href="https://www.nature.com/articles/s41586-019-1724-z.epdf?author_access_token=lZH3nqPYtWJXfDA10W0CNNRgN0jAjWel9jnR3ZoTv0PSZcPzJFGNAZhOlk4deBCKzKm70KfinloafEF1bCCXL6IIHHgKaDkaTkBcTEv7aT-wqDoG1VeO9-wO3GEoAMF9bAOt7mJ0RWQnRVMbyfgH9A%3D%3D">Grandmaster level in StarCraft II using multi-agent reinforcement learning</a> <br>Vinyals et al, 2019 </p><p><a href="http://inoryy.com/files/ring_roman_bsc.pdf">Replicating DeepMind StarCraft II Reinforcement Learning Benchmark with Actor-Critic Methods</a> <br>Roman Ring, 2018 </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1806.01830">Relational Deep Reinforcement Learning</a>,  Zambaldi et al 2018 </li><li><a href="https://arxiv.org/abs/1708.04782">StarCraft II: A New Challenge for Reinforcement Learning</a>, Vinyals et al 2017 </li><li><a href="https://arxiv.org/abs/1602.0495">Safe and Efficient Off-Policy Reinforcement Learning</a> [Retrace(<em>λ</em>)], Munos et al 2016 </li><li><a href="https://arxiv.org/abs/1611.01224">Sample Efficient Actor-Critic with Experience Replay</a> [ACER], Wang et al 2016 </li><li><a href="https://arxiv.org/abs/1802.01561">IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures</a> [IMPALA/V-trace], Espeholt et al 2018 </li></ul><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://inoryy.com/">Roman Ring</a> is a Research Engineer at DeepMind. </p><p><strong>Featured References </strong></p><p><a href="https://www.nature.com/articles/s41586-019-1724-z.epdf?author_access_token=lZH3nqPYtWJXfDA10W0CNNRgN0jAjWel9jnR3ZoTv0PSZcPzJFGNAZhOlk4deBCKzKm70KfinloafEF1bCCXL6IIHHgKaDkaTkBcTEv7aT-wqDoG1VeO9-wO3GEoAMF9bAOt7mJ0RWQnRVMbyfgH9A%3D%3D">Grandmaster level in StarCraft II using multi-agent reinforcement learning</a> <br>Vinyals et al, 2019 </p><p><a href="http://inoryy.com/files/ring_roman_bsc.pdf">Replicating DeepMind StarCraft II Reinforcement Learning Benchmark with Actor-Critic Methods</a> <br>Roman Ring, 2018 </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1806.01830">Relational Deep Reinforcement Learning</a>,  Zambaldi et al 2018 </li><li><a href="https://arxiv.org/abs/1708.04782">StarCraft II: A New Challenge for Reinforcement Learning</a>, Vinyals et al 2017 </li><li><a href="https://arxiv.org/abs/1602.0495">Safe and Efficient Off-Policy Reinforcement Learning</a> [Retrace(<em>λ</em>)], Munos et al 2016 </li><li><a href="https://arxiv.org/abs/1611.01224">Sample Efficient Actor-Critic with Experience Replay</a> [ACER], Wang et al 2016 </li><li><a href="https://arxiv.org/abs/1802.01561">IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures</a> [IMPALA/V-trace], Espeholt et al 2018 </li></ul><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 11 Jan 2021 04:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/4e4a1b63/d12526e1.mp3" length="35684046" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CGmZ6osV17Jk9znGV6SeX76BfLMVXGxKOaBhPJ0zygo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzQzMTU5MC8x/NjMyNzc4MTUxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2543</itunes:duration>
      <itunes:summary>Roman Ring discusses the Research Engineer role at DeepMind, StarCraft II, AlphaStar, his bachelor's thesis, JAX, Julia, IMPALA and more!</itunes:summary>
      <itunes:subtitle>Roman Ring discusses the Research Engineer role at DeepMind, StarCraft II, AlphaStar, his bachelor's thesis, JAX, Julia, IMPALA and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Shimon Whiteson</title>
      <itunes:episode>15</itunes:episode>
      <podcast:episode>15</podcast:episode>
      <itunes:title>Shimon Whiteson</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">8a383fde-da05-47e8-9410-37a4408617d9</guid>
      <link>https://share.transistor.fm/s/7fdfd811</link>
      <description>
        <![CDATA[<p><a href="https://www.cs.ox.ac.uk/people/shimon.whiteson/">Shimon Whiteson</a> is a Professor of Computer Science at Oxford University, the head of WhiRL, the Whiteson Research Lab at Oxford, and Head of Research at Waymo UK. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1910.08348">VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning <br></a>Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson <strong></strong></p><p><a href="https://arxiv.org/abs/2003.08839">Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning <br></a>Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=W_9kcQmaWjo">Shimon Whiteson - Multi-agent RL</a>, MIT Embodied Intelligence Seminar </li><li><a href="https://arxiv.org/abs/1902.04043">The StarCraft Multi-Agent Challenge</a>, Samvelyan et al 2019 </li><li><a href="https://twkillian.github.io/papers/YaoKillianKonidarisFDV2018_ICML.pdf">Direct Policy Transfer with Hidden Parameter Markov Decision Processes</a>, Yao et al  2018 </li><li><a href="https://arxiv.org/abs/1706.05296">Value-Decomposition Networks For Cooperative Multi-Agent Learning</a>, Sunehag et al 2017 </li><li><a href="https://whirl.cs.ox.ac.uk/">Whiteson Research Lab</a> </li><li><a href="https://www.ox.ac.uk/news/2019-12-13-waymo-acquires-latent-logic-accelerate-progress-towards-safe-driverless-vehicles">Waymo acquires Latent Logic to accelerate progress towards safe, driverless vehicles</a>, Oxford News </li><li><a href="https://waymo.com/">Waymo</a> </li></ul><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://www.cs.ox.ac.uk/people/shimon.whiteson/">Shimon Whiteson</a> is a Professor of Computer Science at Oxford University, the head of WhiRL, the Whiteson Research Lab at Oxford, and Head of Research at Waymo UK. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1910.08348">VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning <br></a>Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson <strong></strong></p><p><a href="https://arxiv.org/abs/2003.08839">Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning <br></a>Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://www.youtube.com/watch?v=W_9kcQmaWjo">Shimon Whiteson - Multi-agent RL</a>, MIT Embodied Intelligence Seminar </li><li><a href="https://arxiv.org/abs/1902.04043">The StarCraft Multi-Agent Challenge</a>, Samvelyan et al 2019 </li><li><a href="https://twkillian.github.io/papers/YaoKillianKonidarisFDV2018_ICML.pdf">Direct Policy Transfer with Hidden Parameter Markov Decision Processes</a>, Yao et al  2018 </li><li><a href="https://arxiv.org/abs/1706.05296">Value-Decomposition Networks For Cooperative Multi-Agent Learning</a>, Sunehag et al 2017 </li><li><a href="https://whirl.cs.ox.ac.uk/">Whiteson Research Lab</a> </li><li><a href="https://www.ox.ac.uk/news/2019-12-13-waymo-acquires-latent-logic-accelerate-progress-towards-safe-driverless-vehicles">Waymo acquires Latent Logic to accelerate progress towards safe, driverless vehicles</a>, Oxford News </li><li><a href="https://waymo.com/">Waymo</a> </li></ul><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 06 Dec 2020 13:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/7fdfd811/bd58f073.mp3" length="45093611" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>3215</itunes:duration>
      <itunes:summary>Shimon Whiteson on his WhiRL lab, his work at Waymo UK, variBAD, QMIX, co-operative multi-agent RL, StarCraft Multi-Agent Challenge, advice to grad students, and much more!</itunes:summary>
      <itunes:subtitle>Shimon Whiteson on his WhiRL lab, his work at Waymo UK, variBAD, QMIX, co-operative multi-agent RL, StarCraft Multi-Agent Challenge, advice to grad students, and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Aravind Srinivas</title>
      <itunes:episode>14</itunes:episode>
      <podcast:episode>14</podcast:episode>
      <itunes:title>Aravind Srinivas</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">c592a3a9-4343-47c4-ad90-68ed088d3adf</guid>
      <link>https://share.transistor.fm/s/94195f7d</link>
      <description>
        <![CDATA[<p><a href="https://people.eecs.berkeley.edu/~aravind/">Aravind Srinivas</a> is a 3rd year PhD student at UC Berkeley advised by Prof. Abbeel. <br>He co-created and co-taught a <a href="https://sites.google.com/view/berkeley-cs294-158-sp19/home">grad course on Deep Unsupervised Learning</a> at Berkeley. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1905.09272">Data-Efficient Image Recognition with Contrastive Predictive Coding <br></a>Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord </p><p><a href="https://arxiv.org/abs/2004.04136">Contrastive Unsupervised Representations for Reinforcement Learning</a> <br>Aravind Srinivas, Michael Laskin, Pieter Abbeel </p><p><a href="https://arxiv.org/abs/2004.14990">Reinforcement Learning with Augmented Data <br></a>Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas </p><p><a href="https://arxiv.org/abs/2007.04938">SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning <br></a>Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://sites.google.com/view/berkeley-cs294-158-sp20/home">CS294-158-SP20 Deep Unsupervised Learning</a>, Berkeley </li><li><a href="https://arxiv.org/abs/2009.04416">Phasic Policy Gradient</a>, Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman </li><li><a href="https://arxiv.org/abs/2006.07733">Bootstrap your own latent: A new approach to self-supervised Learning</a> , Grill et al 2020 </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://people.eecs.berkeley.edu/~aravind/">Aravind Srinivas</a> is a 3rd year PhD student at UC Berkeley advised by Prof. Abbeel. <br>He co-created and co-taught a <a href="https://sites.google.com/view/berkeley-cs294-158-sp19/home">grad course on Deep Unsupervised Learning</a> at Berkeley. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1905.09272">Data-Efficient Image Recognition with Contrastive Predictive Coding <br></a>Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord </p><p><a href="https://arxiv.org/abs/2004.04136">Contrastive Unsupervised Representations for Reinforcement Learning</a> <br>Aravind Srinivas, Michael Laskin, Pieter Abbeel </p><p><a href="https://arxiv.org/abs/2004.14990">Reinforcement Learning with Augmented Data <br></a>Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas </p><p><a href="https://arxiv.org/abs/2007.04938">SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning <br></a>Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://sites.google.com/view/berkeley-cs294-158-sp20/home">CS294-158-SP20 Deep Unsupervised Learning</a>, Berkeley </li><li><a href="https://arxiv.org/abs/2009.04416">Phasic Policy Gradient</a>, Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman </li><li><a href="https://arxiv.org/abs/2006.07733">Bootstrap your own latent: A new approach to self-supervised Learning</a> , Grill et al 2020 </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 20 Sep 2020 20:46:33 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/94195f7d/54e5f696.mp3" length="71864156" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/pmo3645UPbg_4HxF5fFE0fRV1oLGVwxxdjOOzXpMkCg/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzM1NDY3MC8x/NjMyNzk5MjI0LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>5127</itunes:duration>
      <itunes:summary>Aravind Srinivas on his work including CPC v2, RAD, CURL, and SUNRISE, unsupervised learning, teaching a Berkeley course, and more!</itunes:summary>
      <itunes:subtitle>Aravind Srinivas on his work including CPC v2, RAD, CURL, and SUNRISE, unsupervised learning, teaching a Berkeley course, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Taylor Killian</title>
      <itunes:episode>13</itunes:episode>
      <podcast:episode>13</podcast:episode>
      <itunes:title>Taylor Killian</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d7937f50-38f1-40ed-9ec1-80acf846e633</guid>
      <link>https://share.transistor.fm/s/b97a1d3f</link>
      <description>
        <![CDATA[<p><a href="https://twkillian.github.io/">Taylor Killian</a> is a Ph.D. student at the University of Toronto and the Vector Institute, and an Intern at Google Brain. </p><p><strong>Featured References <br></strong><br></p><p><a href="https://twkillian.github.io/papers/YaoKillianKonidarisFDV2018_ICML.pdf"><strong>Direct Policy Transfer with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Yao, Killian, Konidaris, Doshi-Velez </p><p><a href="http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes.pdf"><strong>Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Killian, Daulton, Konidaris, Doshi-Velez </p><p><a href="https://arxiv.org/abs/1612.00475"><strong>Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Killian, Konidaris, Doshi-Velez </p><p><a href="https://arxiv.org/pdf/2006.11654.pdf"><strong>Counterfactually Guided Policy Transfer in Clinical Settings</strong></a><strong> <br></strong>Killian, Ghassemi, Joshi </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1308.3513">Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations</a>, Doshi-Velez, Konidaris </li><li><a href="http://www.nature.com/articles/sdata201635">Mimic III, a freely accessible critical care database</a>. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG </li><li><a href="https://www.nature.com/articles/s41591-018-0213-5">The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care</a>, Komorowski et al <a href="https://mimic.physionet.org/"><br></a><br></li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://twkillian.github.io/">Taylor Killian</a> is a Ph.D. student at the University of Toronto and the Vector Institute, and an Intern at Google Brain. </p><p><strong>Featured References <br></strong><br></p><p><a href="https://twkillian.github.io/papers/YaoKillianKonidarisFDV2018_ICML.pdf"><strong>Direct Policy Transfer with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Yao, Killian, Konidaris, Doshi-Velez </p><p><a href="http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes.pdf"><strong>Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Killian, Daulton, Konidaris, Doshi-Velez </p><p><a href="https://arxiv.org/abs/1612.00475"><strong>Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes</strong></a><strong> <br></strong>Killian, Konidaris, Doshi-Velez </p><p><a href="https://arxiv.org/pdf/2006.11654.pdf"><strong>Counterfactually Guided Policy Transfer in Clinical Settings</strong></a><strong> <br></strong>Killian, Ghassemi, Joshi </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1308.3513">Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations</a>, Doshi-Velez, Konidaris </li><li><a href="http://www.nature.com/articles/sdata201635">Mimic III, a freely accessible critical care database</a>. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG </li><li><a href="https://www.nature.com/articles/s41591-018-0213-5">The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care</a>, Komorowski et al <a href="https://mimic.physionet.org/"><br></a><br></li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 17 Aug 2020 08:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/b97a1d3f/8c595b5d.mp3" length="75611309" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/9S2QkBtuXq4pkoeFjy0cgq6cSiwI-GsRzzhQza0F95k/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzMxODg2My8x/NjMyNzc4NjQ1LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>5395</itunes:duration>
      <itunes:summary>Taylor Killian on the latest in RL for Health, including Hidden Parameter MDPs, Mimic III and Sepsis, Counterfactually Guided Policy Transfer and lots more!</itunes:summary>
      <itunes:subtitle>Taylor Killian on the latest in RL for Health, including Hidden Parameter MDPs, Mimic III and Sepsis, Counterfactually Guided Policy Transfer and lots more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/b97a1d3f/transcription.vtt" type="text/vtt" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b97a1d3f/transcription.srt" type="application/x-subrip" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b97a1d3f/transcription.json" type="application/json" rel="captions"/>
      <podcast:transcript url="https://share.transistor.fm/s/b97a1d3f/transcription.txt" type="text/plain"/>
      <podcast:transcript url="https://share.transistor.fm/s/b97a1d3f/transcription" type="text/html"/>
    </item>
    <item>
      <title>Nan Jiang</title>
      <itunes:episode>12</itunes:episode>
      <podcast:episode>12</podcast:episode>
      <itunes:title>Nan Jiang</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">ea9496c9-f6b4-46c2-a811-a5518105e0e0</guid>
      <link>https://share.transistor.fm/s/fbdbe373</link>
      <description>
        <![CDATA[<p><a href="https://nanjiang.cs.illinois.edu/">Nan Jiang</a> is an Assistant Professor of Computer Science at University of Illinois.  He was a Postdoc Microsoft Research, and did his PhD at University of Michigan under Professor Satinder Singh. </p><p><br><strong>Featured References </strong></p><ul><li><a href="https://rltheorybook.github.io/"><strong>Reinforcement Learning: Theory and Algorithms</strong></a><strong> <br></strong>Alekh Agarwal Nan Jiang Sham M. Kakade <p></p></li><li><a href="https://arxiv.org/abs/1811.08540"><strong>Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches</strong></a><strong> <br></strong>Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford <p></p></li><li><a href="https://arxiv.org/abs/1905.00360"><strong>Information-Theoretic Considerations in Batch Reinforcement Learning</strong></a><strong> <br></strong>Jinglin Chen, Nan Jiang </li></ul><p> <br><strong>Additional References </strong></p><ul><li><a href="http://rbr.cs.umass.edu/aimath06/proceedings/P21.pdf">Towards a Unified Theory of State Abstraction for MDPs</a>, Lihong Li, Thomas J. Walsh, Michael L. Littman  </li><li><a href="https://arxiv.org/abs/1511.03722">Doubly Robust Off-policy Value Evaluation for Reinforcement Learning</a>, Nan Jiang, Lihong Li </li><li><a href="https://arxiv.org/abs/2002.02081">Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization</a>, Nan Jiang, Jiawei Huang </li><li><a href="https://arxiv.org/abs/1911.06854">Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning</a>, Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue </li></ul><p><br></p><p><strong>Errata </strong></p><ul><li>[Robin] I misspoke when I said in domain randomization we want the agent to "ignore" domain parameters.  What I should have said is, we want the agent to perform well within some range of domain parameters, it should be robust with respect to domain parameters. </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://nanjiang.cs.illinois.edu/">Nan Jiang</a> is an Assistant Professor of Computer Science at University of Illinois.  He was a Postdoc Microsoft Research, and did his PhD at University of Michigan under Professor Satinder Singh. </p><p><br><strong>Featured References </strong></p><ul><li><a href="https://rltheorybook.github.io/"><strong>Reinforcement Learning: Theory and Algorithms</strong></a><strong> <br></strong>Alekh Agarwal Nan Jiang Sham M. Kakade <p></p></li><li><a href="https://arxiv.org/abs/1811.08540"><strong>Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches</strong></a><strong> <br></strong>Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford <p></p></li><li><a href="https://arxiv.org/abs/1905.00360"><strong>Information-Theoretic Considerations in Batch Reinforcement Learning</strong></a><strong> <br></strong>Jinglin Chen, Nan Jiang </li></ul><p> <br><strong>Additional References </strong></p><ul><li><a href="http://rbr.cs.umass.edu/aimath06/proceedings/P21.pdf">Towards a Unified Theory of State Abstraction for MDPs</a>, Lihong Li, Thomas J. Walsh, Michael L. Littman  </li><li><a href="https://arxiv.org/abs/1511.03722">Doubly Robust Off-policy Value Evaluation for Reinforcement Learning</a>, Nan Jiang, Lihong Li </li><li><a href="https://arxiv.org/abs/2002.02081">Minimax Confidence Interval for Off-Policy Evaluation and Policy Optimization</a>, Nan Jiang, Jiawei Huang </li><li><a href="https://arxiv.org/abs/1911.06854">Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning</a>, Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue </li></ul><p><br></p><p><strong>Errata </strong></p><ul><li>[Robin] I misspoke when I said in domain randomization we want the agent to "ignore" domain parameters.  What I should have said is, we want the agent to perform well within some range of domain parameters, it should be robust with respect to domain parameters. </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 06 Jul 2020 08:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/fbdbe373/a07b5ff3.mp3" length="60370486" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/CsnZQyhZcDmEtcUJw-8EM9mklbz3fIL3aodsJqPUql0/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzI4Nzc2MC8x/NjMyNzgwNjIxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>4306</itunes:duration>
<itunes:summary>Nan Jiang takes us deep into Model-based vs Model-free RL, Sim vs Real, Evaluation &amp; Overfitting, RL Theory vs Practice and much more!</itunes:summary>
<itunes:subtitle>Nan Jiang takes us deep into Model-based vs Model-free RL, Sim vs Real, Evaluation &amp; Overfitting, RL Theory vs Practice and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Danijar Hafner</title>
      <itunes:episode>11</itunes:episode>
      <podcast:episode>11</podcast:episode>
      <itunes:title>Danijar Hafner</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">0c8065a8-25cf-4cfc-b7d8-7d1b50297ee3</guid>
      <link>https://share.transistor.fm/s/03a60878</link>
      <description>
        <![CDATA[<p><a href="https://danijar.com/">Danijar Hafner</a> is a PhD student at the University of Toronto, and a student researcher at Google Research, Brain Team and the Vector Institute.  He holds a Masters of Research from University College London. </p><p><strong>Featured References </strong></p><ul><li><a href="https://www.nature.com/articles/s41593-019-0520-2.epdf?shared_access_token=n1zyUZ6-ypeHWkeaEs1FPNRgN0jAjWel9jnR3ZoTv0N5dsTXXcjpcGP7i54eL_L9GTMgy1V6NUDPE4-SxE_8Ip1gIa5G35VU4LeqRZ56IGy5uMJKd6aUZ4JeYonqPfWkstTCNFgazGPl8xJGrQAvuw%3D%3D"><strong>A deep learning framework for neuroscience</strong></a><strong> </strong><br>Blake A. Richards, Timothy P. Lillicrap , Philippe Beaudoin, Yoshua Bengio, Rafal Bogacz, Amelia Christensen, Claudia Clopath, Rui Ponte Costa, Archy de Berker, Surya Ganguli, Colleen J. Gillon , Danijar Hafner, Adam Kepecs, Nikolaus Kriegeskorte, Peter Latham , Grace W. Lindsay, Kenneth D. Miller , Richard Naud , Christopher C. Pack, Panayiota Poirazi , Pieter Roelfsema , João Sacramento, Andrew Saxe, Benjamin Scellier, Anna C. Schapiro , Walter Senn, Greg Wayne, Daniel Yamins, Friedemann Zenke, Joel Zylberberg, Denis Therien, Konrad P. Kording </li><li><a href="https://arxiv.org/abs/1811.04551"><strong>Learning Latent Dynamics for Planning from Pixels</strong></a><strong> </strong><br>Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson </li><li><a href="https://arxiv.org/abs/1912.01603"><strong>Dream to Control: Learning Behaviors by Latent Imagination</strong></a><strong> </strong><br>Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi </li><li><a href="https://arxiv.org/abs/2005.05960"><strong>Planning to Explore via Self-Supervised World Models</strong></a><strong> </strong><br>Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak </li></ul><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/1911.08265">Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model</a> Schrittwieser et al </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> Silver et al </li><li><a href="https://arxiv.org/abs/1906.09237">Shaping Belief States with Generative Environment Models for RL</a>  Gregor et al </li><li><a href="https://arxiv.org/abs/1810.12162">Model-Based Active Exploration</a> Shyam et al </li></ul><p> <br><strong>Errata </strong></p><ul><li>[Robin] Around 1:37 I say <em>"some ... world models get confused by random noise".</em> I meant "some curiosity formulations", not "world models" </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://danijar.com/">Danijar Hafner</a> is a PhD student at the University of Toronto, and a student researcher at Google Research, Brain Team and the Vector Institute.  He holds a Masters of Research from University College London. </p><p><strong>Featured References </strong></p><ul><li><a href="https://www.nature.com/articles/s41593-019-0520-2.epdf?shared_access_token=n1zyUZ6-ypeHWkeaEs1FPNRgN0jAjWel9jnR3ZoTv0N5dsTXXcjpcGP7i54eL_L9GTMgy1V6NUDPE4-SxE_8Ip1gIa5G35VU4LeqRZ56IGy5uMJKd6aUZ4JeYonqPfWkstTCNFgazGPl8xJGrQAvuw%3D%3D"><strong>A deep learning framework for neuroscience</strong></a><strong> </strong><br>Blake A. Richards, Timothy P. Lillicrap , Philippe Beaudoin, Yoshua Bengio, Rafal Bogacz, Amelia Christensen, Claudia Clopath, Rui Ponte Costa, Archy de Berker, Surya Ganguli, Colleen J. Gillon , Danijar Hafner, Adam Kepecs, Nikolaus Kriegeskorte, Peter Latham , Grace W. Lindsay, Kenneth D. Miller , Richard Naud , Christopher C. Pack, Panayiota Poirazi , Pieter Roelfsema , João Sacramento, Andrew Saxe, Benjamin Scellier, Anna C. Schapiro , Walter Senn, Greg Wayne, Daniel Yamins, Friedemann Zenke, Joel Zylberberg, Denis Therien, Konrad P. Kording </li><li><a href="https://arxiv.org/abs/1811.04551"><strong>Learning Latent Dynamics for Planning from Pixels</strong></a><strong> </strong><br>Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson </li><li><a href="https://arxiv.org/abs/1912.01603"><strong>Dream to Control: Learning Behaviors by Latent Imagination</strong></a><strong> </strong><br>Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi </li><li><a href="https://arxiv.org/abs/2005.05960"><strong>Planning to Explore via Self-Supervised World Models</strong></a><strong> </strong><br>Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak </li></ul><p><br></p><p><strong>Additional References</strong></p><ul><li><a href="https://arxiv.org/abs/1911.08265">Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model</a> Schrittwieser et al </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> Silver et al </li><li><a href="https://arxiv.org/abs/1906.09237">Shaping Belief States with Generative Environment Models for RL</a>  Gregor et al </li><li><a href="https://arxiv.org/abs/1810.12162">Model-Based Active Exploration</a> Shyam et al </li></ul><p> <br><strong>Errata </strong></p><ul><li>[Robin] Around 1:37 I say <em>"some ... world models get confused by random noise".</em> I meant "some curiosity formulations", not "world models" </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Thu, 14 May 2020 03:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/03a60878/0d3bd30e.mp3" length="74857614" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/vitVVvMCJvThB3PtLFDctb82DZRQHz2aH502le2qMgo/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9kNzJh/OGYzMWUyYTFiMDNi/YTE3N2VjMDRlNjVk/ZWQzOC53ZWJw.jpg"/>
      <itunes:duration>7229</itunes:duration>
<itunes:summary>Danijar Hafner takes us on an odyssey through deep learning &amp; neuroscience, PlaNet, Dreamer, world models, latent dynamics, curious agents, and more!</itunes:summary>
<itunes:subtitle>Danijar Hafner takes us on an odyssey through deep learning &amp; neuroscience, PlaNet, Dreamer, world models, latent dynamics, curious agents, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
      <podcast:transcript url="https://share.transistor.fm/s/03a60878/transcript.txt" type="text/plain"/>
    </item>
    <item>
      <title>Csaba Szepesvari</title>
      <itunes:episode>10</itunes:episode>
      <podcast:episode>10</podcast:episode>
      <itunes:title>Csaba Szepesvari</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cfc35e8a-9c74-4257-bf67-ab18d1ae3dc3</guid>
      <link>https://share.transistor.fm/s/d60b55ce</link>
      <description>
        <![CDATA[<p><a href="https://sites.ualberta.ca/~szepesva/">Csaba Szepesvari</a> is: </p><ul><li>Head of the Foundations Team at DeepMind </li><li>Professor of Computer Science at the University of Alberta </li><li>Canada CIFAR AI Chair </li><li>Fellow at the Alberta Machine Intelligence Institute  </li><li>Co-Author of the book <a href="https://tor-lattimore.com/downloads/book/book.pdf">Bandit Algorithms</a> along with Tor Lattimore, and author of the book <a href="https://sites.ualberta.ca/~szepesva/RLBook.html">Algorithms for Reinforcement Learning</a> </li></ul><p><strong>References </strong></p><ul><li><a href="https://dl.acm.org/doi/10.1007/11871842_29">Bandit based monte-carlo planning</a>, Levente Kocsis, Csaba Szepesvári </li><li><a href="https://tor-lattimore.com/downloads/book/book.pdf">Bandit Algorithms</a>, Tor Lattimore, Csaba Szepesvári </li><li><a href="https://sites.ualberta.ca/~szepesva/RLBook.html">Algorithms for Reinforcement Learning</a>, Csaba Szepesvári </li><li><a href="https://arxiv.org/abs/1612.08810">The Predictron: End-To-End Learning and Planning</a>, David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris </li><li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.1701">A Bayesian framework for reinforcement learning</a>, Strens </li><li><a href="https://openai.com/blog/solving-rubiks-cube/">Solving Rubik’s Cube with a Robot Hand</a> ; <a href="https://arxiv.org/abs/1910.07113">Paper</a>, OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, Lei Zhang </li><li><a href="https://epubs.siam.org/doi/abs/10.1137/S0097539701398375">The Nonstochastic Multiarmed Bandit Problem</a>, Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire </li><li><a href="https://slideslive.com/38923183/deep-learning-with-bayesian-principles">Deep Learning with Bayesian Principles</a>, Mohammad Emtiyaz Khan </li><li><a href="https://arxiv.org/abs/1906.05433">Tackling climate change with Machine Learning</a> David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://sites.ualberta.ca/~szepesva/">Csaba Szepesvari</a> is: </p><ul><li>Head of the Foundations Team at DeepMind </li><li>Professor of Computer Science at the University of Alberta </li><li>Canada CIFAR AI Chair </li><li>Fellow at the Alberta Machine Intelligence Institute  </li><li>Co-Author of the book <a href="https://tor-lattimore.com/downloads/book/book.pdf">Bandit Algorithms</a> along with Tor Lattimore, and author of the book <a href="https://sites.ualberta.ca/~szepesva/RLBook.html">Algorithms for Reinforcement Learning</a> </li></ul><p><strong>References </strong></p><ul><li><a href="https://dl.acm.org/doi/10.1007/11871842_29">Bandit based monte-carlo planning</a>, Levente Kocsis, Csaba Szepesvári </li><li><a href="https://tor-lattimore.com/downloads/book/book.pdf">Bandit Algorithms</a>, Tor Lattimore, Csaba Szepesvári </li><li><a href="https://sites.ualberta.ca/~szepesva/RLBook.html">Algorithms for Reinforcement Learning</a>, Csaba Szepesvári </li><li><a href="https://arxiv.org/abs/1612.08810">The Predictron: End-To-End Learning and Planning</a>, David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris </li><li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.1701">A Bayesian framework for reinforcement learning</a>, Strens </li><li><a href="https://openai.com/blog/solving-rubiks-cube/">Solving Rubik’s Cube with a Robot Hand</a> ; <a href="https://arxiv.org/abs/1910.07113">Paper</a>, OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, Lei Zhang </li><li><a href="https://epubs.siam.org/doi/abs/10.1137/S0097539701398375">The Nonstochastic Multiarmed Bandit Problem</a>, Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire </li><li><a href="https://slideslive.com/38923183/deep-learning-with-bayesian-principles">Deep Learning with Bayesian Principles</a>, Mohammad Emtiyaz Khan </li><li><a href="https://arxiv.org/abs/1906.05433">Tackling climate change with Machine Learning</a> David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Sun, 05 Apr 2020 09:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/d60b55ce/f3d46a86.mp3" length="41002622" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/igyhAE99ObH-oBfGKoMk-lMtsaKa8_rKJbJp324udU0/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzIxNjU5My8x/NjMyODYxNTkyLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2922</itunes:duration>
      <itunes:summary>Csaba Szepesvari of DeepMind shares his views on Bandits, Adversaries, PUCT in AlphaGo / AlphaZero / MuZero, AGI and RL, what is timeless, and more!</itunes:summary>
      <itunes:subtitle>Csaba Szepesvari of DeepMind shares his views on Bandits, Adversaries, PUCT in AlphaGo / AlphaZero / MuZero, AGI and RL, what is timeless, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Ben Eysenbach</title>
      <itunes:episode>9</itunes:episode>
      <podcast:episode>9</podcast:episode>
      <itunes:title>Ben Eysenbach</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">836a3261-56b6-455b-a226-c6f7c9e24f6e</guid>
      <link>https://share.transistor.fm/s/c12f63c1</link>
      <description>
        <![CDATA[<p><a href="https://ben-eysenbach.github.io/">Ben Eysenbach</a> is a PhD student in the <a href="https://www.ml.cmu.edu/">Machine Learning Department</a> at Carnegie Mellon University.  He was a Resident at Google Brain, and studied math and computer science at MIT. He co-founded the <a href="https://sites.google.com/view/erl-2019/home">ICML Exploration in Reinforcement Learning workshop</a>.  </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.06070">Diversity is All You Need: Learning Skills without a Reward Function</a> <br>Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine </p><p><a href="https://arxiv.org/abs/1906.05253">Search on the Replay Buffer: Bridging Planning and Reinforcement Learning <br></a>Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1908.03568">Behaviour Suite for Reinforcement Learning</a>, Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt </li><li><a href="https://arxiv.org/abs/1903.01973">Learning Latent Plans from Play</a>, Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet </li><li><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> </li><li><a href="https://cs.stanford.edu/people/ebrun/">Emma Brunskill</a> </li><li><a href="https://www.nature.com/articles/s41586-020-1994-5">Closed-loop optimization of fast-charging protocols for batteries with machine learning</a>,  Peter Attia, Aditya Grover, Norman Jin, Kristen Severson, Todor Markov, Yang-Hung Liao, Michael Chen, Bryan Cheong, Nicholas Perkins, Zi Yang, Patrick Herring, Muratahan Aykol, Stephen Harris, Richard Braatz, Stefano Ermon, William Chueh </li><li><a href="https://cmudeeprl.github.io/703website/">CMU 10-703 Deep Reinforcement Learning</a>, Fall 2019, Carnegie Mellon University </li><li><a href="https://sites.google.com/view/erl-2019/home">ICML Exploration in Reinforcement Learning workshop</a> </li></ul><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://ben-eysenbach.github.io/">Ben Eysenbach</a> is a PhD student in the <a href="https://www.ml.cmu.edu/">Machine Learning Department</a> at Carnegie Mellon University.  He was a Resident at Google Brain, and studied math and computer science at MIT. He co-founded the <a href="https://sites.google.com/view/erl-2019/home">ICML Exploration in Reinforcement Learning workshop</a>.  </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.06070">Diversity is All You Need: Learning Skills without a Reward Function</a> <br>Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine </p><p><a href="https://arxiv.org/abs/1906.05253">Search on the Replay Buffer: Bridging Planning and Reinforcement Learning <br></a>Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1908.03568">Behaviour Suite for Reinforcement Learning</a>, Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt </li><li><a href="https://arxiv.org/abs/1903.01973">Learning Latent Plans from Play</a>, Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet </li><li><a href="https://finale.seas.harvard.edu/">Finale Doshi-Velez</a> </li><li><a href="https://cs.stanford.edu/people/ebrun/">Emma Brunskill</a> </li><li><a href="https://www.nature.com/articles/s41586-020-1994-5">Closed-loop optimization of fast-charging protocols for batteries with machine learning</a>,  Peter Attia, Aditya Grover, Norman Jin, Kristen Severson, Todor Markov, Yang-Hung Liao, Michael Chen, Bryan Cheong, Nicholas Perkins, Zi Yang, Patrick Herring, Muratahan Aykol, Stephen Harris, Richard Braatz, Stefano Ermon, William Chueh </li><li><a href="https://cmudeeprl.github.io/703website/">CMU 10-703 Deep Reinforcement Learning</a>, Fall 2019, Carnegie Mellon University </li><li><a href="https://sites.google.com/view/erl-2019/home">ICML Exploration in Reinforcement Learning workshop</a> </li></ul><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 30 Mar 2020 09:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/c12f63c1/c2d1376a.mp3" length="41494501" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/iSIO39WMoeY50XL7-xI2FpPxVlQFCA0xOKQWYHwpOv4/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzIxNzAxMC8x/NjMyNzg0NDQxLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2958</itunes:duration>
      <itunes:summary>Ben Eysenbach schools us on human supervision, SORB, DIAYN, techniques for exploration, teaching RL, virtual conferences, and much more!</itunes:summary>
      <itunes:subtitle>Ben Eysenbach schools us on human supervision, SORB, DIAYN, techniques for exploration, teaching RL, virtual conferences, and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>NeurIPS 2019 Deep RL Workshop</title>
      <itunes:episode>8</itunes:episode>
      <podcast:episode>8</podcast:episode>
      <itunes:title>NeurIPS 2019 Deep RL Workshop</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">183ee11e-5dfb-4082-92f6-84cb4abfcf9a</guid>
      <link>https://share.transistor.fm/s/ece23fd8</link>
      <description>
<![CDATA[<p>Thank you to all the presenters that participated. I covered as many as I could given the time and crowds; if you were not included and wish to be, please email talkrl@pathwayi.com </p><p>More details on the official <a href="https://sites.google.com/view/deep-rl-workshop-neurips-2019/home">NeurIPS Deep RL Workshop site</a>. </p><ul><li>0:23  <a href="https://drive.google.com/file/d/1aUY63fjl7MxRRGT-PJwjTi1QFWCQcFoG/view?usp=drivesdk">Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms</a>; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of Liège); Marco Wiering (University of Groningen) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1909.01779&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEGuYUgp7Qw-WsM2a02HepP18hM9w">[external pdf link]</a> </li><li>4:16  <a href="https://drive.google.com/file/d/10oG4X8cpvVa3TapxhTRZPGYzWk6EAuKm/view?usp=drivesdk">Single Deep Counterfactual Regret Minimization</a>; Eric Steinberger (University of Cambridge). </li><li>5:38  <a href="https://drive.google.com/file/d/1P7BQqOTPGzPf_RPfxirMFGmnyaIVqkrf/view?usp=drivesdk">On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER</a>; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria). </li><li>9:33  <a href="https://drive.google.com/file/d/1IhARUnbaFkVswI-BtuNP67qQrVZyPns1/view?usp=drivesdk">Objective Mismatch in Model-based Reinforcement Learning</a>; Nathan Lambert (UC Berkeley); Brandon Amos (Facebook); Omry Yadan (Facebook); Roberto Calandra (Facebook). </li><li>10:51  <a href="https://drive.google.com/file/d/17EEcdMvR6HGK64W-2B_3rtDdl_1Lokjd/view?usp=drivesdk">Option Discovery using Deep Skill Chaining</a>; Akhil Bagaria (Brown University); George Konidaris (Brown University). </li><li>13:44  <a href="https://drive.google.com/file/d/18dLTjFt5fCXoYRjRSytunh876Y2oVFm4/view?usp=drivesdk">Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware</a>; Kirill Polzounov (University of Calgary); Ramitha Sundar (Blue River Technology); Lee Reden (Blue River Technology). </li><li>14:52  <a href="https://drive.google.com/file/d/1BZ2MlrBIS26TSEG4tYpc62XK63MG0jlm/view?usp=drivesdk">LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games</a>; Leonard Adolphs (ETHZ); Thomas Hofmann (ETH Zurich). </li><li>16:30  <a href="https://drive.google.com/file/d/1uP1luedMimy2apnmf9Pa8XgpCSIM0Uln/view?usp=drivesdk">Accelerating Training in Pommerman with Imitation and Reinforcement Learning</a>; Hardik Meisheri (TCS Research); Omkar Shelke (TCS Research); Richa Verma (TCS Research); Harshad Khadilkar (TCS Research). 
</li><li>17:27  <a href="https://drive.google.com/file/d/10oS8c1VdtlykxvzgatOhggoBoLgxBbCt/view?usp=drivesdk">Dream to Control: Learning Behaviors by Latent Imagination</a>; Danijar Hafner (Google); Timothy Lillicrap (DeepMind); Jimmy Ba (University of Toronto); Mohammad Norouzi (Google Brain) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fpdf%2F1912.01603.pdf&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFfHeKZ4JKokibBgUXB_YPibeLV7g">[external pdf link]</a>. </li><li>20:48  <a href="https://drive.google.com/file/d/1nUqS4qTqLtNo19rjVGW_CHLAGjTYeBlW/view?usp=drivesdk">Adaptive Temperature Tuning for Mellowmax in Deep Reinforcement Learning</a>; Seungchan Kim (Brown University); George Konidaris (Brown). </li><li>22:05  <a href="https://drive.google.com/file/d/13lNSV936kyX7uzpMB2K1uqfO4kHxLcdy/view?usp=drivesdk">Meta-learning curiosity algorithms</a>; Ferran Alet (MIT); Martin Schneider (MIT); Tomas Lozano-Perez (MIT); Leslie Kaelbling (MIT). </li><li>24:09  <a href="https://drive.google.com/file/d/1BRgvMln6xNETPcSlw1zBnK9tbPMyy8EZ/view?usp=drivesdk">Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards</a>; Xingyu Lu (Berkeley); Stas Tiomkin (BAIR, UC Berkeley); Pieter Abbeel (UC Berkeley). </li><li>25:44   <a href="https://drive.google.com/file/d/1V5EgqGy4LCyTnk-KoGeGJnNtu3Q6jpC3/view?usp=drivesdk">Swarm-inspired Reinforcement Learning via Collaborative Inter-agent Knowledge Distillation</a>; Zhang-Wei Hong (Preferred Networks); Prabhat Nagarajan (Preferred Networks); Guilherme Maeda (Preferred Networks). </li><li>26:35  <a href="https://drive.google.com/file/d/1W6yuUZM-v6o4VgUFayD3YoNg_-Lj156i/view?usp=drivesdk">Multiplayer AlphaZero</a>; Nicholas Petosa (Georgia Institute of Technology); Tucker Balch (Ga Tech) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.13012&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNH3945STOW-xNQKetB70So_We-QHg">[external pdf link]</a>. </li><li>27:43  <a href="https://drive.google.com/file/d/1T8JwuENMaDjGSxqr6M2xW67Y0a9ACFD7/view?usp=drivesdk">Prioritized Sequence Experience Replay</a>; Marc Brittain (Iowa State University); Joshua Bertram (Iowa State University); Xuxi Yang (Iowa State University); Peng Wei (Iowa State University) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1905.12726&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFyBrotqKpnDtzwItikmFe42et9vA">[external pdf link]</a>. </li><li>29:14  <a href="https://drive.google.com/file/d/1i-iaBcEcrR9iScJhDew1TFcRUTXnwlga/view?usp=drivesdk">Recurrent neural-linear posterior sampling for non-stationary bandits</a>; Paulo Rauber (IDSIA); Aditya Ramesh (USI); Jürgen Schmidhuber (IDSIA - Lugano). </li><li>29:36  <a href="https://drive.google.com/file/d/1aLIB63k2OR8HkSY3MY0vXJWqXfzh1jhX/view?usp=drivesdk">Improving Evolutionary Strategies With Past Descent Directions</a>; Asier Mujika (ETH Zurich); Florian Meier (ETH Zurich); Marcelo Matheus Gauy (ETH Zurich); Angelika Steger (ETH Zurich) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.05268&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEmCZ2eiAPlFfUw0VSwUap95Ak_eA">[external pdf link]</a>. 
</li><li>31:40  <a href="https://drive.google.com/file/d/1_IyylMBYnQfpS80dQ38lsSC7yC_oiedH/view?usp=drivesdk">ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations</a>; Daniel Seita (University of California, Berkeley); David Chan (University of California, Berkeley); Roshan Rao (UC Berkeley); Chen Tang (UC Berkeley); Mandi Zhao (UC Berkeley); John Canny (UC Berkeley) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.12154&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFDwzljC2GWQ-DhfSeTogUsDrX7xg">[external pdf link]</a>. </li><li>33:05  <a href="https://drive.google.com/file/d/1rMFuE7mNQQk2-ut7qcFhKolaOjRE-dfR/view?usp=drivesdk">Bottom-Up Meta-Policy Search</a>; Luckeciano Melo (Aeronautics Institute of Technology); Marcos Máximo (Aeronautics Institute of Technology); Adilson Cunha (Aeronautics Institute of Technology) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.10232&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHiUlq_Q94qUONidmLRsGnOJAvRCg">[external pdf link]</a>. </li><li>33:37  <a href="https://drive.google.com/file/d/1NNrv7fj0R4XWc_u_D4fP6EX_N2N8R_U0/view?usp=sharing">MERL: Multi-Head Reinforcement Learning</a>; Yannis Flet-Berliac (University of Lille / Inria); Philippe Preux (INRIA) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1909.11939&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHX_MOTNfoJ5xza-o7xAUbbsomUQw">[external pdf link]</a>. </li><li>35:30  <a href="https://drive.google.com/file/d/1Xuis8jh5R5vKzA_TLY3JvaUcKjJ2dLQ2/view?usp=drivesdk">Emergen...</a></li></ul>]]>
      </description>
      <content:encoded>
<![CDATA[<p>Thank you to all the presenters that participated. I covered as many as I could given the time and crowds; if you were not included and wish to be, please email talkrl@pathwayi.com </p><p>More details on the official <a href="https://sites.google.com/view/deep-rl-workshop-neurips-2019/home">NeurIPS Deep RL Workshop site</a>. </p><ul><li>0:23  <a href="https://drive.google.com/file/d/1aUY63fjl7MxRRGT-PJwjTi1QFWCQcFoG/view?usp=drivesdk">Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms</a>; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of Liège); Marco Wiering (University of Groningen) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1909.01779&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEGuYUgp7Qw-WsM2a02HepP18hM9w">[external pdf link]</a> </li><li>4:16  <a href="https://drive.google.com/file/d/10oG4X8cpvVa3TapxhTRZPGYzWk6EAuKm/view?usp=drivesdk">Single Deep Counterfactual Regret Minimization</a>; Eric Steinberger (University of Cambridge). </li><li>5:38  <a href="https://drive.google.com/file/d/1P7BQqOTPGzPf_RPfxirMFGmnyaIVqkrf/view?usp=drivesdk">On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER</a>; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria). </li><li>9:33  <a href="https://drive.google.com/file/d/1IhARUnbaFkVswI-BtuNP67qQrVZyPns1/view?usp=drivesdk">Objective Mismatch in Model-based Reinforcement Learning</a>; Nathan Lambert (UC Berkeley); Brandon Amos (Facebook); Omry Yadan (Facebook); Roberto Calandra (Facebook). </li><li>10:51  <a href="https://drive.google.com/file/d/17EEcdMvR6HGK64W-2B_3rtDdl_1Lokjd/view?usp=drivesdk">Option Discovery using Deep Skill Chaining</a>; Akhil Bagaria (Brown University); George Konidaris (Brown University). </li><li>13:44  <a href="https://drive.google.com/file/d/18dLTjFt5fCXoYRjRSytunh876Y2oVFm4/view?usp=drivesdk">Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware</a>; Kirill Polzounov (University of Calgary); Ramitha Sundar (Blue River Technology); Lee Reden (Blue River Technology). </li><li>14:52  <a href="https://drive.google.com/file/d/1BZ2MlrBIS26TSEG4tYpc62XK63MG0jlm/view?usp=drivesdk">LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games</a>; Leonard Adolphs (ETHZ); Thomas Hofmann (ETH Zurich). </li><li>16:30  <a href="https://drive.google.com/file/d/1uP1luedMimy2apnmf9Pa8XgpCSIM0Uln/view?usp=drivesdk">Accelerating Training in Pommerman with Imitation and Reinforcement Learning</a>; Hardik Meisheri (TCS Research); Omkar Shelke (TCS Research); Richa Verma (TCS Research); Harshad Khadilkar (TCS Research). 
</li><li>17:27  <a href="https://drive.google.com/file/d/10oS8c1VdtlykxvzgatOhggoBoLgxBbCt/view?usp=drivesdk">Dream to Control: Learning Behaviors by Latent Imagination</a>; Danijar Hafner (Google); Timothy Lillicrap (DeepMind); Jimmy Ba (University of Toronto); Mohammad Norouzi (Google Brain) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fpdf%2F1912.01603.pdf&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFfHeKZ4JKokibBgUXB_YPibeLV7g">[external pdf link]</a>. </li><li>20:48  <a href="https://drive.google.com/file/d/1nUqS4qTqLtNo19rjVGW_CHLAGjTYeBlW/view?usp=drivesdk">Adaptive Temperature Tuning for Mellowmax in Deep Reinforcement Learning</a>; Seungchan Kim (Brown University); George Konidaris (Brown). </li><li>22:05  <a href="https://drive.google.com/file/d/13lNSV936kyX7uzpMB2K1uqfO4kHxLcdy/view?usp=drivesdk">Meta-learning curiosity algorithms</a>; Ferran Alet (MIT); Martin Schneider (MIT); Tomas Lozano-Perez (MIT); Leslie Kaelbling (MIT). </li><li>24:09  <a href="https://drive.google.com/file/d/1BRgvMln6xNETPcSlw1zBnK9tbPMyy8EZ/view?usp=drivesdk">Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards</a>; Xingyu Lu (Berkeley); Stas Tiomkin (BAIR, UC Berkeley); Pieter Abbeel (UC Berkeley). </li><li>25:44   <a href="https://drive.google.com/file/d/1V5EgqGy4LCyTnk-KoGeGJnNtu3Q6jpC3/view?usp=drivesdk">Swarm-inspired Reinforcement Learning via Collaborative Inter-agent Knowledge Distillation</a>; Zhang-Wei Hong (Preferred Networks); Prabhat Nagarajan (Preferred Networks); Guilherme Maeda (Preferred Networks). </li><li>26:35  <a href="https://drive.google.com/file/d/1W6yuUZM-v6o4VgUFayD3YoNg_-Lj156i/view?usp=drivesdk">Multiplayer AlphaZero</a>; Nicholas Petosa (Georgia Institute of Technology); Tucker Balch (Ga Tech) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.13012&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNH3945STOW-xNQKetB70So_We-QHg">[external pdf link]</a>. </li><li>27:43  <a href="https://drive.google.com/file/d/1T8JwuENMaDjGSxqr6M2xW67Y0a9ACFD7/view?usp=drivesdk">Prioritized Sequence Experience Replay</a>; Marc Brittain (Iowa State University); Joshua Bertram (Iowa State University); Xuxi Yang (Iowa State University); Peng Wei (Iowa State University) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1905.12726&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFyBrotqKpnDtzwItikmFe42et9vA">[external pdf link]</a>. </li><li>29:14  <a href="https://drive.google.com/file/d/1i-iaBcEcrR9iScJhDew1TFcRUTXnwlga/view?usp=drivesdk">Recurrent neural-linear posterior sampling for non-stationary bandits</a>; Paulo Rauber (IDSIA); Aditya Ramesh (USI); Jürgen Schmidhuber (IDSIA - Lugano). </li><li>29:36  <a href="https://drive.google.com/file/d/1aLIB63k2OR8HkSY3MY0vXJWqXfzh1jhX/view?usp=drivesdk">Improving Evolutionary Strategies With Past Descent Directions</a>; Asier Mujika (ETH Zurich); Florian Meier (ETH Zurich); Marcelo Matheus Gauy (ETH Zurich); Angelika Steger (ETH Zurich) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.05268&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEmCZ2eiAPlFfUw0VSwUap95Ak_eA">[external pdf link]</a>. 
</li><li>31:40  <a href="https://drive.google.com/file/d/1_IyylMBYnQfpS80dQ38lsSC7yC_oiedH/view?usp=drivesdk">ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations</a>; Daniel Seita (University of California, Berkeley); David Chan (University of California, Berkeley); Roshan Rao (UC Berkeley); Chen Tang (UC Berkeley); Mandi Zhao (UC Berkeley); John Canny (UC Berkeley) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.12154&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNFDwzljC2GWQ-DhfSeTogUsDrX7xg">[external pdf link]</a>. </li><li>33:05  <a href="https://drive.google.com/file/d/1rMFuE7mNQQk2-ut7qcFhKolaOjRE-dfR/view?usp=drivesdk">Bottom-Up Meta-Policy Search</a>; Luckeciano Melo (Aeronautics Institute of Technology); Marcos Máximo (Aeronautics Institute of Technology); Adilson Cunha (Aeronautics Institute of Technology) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1910.10232&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHiUlq_Q94qUONidmLRsGnOJAvRCg">[external pdf link]</a>. </li><li>33:37  <a href="https://drive.google.com/file/d/1NNrv7fj0R4XWc_u_D4fP6EX_N2N8R_U0/view?usp=sharing">MERL: Multi-Head Reinforcement Learning</a>; Yannis Flet-Berliac (University of Lille / Inria); Philippe Preux (INRIA) <a href="https://www.google.com/url?q=https%3A%2F%2Farxiv.org%2Fabs%2F1909.11939&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNHX_MOTNfoJ5xza-o7xAUbbsomUQw">[external pdf link]</a>. </li><li>35:30  <a href="https://drive.google.com/file/d/1Xuis8jh5R5vKzA_TLY3JvaUcKjJ2dLQ2/view?usp=drivesdk">Emergen...</a></li></ul>]]>
      </content:encoded>
      <pubDate>Thu, 19 Dec 2019 23:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/ece23fd8/1ed846c6.mp3" length="47339283" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>3378</itunes:duration>
      <itunes:summary>Hear directly from presenters at the NeurIPS 2019 Deep RL Workshop on their work!</itunes:summary>
      <itunes:subtitle>Hear directly from presenters at the NeurIPS 2019 Deep RL Workshop on their work!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Scott Fujimoto</title>
      <itunes:episode>7</itunes:episode>
      <podcast:episode>7</podcast:episode>
      <itunes:title>Scott Fujimoto</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">2d3f6f81-63b0-4f5f-976e-1be3782130c5</guid>
      <link>https://share.transistor.fm/s/a5c4b784</link>
      <description>
        <![CDATA[<p>Scott Fujimoto is a PhD student at McGill University and Mila. He is the author of TD3 as well as some of the recent developments in batch deep reinforcement learning.  </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.09477">Addressing Function Approximation Error in Actor-Critic Methods</a> <br>Scott Fujimoto, Herke van Hoof, David Meger </p><p><a href="https://arxiv.org/abs/1812.02900">Off-Policy Deep Reinforcement Learning without Exploration</a> </p><p>Scott Fujimoto, David Meger, Doina Precup </p><p><a href="https://arxiv.org/abs/1910.01708">Benchmarking Batch Deep Reinforcement Learning Algorithms</a> </p><p>Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1907.04543">Striving for Simplicity in Off-Policy Deep Reinforcement Learning</a> <br>Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi </li><li><a href="https://arxiv.org/abs/1801.01290">Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor</a> <br>Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine </li><li><a href="https://arxiv.org/abs/1907.00456">Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog <br></a>Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard </li><li><a href="https://arxiv.org/abs/1509.02971">Continuous control with deep reinforcement learning</a> <br>Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra </li><li><a href="https://arxiv.org/abs/1804.08617">Distributed Distributional Deterministic Policy Gradients</a> <br>Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Scott Fujimoto is a PhD student at McGill University and Mila. He is the author of TD3 as well as some of the recent developments in batch deep reinforcement learning.  </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.09477">Addressing Function Approximation Error in Actor-Critic Methods</a> <br>Scott Fujimoto, Herke van Hoof, David Meger </p><p><a href="https://arxiv.org/abs/1812.02900">Off-Policy Deep Reinforcement Learning without Exploration</a> </p><p>Scott Fujimoto, David Meger, Doina Precup </p><p><a href="https://arxiv.org/abs/1910.01708">Benchmarking Batch Deep Reinforcement Learning Algorithms</a> </p><p>Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1907.04543">Striving for Simplicity in Off-Policy Deep Reinforcement Learning</a> <br>Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi </li><li><a href="https://arxiv.org/abs/1801.01290">Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor</a> <br>Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine </li><li><a href="https://arxiv.org/abs/1907.00456">Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog <br></a>Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard </li><li><a href="https://arxiv.org/abs/1509.02971">Continuous control with deep reinforcement learning</a> <br>Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra </li><li><a href="https://arxiv.org/abs/1804.08617">Distributed Distributional Deterministic Policy Gradients</a> <br>Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 18 Nov 2019 22:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/a5c4b784/265f7db0.mp3" length="40645968" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/YWVK8RWGDDxcdFQ01cZbkIidwGu-0uIOTx8x0PmDmeU/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEyMjk3My8x/NjMzMTUyMDk0LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2897</itunes:duration>
      <itunes:summary>Scott Fujimoto expounds on his TD3 and BCQ algorithms, DDPG, Benchmarking Batch RL, and more!</itunes:summary>
      <itunes:subtitle>Scott Fujimoto expounds on his TD3 and BCQ algorithms, DDPG, Benchmarking Batch RL, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Jessica Hamrick</title>
      <itunes:episode>6</itunes:episode>
      <podcast:episode>6</podcast:episode>
      <itunes:title>Jessica Hamrick</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">e207d82b-f127-4b32-b609-c33e242934ab</guid>
      <link>https://share.transistor.fm/s/7b1b3d83</link>
      <description>
        <![CDATA[<p><a href="http://www.jesshamrick.com/">Dr. Jessica Hamrick</a> is a Research Scientist at DeepMind. She holds a PhD in Psychology from UC Berkeley. </p><p><br><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1904.03177">Structured agents for physical construction</a> <br>Victor Bapst, Alvaro Sanchez-Gonzalez, Carl Doersch, Kimberly L. Stachenfeld, Pushmeet Kohli, Peter W. Battaglia, Jessica B. Hamrick </p><p><a href="https://www.sciencedirect.com/science/article/pii/S2352154618301670">Analogues of mental simulation and imagination in deep learning</a> </p><p>Jessica Hamrick </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1705.02670">Metacontrol for Adaptive Imagination-Based Optimization</a> <br>Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia </li><li><a href="https://arxiv.org/abs/1806.05780">Surprising Negative Results for Generative Adversarial Tree Search</a>  <br>Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar </li><li><a href="https://escholarship.org/content/qt9tv951bd/qt9tv951bd.pdf">Metareasoning and Mental Simulation</a> <br>Jessica B. Hamrick </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> <br>David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis </li><li><a href="https://arxiv.org/abs/1910.14361">Object-oriented state editing for HRL</a> <br>Victor Bapst, Alvaro Sanchez-Gonzalez, Omar Shams, Kimberly Stachenfeld, Peter W. Battaglia, Satinder Singh, Jessica B. Hamrick </li><li><a href="https://arxiv.org/abs/1703.01161">FeUdal Networks for Hierarchical Reinforcement Learning</a> <br>Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu </li><li><a href="http://mlg.eng.cam.ac.uk/pub/pdf/DeiRas11.pdf">PILCO: A Model-Based and Data-Efficient Approach to Policy Search</a> <br>Marc Peter Deisenroth, Carl Edward Rasmussen </li><li><a href="https://arxiv.org/abs/1807.10553">Blueberry Earth</a> <br>Anders Sandberg </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="http://www.jesshamrick.com/">Dr. Jessica Hamrick</a> is a Research Scientist at DeepMind. She holds a PhD in Psychology from UC Berkeley. </p><p><br><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1904.03177">Structured agents for physical construction</a> <br>Victor Bapst, Alvaro Sanchez-Gonzalez, Carl Doersch, Kimberly L. Stachenfeld, Pushmeet Kohli, Peter W. Battaglia, Jessica B. Hamrick </p><p><a href="https://www.sciencedirect.com/science/article/pii/S2352154618301670">Analogues of mental simulation and imagination in deep learning</a> </p><p>Jessica Hamrick </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1705.02670">Metacontrol for Adaptive Imagination-Based Optimization</a> <br>Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia </li><li><a href="https://arxiv.org/abs/1806.05780">Surprising Negative Results for Generative Adversarial Tree Search</a>  <br>Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar </li><li><a href="https://escholarship.org/content/qt9tv951bd/qt9tv951bd.pdf">Metareasoning and Mental Simulation</a> <br>Jessica B. Hamrick </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> <br>David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis </li><li><a href="https://arxiv.org/abs/1910.14361">Object-oriented state editing for HRL</a> <br>Victor Bapst, Alvaro Sanchez-Gonzalez, Omar Shams, Kimberly Stachenfeld, Peter W. Battaglia, Satinder Singh, Jessica B. Hamrick </li><li><a href="https://arxiv.org/abs/1703.01161">FeUdal Networks for Hierarchical Reinforcement Learning</a> <br>Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu </li><li><a href="http://mlg.eng.cam.ac.uk/pub/pdf/DeiRas11.pdf">PILCO: A Model-Based and Data-Efficient Approach to Policy Search</a> <br>Marc Peter Deisenroth, Carl Edward Rasmussen </li><li><a href="https://arxiv.org/abs/1807.10553">Blueberry Earth</a> <br>Anders Sandberg </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 11 Nov 2019 21:00:00 -0800</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/7b1b3d83/6bf0a236.mp3" length="53603309" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/r6tcaNPjj8m3HGRc07iI0_9ic2LAkCcKozj8E8Qv6so/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzExNDU0Mi8x/NjMyODYxMTUzLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3823</itunes:duration>
      <itunes:summary>Jessica Hamrick sheds light on Model-based RL, Structured agents, Mental simulation, Metacontrol, Construction environments, Blueberries, and more!</itunes:summary>
      <itunes:subtitle>Jessica Hamrick sheds light on Model-based RL, Structured agents, Mental simulation, Metacontrol, Construction environments, Blueberries, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Pablo Samuel Castro</title>
      <itunes:episode>5</itunes:episode>
      <podcast:episode>5</podcast:episode>
      <itunes:title>Pablo Samuel Castro</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6c6c8d9b-17c0-4f9e-846b-1e40b5369c50</guid>
      <link>https://share.transistor.fm/s/4fe97ed3</link>
      <description>
        <![CDATA[<p><a href="https://scholar.google.com/citations?user=jn5r6TsAAAAJ&amp;hl=en">Dr Pablo Samuel Castro</a> is a Staff Research Software Engineer at Google Brain.  He is the main author of the <a href="https://github.com/google/dopamine">Dopamine RL framework</a>. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1901.11084">A Comparative Analysis of Expected and Distributional Reinforcement Learning</a> </p><p>Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare  </p><p><br><a href="https://arxiv.org/abs/1901.11530">A Geometric Perspective on Optimal Representations for Reinforcement Learning</a> </p><p>Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle </p><p><br><a href="https://arxiv.org/abs/1812.06110">Dopamine: A Research Framework for Deep Reinforcement Learning</a> <br>Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare </p><p><a href="https://github.com/google/dopamine">Dopamine RL framework</a> on github <br> </p><p><a href="https://github.com/tensorflow/agents">Tensorflow Agents</a> on github </p><p><strong>Additional References </strong></p><ul><li><a href="https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-392.pdf">Using Linear Programming for Bayesian Exploration in Markov Decision Processes</a> <br>Pablo Samuel Castro, Doina Precup </li><li><a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/viewFile/1907/2148">Using bisimulation for policy transfer in MDPs</a> <br>Pablo Samuel Castro, Doina Precup </li><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning <br></a>Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver </li><li><a href="https://arxiv.org/abs/1806.06923">Implicit Quantile Networks for Distributional Reinforcement Learning <br></a>Will Dabney, Georg Ostrovski, David Silver, Rémi Munos </li><li><a href="https://arxiv.org/abs/1707.06887">A Distributional Perspective on Reinforcement Learning</a> <br>Marc G. Bellemare, Will Dabney, Rémi Munos </li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://scholar.google.com/citations?user=jn5r6TsAAAAJ&amp;hl=en">Dr Pablo Samuel Castro</a> is a Staff Research Software Engineer at Google Brain.  He is the main author of the <a href="https://github.com/google/dopamine">Dopamine RL framework</a>. </p><p><br><strong>Featured References </strong></p><p><a href="https://arxiv.org/abs/1901.11084">A Comparative Analysis of Expected and Distributional Reinforcement Learning</a> </p><p>Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare  </p><p><br><a href="https://arxiv.org/abs/1901.11530">A Geometric Perspective on Optimal Representations for Reinforcement Learning</a> </p><p>Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle </p><p><br><a href="https://arxiv.org/abs/1812.06110">Dopamine: A Research Framework for Deep Reinforcement Learning</a> <br>Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare </p><p><a href="https://github.com/google/dopamine">Dopamine RL framework</a> on github <br> </p><p><a href="https://github.com/tensorflow/agents">Tensorflow Agents</a> on github </p><p><strong>Additional References </strong></p><ul><li><a href="https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-392.pdf">Using Linear Programming for Bayesian Exploration in Markov Decision Processes</a> <br>Pablo Samuel Castro, Doina Precup </li><li><a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/viewFile/1907/2148">Using bisimulation for policy transfer in MDPs</a> <br>Pablo Samuel Castro, Doina Precup </li><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning <br></a>Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver </li><li><a href="https://arxiv.org/abs/1806.06923">Implicit Quantile Networks for Distributional Reinforcement Learning <br></a>Will Dabney, Georg Ostrovski, David Silver, Rémi Munos </li><li><a href="https://arxiv.org/abs/1707.06887">A Distributional Perspective on Reinforcement Learning</a> <br>Marc G. Bellemare, Will Dabney, Rémi Munos </li></ul>]]>
      </content:encoded>
      <pubDate>Wed, 09 Oct 2019 20:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/4fe97ed3/ee4518c9.mp3" length="47677289" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/ZkY9Fe73wmalsJx2I0LgpIbvvH0PzGfz5PknrqycUAU/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzExNDU0MS8x/NjMyODYxNDExLWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>3399</itunes:duration>
      <itunes:summary>Pablo Samuel Castro drops in and drops knowledge on distributional RL, bisimulation, the Dopamine RL Framework, TF-Agents, and much more!</itunes:summary>
      <itunes:subtitle>Pablo Samuel Castro drops in and drops knowledge on distributional RL, bisimulation, the Dopamine RL Framework, TF-Agents, and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Kamyar Azizzadenesheli</title>
      <itunes:episode>4</itunes:episode>
      <podcast:episode>4</podcast:episode>
      <itunes:title>Kamyar Azizzadenesheli</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">b817bbf1-41f1-4c6c-bb9a-15a85b2500b7</guid>
      <link>https://share.transistor.fm/s/107698de</link>
      <description>
        <![CDATA[<p><a href="https://sites.google.com/view/kazizzad">Dr. Kamyar Azizzadenesheli</a> is a post-doctorate scholar at Caltech.  His research interest is mainly in the area of Machine Learning, from theory to practice, with the main focus in Reinforcement Learning.  He will be joining Purdue University as an Assistant CS Professor in Fall 2020. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.04412">Efficient Exploration through Bayesian Deep Q-Networks <br></a>Kamyar Azizzadenesheli, Animashree Anandkumar </p><p><a href="https://arxiv.org/abs/1806.05780">Surprising Negative Results for Generative Adversarial Tree Search <br></a>Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar </p><p><a href="https://openreview.net/forum?id=B1ghkEdzo4">Maybe a few considerations in Reinforcement Learning Research? <br></a>Kamyar Azizzadenesheli <br> </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1903.00374">Model-Based Reinforcement Learning for Atari</a>  <br>Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski </li><li><a href="http://www.jmlr.org/papers/volume11/jaksch10a/jaksch10a.pdf">Near-optimal Regret Bounds for Reinforcement Learning</a> <br>Thomas Jaksch, Ronald Ortner, Peter Auer </li><li><a href="https://ieeexplore.ieee.org/abstract/document/170605">Curious Model-Building Control Systems</a> <br>Jürgen Schmidhuber </li><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning</a>  <br>Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver </li><li><a href="https://arxiv.org/abs/1706.04317">Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics</a> <br>Ken Kansky, Tom Silver, David A. Mély, Mohamed Eldawy, Miguel Lázaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> <br>David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis </li></ul><p><br></p><p><br></p><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://sites.google.com/view/kazizzad">Dr. Kamyar Azizzadenesheli</a> is a post-doctorate scholar at Caltech.  His research interest is mainly in the area of Machine Learning, from theory to practice, with the main focus in Reinforcement Learning.  He will be joining Purdue University as an Assistant CS Professor in Fall 2020. </p><p><strong>Featured References <br></strong><br><a href="https://arxiv.org/abs/1802.04412">Efficient Exploration through Bayesian Deep Q-Networks <br></a>Kamyar Azizzadenesheli, Animashree Anandkumar </p><p><a href="https://arxiv.org/abs/1806.05780">Surprising Negative Results for Generative Adversarial Tree Search <br></a>Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar </p><p><a href="https://openreview.net/forum?id=B1ghkEdzo4">Maybe a few considerations in Reinforcement Learning Research? <br></a>Kamyar Azizzadenesheli <br> </p><p><strong>Additional References </strong></p><ul><li><a href="https://arxiv.org/abs/1903.00374">Model-Based Reinforcement Learning for Atari</a>  <br>Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski </li><li><a href="http://www.jmlr.org/papers/volume11/jaksch10a/jaksch10a.pdf">Near-optimal Regret Bounds for Reinforcement Learning</a> <br>Thomas Jaksch, Ronald Ortner, Peter Auer </li><li><a href="https://ieeexplore.ieee.org/abstract/document/170605">Curious Model-Building Control Systems</a> <br>Jürgen Schmidhuber </li><li><a href="https://arxiv.org/abs/1710.02298">Rainbow: Combining Improvements in Deep Reinforcement Learning</a>  <br>Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver </li><li><a href="https://arxiv.org/abs/1706.04317">Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics</a> <br>Ken Kansky, Tom Silver, David A. Mély, Mohamed Eldawy, Miguel Lázaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George </li><li><a href="https://arxiv.org/abs/1712.01815">Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm</a> <br>David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis </li></ul><p><br></p><p><br></p><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Fri, 20 Sep 2019 18:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/107698de/de92b927.mp3" length="72129654" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>5146</itunes:duration>
      <itunes:summary>Kamyar Azizzadenesheli brings us insight on Bayesian RL, Generative Adversarial Tree search, what goes into great RL papers, and much more!</itunes:summary>
      <itunes:subtitle>Kamyar Azizzadenesheli brings us insight on Bayesian RL, Generative Adversarial Tree search, what goes into great RL papers, and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Antonin Raffin and Ashley Hill</title>
      <itunes:episode>3</itunes:episode>
      <podcast:episode>3</podcast:episode>
      <itunes:title>Antonin Raffin and Ashley Hill</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">5389c54d-5292-4aa5-9832-a30ca9317a23</guid>
      <link>https://share.transistor.fm/s/b4a51a16</link>
      <description>
        <![CDATA[<p><a href="https://araffin.github.io/">Antonin Raffin</a> is a researcher at the <a href="https://www.dlr.de/EN/Home/home_node.html">German Aerospace Center (DLR)</a> in Munich, working in the Institute of Robotics and Mechatronics. His research is on using machine learning for controlling real robots (because simulation is not enough), with a particular interest for reinforcement learning. </p><p><br><a href="https://github.com/hill-a">Ashley Hill</a> is doing his thesis on improving control algorithms using machine learning for real time gain tuning. </p><p>He works mainly with neuroevolution, genetic algorithms, and of course reinforcement learning, applied to mobile robots.  He holds a masters degree in Machine learning, and a bachelors in Computer science from the Université Paris-Saclay. </p><p><strong>Featured References </strong></p><p><a href="https://github.com/hill-a/stable-baselines">stable-baselines</a> on github <br>Ashley Hill, Antonin Raffin primary authors. </p><p><a href="https://github.com/araffin/robotics-rl-srl">S-RL Toolbox</a> <br>Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat </p><p><a href="https://arxiv.org/abs/1901.08651">Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics</a> <br>Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://towardsdatascience.com/learning-to-drive-smoothly-in-minutes-450a7cdb35f4">Learning to Drive Smoothly in Minutes</a>, Antonin Raffin </li><li>Multimodal SRL (best paper at ICRA): <a href="https://arxiv.org/abs/1810.10191">Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal  Representations for Contact-Rich Tasks</a>,  Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg </li><li><a href="https://arxiv.org/abs/1907.02057">Benchmarking Model-Based Reinforcement Learning</a>, Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba </li><li><a href="https://tossingbot.cs.princeton.edu/">TossingBot: Learning to Throw Arbitrary Objects with Residual Physics</a> <br>Andy Zeng, Shuran Song, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser </li><li><a href="https://github.com/hill-a/stable-baselines/projects/1">Stable Baselines roadmap</a> </li><li><a href="https://github.com/openai/baselines/pull/481">OpenAI baselines stable-baselines github pull request</a> </li></ul><p><br></p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://araffin.github.io/">Antonin Raffin</a> is a researcher at the <a href="https://www.dlr.de/EN/Home/home_node.html">German Aerospace Center (DLR)</a> in Munich, working in the Institute of Robotics and Mechatronics. His research is on using machine learning for controlling real robots (because simulation is not enough), with a particular interest for reinforcement learning. </p><p><br><a href="https://github.com/hill-a">Ashley Hill</a> is doing his thesis on improving control algorithms using machine learning for real time gain tuning. </p><p>He works mainly with neuroevolution, genetic algorithms, and of course reinforcement learning, applied to mobile robots.  He holds a masters degree in Machine learning, and a bachelors in Computer science from the Université Paris-Saclay. </p><p><strong>Featured References </strong></p><p><a href="https://github.com/hill-a/stable-baselines">stable-baselines</a> on github <br>Ashley Hill, Antonin Raffin primary authors. </p><p><a href="https://github.com/araffin/robotics-rl-srl">S-RL Toolbox</a> <br>Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat </p><p><a href="https://arxiv.org/abs/1901.08651">Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics</a> <br>Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat </p><p><br><strong>Additional References </strong></p><ul><li><a href="https://towardsdatascience.com/learning-to-drive-smoothly-in-minutes-450a7cdb35f4">Learning to Drive Smoothly in Minutes</a>, Antonin Raffin </li><li>Multimodal SRL (best paper at ICRA): <a href="https://arxiv.org/abs/1810.10191">Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal  Representations for Contact-Rich Tasks</a>,  Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg </li><li><a href="https://arxiv.org/abs/1907.02057">Benchmarking Model-Based Reinforcement Learning</a>, Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba </li><li><a href="https://tossingbot.cs.princeton.edu/">TossingBot: Learning to Throw Arbitrary Objects with Residual Physics</a> <br>Andy Zeng, Shuran Song, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser </li><li><a href="https://github.com/hill-a/stable-baselines/projects/1">Stable Baselines roadmap</a> </li><li><a href="https://github.com/openai/baselines/pull/481">OpenAI baselines stable-baselines github pull request</a> </li></ul><p><br></p><p><br></p>]]>
      </content:encoded>
      <pubDate>Wed, 04 Sep 2019 17:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/b4a51a16/6cd7e122.mp3" length="29230701" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/Xp_EPVn98GmleT8n-3g_nkYnWb4kf6aoWfSwxivl4uw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzEwMTY0MS8x/NjMzMDI3OTA3LWFy/dHdvcmsuanBn.jpg"/>
      <itunes:duration>2082</itunes:duration>
      <itunes:summary>Antonin Raffin and Ashley Hill discuss Stable Baselines past, present and future, State Representation Learning, S-RL Toolbox, RL on real robots, big compute for RL and much more!</itunes:summary>
      <itunes:subtitle>Antonin Raffin and Ashley Hill discuss Stable Baselines past, present and future, State Representation Learning, S-RL Toolbox, RL on real robots, big compute for RL and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Michael Littman</title>
      <itunes:episode>2</itunes:episode>
      <podcast:episode>2</podcast:episode>
      <itunes:title>Michael Littman</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">7d9fd082-bed7-44a2-9fbb-50d4d3e61e86</guid>
      <link>https://share.transistor.fm/s/3d194836</link>
      <description>
        <![CDATA[<p><a href="https://en.wikipedia.org/wiki/Michael_L._Littman">Michael L Littman</a> is a <a href="http://cs.brown.edu/~mlittman/">professor of Computer Science at Brown University</a>.  He was <a href="http://cs.brown.edu/news/2019/06/17/michael-littman-has-been-named-acm-fellow/">elected ACM Fellow</a> in 2018 "For contributions to the design and analysis of sequential decision making algorithms in artificial intelligence". </p><p><b>Featured References </b></p><p><a href="https://www.semanticscholar.org/paper/Convergent-Actor-Critic-by-Humans-MacGlashan-Littman/7b33392ce631c5f2543a7a602585a4fb6e874935">Convergent Actor Critic by Humans <br></a>James MacGlashan, Michael L. Littman, David L. Roberts, Robert Tyler Loftin, Bei Peng, Matthew E. Taylor </p><p><a href="https://psyarxiv.com/3cd7r/">People teach with rewards and punishments as communication, not reinforcements</a> <br>Mark Ho, Fiery Cushman, Michael L. Littman, Joseph Austerweil </p><p><a href="https://arxiv.org/abs/1901.06085">Theory of Minds: Understanding Behavior in Groups Through Inverse Planning</a> <br>Michael Shum, Max Kleiman-Weiner, Michael L. Littman, Joshua B. Tenenbaum </p><p><a href="https://arxiv.org/abs/1809.10025">Personalized education at scale</a> <br>Saarinen, Cater, Littman <br></p><p><strong>Additional References </strong></p><ul><li>Michael Littman papers on <a href="https://scholar.google.com/citations?user=iRMZ2hoAAAAJ&amp;hl=en&amp;oi=sra">Google Scholar</a>, <a href="https://www.semanticscholar.org/author/Michael-L.-Littman/144885169">Semantic Scholar</a> </li><li><a href="https://www.udacity.com/course/reinforcement-learning--ud600">Reinforcement Learning</a> on Udacity, Charles Isbell, Michael Littman, Chris Pryby  </li><li><a href="https://www.udacity.com/course/machine-learning--ud262">Machine Learning</a> on Udacity, Michael Littman, Charles Isbell, Pushkar Kolhe  </li><li><a href="https://cling.csd.uwo.ca/cs346a/extra/tdgammon.pdf">Temporal Difference Learning and TD-Gammon</a>, Gerald Tesauro </li><li><a href="https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf">Playing Atari with Deep Reinforcement Learning</a>, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller </li><li><a href="https://www.aaai.org/ojs/index.php/aimagazine/article/view/2729">Ask Me Anything about MOOCs</a>, D Fisher, C Isbell, ML Littman, M Wollowski, et al </li><li><a href="http://rldm.org/">Reinforcement Learning and Decision Making (RLDM)</a> Conference </li><li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.4565">Algorithms for Sequential Decision Making</a>, Michael Littman's Thesis </li><li><a href="https://www.youtube.com/watch?v=DQWI1kvmwRg">Machine Learning A Cappella - Overfitting Thriller!</a>, Michael Littman and Charles Isbell feat Infinite Harmony </li><li><a href="https://www.youtube.com/watch?v=eIH5ip9JLTo">Turbotax Ad 2016: Genius Anna/Michael Littman</a> </li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p><a href="https://en.wikipedia.org/wiki/Michael_L._Littman">Michael L Littman</a> is a <a href="http://cs.brown.edu/~mlittman/">professor of Computer Science at Brown University</a>.  He was <a href="http://cs.brown.edu/news/2019/06/17/michael-littman-has-been-named-acm-fellow/">elected ACM Fellow</a> in 2018 "For contributions to the design and analysis of sequential decision making algorithms in artificial intelligence". </p><p><b>Featured References </b></p><p><a href="https://www.semanticscholar.org/paper/Convergent-Actor-Critic-by-Humans-MacGlashan-Littman/7b33392ce631c5f2543a7a602585a4fb6e874935">Convergent Actor Critic by Humans <br></a>James MacGlashan, Michael L. Littman, David L. Roberts, Robert Tyler Loftin, Bei Peng, Matthew E. Taylor </p><p><a href="https://psyarxiv.com/3cd7r/">People teach with rewards and punishments as communication, not reinforcements</a> <br>Mark Ho, Fiery Cushman, Michael L. Littman, Joseph Austerweil </p><p><a href="https://arxiv.org/abs/1901.06085">Theory of Minds: Understanding Behavior in Groups Through Inverse Planning</a> <br>Michael Shum, Max Kleiman-Weiner, Michael L. Littman, Joshua B. Tenenbaum </p><p><a href="https://arxiv.org/abs/1809.10025">Personalized education at scale</a> <br>Saarinen, Cater, Littman <br></p><p><strong>Additional References </strong></p><ul><li>Michael Littman papers on <a href="https://scholar.google.com/citations?user=iRMZ2hoAAAAJ&amp;hl=en&amp;oi=sra">Google Scholar</a>, <a href="https://www.semanticscholar.org/author/Michael-L.-Littman/144885169">Semantic Scholar</a> </li><li><a href="https://www.udacity.com/course/reinforcement-learning--ud600">Reinforcement Learning</a> on Udacity, Charles Isbell, Michael Littman, Chris Pryby  </li><li><a href="https://www.udacity.com/course/machine-learning--ud262">Machine Learning</a> on Udacity, Michael Littman, Charles Isbell, Pushkar Kolhe  </li><li><a href="https://cling.csd.uwo.ca/cs346a/extra/tdgammon.pdf">Temporal Difference Learning and TD-Gammon</a>, Gerald Tesauro </li><li><a href="https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf">Playing Atari with Deep Reinforcement Learning</a>, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller </li><li><a href="https://www.aaai.org/ojs/index.php/aimagazine/article/view/2729">Ask Me Anything about MOOCs</a>, D Fisher, C Isbell, ML Littman, M Wollowski, et al </li><li><a href="http://rldm.org/">Reinforcement Learning and Decision Making (RLDM)</a> Conference </li><li><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.4565">Algorithms for Sequential Decision Making</a>, Michael Littman's Thesis </li><li><a href="https://www.youtube.com/watch?v=DQWI1kvmwRg">Machine Learning A Cappella - Overfitting Thriller!</a>, Michael Littman and Charles Isbell feat Infinite Harmony </li><li><a href="https://www.youtube.com/watch?v=eIH5ip9JLTo">Turbotax Ad 2016: Genius Anna/Michael Littman</a> </li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Fri, 23 Aug 2019 16:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/3d194836/e083eec8.mp3" length="60237650" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/VEapK1jwQCccKXwPmFGqR3m_MPzko2aQTb6EZFIu4P8/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzg3MTMwLzE2/MzI3OTg2ODYtYXJ0/d29yay5qcGc.jpg"/>
      <itunes:duration>4296</itunes:duration>
      <itunes:summary>ACM Fellow Professor Michael L Littman enlightens us on Human feedback in RL, his Udacity courses, Theory of Mind, organizing the RLDM Conference, RL past and present, Hollywood cameos, and much more!</itunes:summary>
      <itunes:subtitle>ACM Fellow Professor Michael L Littman enlightens us on Human feedback in RL, his Udacity courses, Theory of Mind, organizing the RLDM Conference, RL past and present, Hollywood cameos, and much more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>Natasha Jaques</title>
      <itunes:episode>1</itunes:episode>
      <podcast:episode>1</podcast:episode>
      <itunes:title>Natasha Jaques</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">f1505376-5d88-4fd7-9ca0-906d8f59b5a8</guid>
      <link>https://share.transistor.fm/s/87d55ca4</link>
      <description>
        <![CDATA[<p>Natasha Jaques is a PhD candidate at MIT working on affective and social intelligence.  She has interned with DeepMind and Google Brain, and was an OpenAI Scholars mentor.  Her paper “<a href="https://arxiv.org/abs/1810.08647">Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</a>” received an honourable mention for best paper at ICML 2019. </p><p><b>Featured References </b></p><p><a href="https://arxiv.org/abs/1810.08647"><strong>Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</strong></a><strong> <br></strong>Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas </p><p><a href="https://arxiv.org/abs/1906.05433"><strong>Tackling climate change with Machine Learning</strong></a><strong> <br></strong>David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio </p><p><br></p><p><strong>Additional References </strong></p><ul><li><a href="https://offset.media.mit.edu/">MIT Media Lab Flight Offsets</a>,  Caroline Jaffe, Juliana Cherston, Natasha Jaques </li><li><a href="https://arxiv.org/abs/1802.09640">Modeling Others using Oneself in Multi-Agent Reinforcement Learning</a>, <br>Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus </li><li><a href="https://arxiv.org/abs/1803.08884">Inequity aversion improves cooperation in intertemporal social dilemmas</a>,  <br>Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel </li><li><a href="https://github.com/eugenevinitsky/sequential_social_dilemma_games">Sequential Social Dilemma Games</a> on github<strong>, </strong>Eugene Vinitsky, Natasha Jaques  </li><li><a href="https://rohinshah.com/alignment-newsletter/">AI Alignment newsletter</a>, Rohin Shah </li><li><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a>, Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley </li><li><a href="http://cogprints.org/2694/1/SocialFunctionTxt.pdf">The social function of intellect</a>, Nicholas Humphrey </li><li><a href="https://arxiv.org/abs/1903.00742">Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research</a>, Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel </li><li><a href="http://karpathy.github.io/2019/04/25/recipe/">A Recipe for Training Neural Networks</a>, Andrej Karpathy </li><li><a href="https://www.cs.ubc.ca/~jaquesn/POMDPPaper.pdf">Emotionally Adaptive Intelligent Tutoring Systems using POMDPs</a>, Natasha Jaques </li><li><a href="https://www.ynharari.com/book/sapiens/">Sapiens</a>, Yuval Noah Harari <p></p></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Natasha Jaques is a PhD candidate at MIT working on affective and social intelligence.  She has interned with DeepMind and Google Brain, and was an OpenAI Scholars mentor.  Her paper “<a href="https://arxiv.org/abs/1810.08647">Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</a>” received an honourable mention for best paper at ICML 2019. </p><p><b>Featured References </b></p><p><a href="https://arxiv.org/abs/1810.08647"><strong>Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning</strong></a><strong> <br></strong>Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas </p><p><a href="https://arxiv.org/abs/1906.05433"><strong>Tackling climate change with Machine Learning</strong></a><strong> <br></strong>David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio </p><p><br></p><p><strong>Additional References </strong></p><ul><li><a href="https://offset.media.mit.edu/">MIT Media Lab Flight Offsets</a>,  Caroline Jaffe, Juliana Cherston, Natasha Jaques </li><li><a href="https://arxiv.org/abs/1802.09640">Modeling Others using Oneself in Multi-Agent Reinforcement Learning</a>, <br>Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus </li><li><a href="https://arxiv.org/abs/1803.08884">Inequity aversion improves cooperation in intertemporal social dilemmas</a>,  <br>Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel </li><li><a href="https://github.com/eugenevinitsky/sequential_social_dilemma_games">Sequential Social Dilemma Games</a> on github<strong>, </strong>Eugene Vinitsky, Natasha Jaques  </li><li><a href="https://rohinshah.com/alignment-newsletter/">AI Alignment newsletter</a>, Rohin Shah </li><li><a href="https://arxiv.org/abs/1901.01753">Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions</a>, Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley </li><li><a href="http://cogprints.org/2694/1/SocialFunctionTxt.pdf">The social function of intellect</a>, Nicholas Humphrey </li><li><a href="https://arxiv.org/abs/1903.00742">Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research</a>, Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel </li><li><a href="http://karpathy.github.io/2019/04/25/recipe/">A Recipe for Training Neural Networks</a>, Andrej Karpathy </li><li><a href="https://www.cs.ubc.ca/~jaquesn/POMDPPaper.pdf">Emotionally Adaptive Intelligent Tutoring Systems using POMDPs</a>, Natasha Jaques </li><li><a href="https://www.ynharari.com/book/sapiens/">Sapiens</a>, Yuval Noah Harari <p></p></li></ul>]]>
      </content:encoded>
      <pubDate>Fri, 09 Aug 2019 15:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/87d55ca4/7026c06d.mp3" length="30336775" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/V79omKhqm9z2B79b5WS_0yIy8our34tfSNTMqeqPekM/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzg0NTM3LzE2/MzI4NjE3MTItYXJ0/d29yay5qcGc.jpg"/>
      <itunes:duration>3024</itunes:duration>
      <itunes:summary>Natasha Jaques talks about her PhD, her papers on Social Influence in Multi-Agent RL, ML &amp; Climate Change, Sequential Social Dilemmas, internships at DeepMind and Google Brain, Autocurricula, and more!</itunes:summary>
      <itunes:subtitle>Natasha Jaques talks about her PhD, her papers on Social Influence in Multi-Agent RL, ML &amp; Climate Change, Sequential Social Dilemmas, internships at DeepMind and Google Brain, Autocurricula, and more!</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
    <item>
      <title>About TalkRL Podcast: All Reinforcement Learning, All the Time</title>
      <itunes:title>About TalkRL Podcast: All Reinforcement Learning, All the Time</itunes:title>
      <itunes:episodeType>trailer</itunes:episodeType>
      <guid isPermaLink="false">2baf9ba0-a7d8-4f31-95e8-de6c0943964e</guid>
      <link>https://share.transistor.fm/s/eb1eb0e8</link>
      <description>
        <![CDATA[<p>August 2, 2019 </p><p><b>Transcript </b></p><p>The idea with TalkRL Podcast is to hear from brilliant folks from across the world of Reinforcement Learning, both research and applications.  As much as possible, I want to hear from them in their own language.  I try to get to know as much as I can about their work beforehand.  </p><p><br>And I’m not here to convert anyone; I want to reach people who are already into RL.  So we won’t stop to explain what a value function is, for example.  Though we also won’t assume everyone has read the very latest papers. </p><p><br></p><p>Why am I doing this? Because it’s a great way to learn from the most inspiring people in the field!  There’s so much happening in the universe of RL, and there’s tons of interesting angles and so many fascinating minds to learn from. </p><p>Now I know there is no shortage of books, papers, and lectures, but so much goes unsaid. </p><p>I mean, I guess if you work at MILA or AMII or Vector Institute, you might be having these conversations over coffee all the time, but I live in a little village in the woods in BC, so for me, these remote interviews are a great way to have these conversations, and I hope sharing with the community makes it more worthwhile for everyone. </p><p><br></p><p>In terms of format, the first 2 episodes were interviews in longer form, around an hour long.  Going forward, some may be a lot shorter; it depends on the guest. </p><p><br></p><p>If you want to be a guest or suggest a guest, go to <a href="https://www.talkrl.com/about">talkrl.com/about</a>, where you will find a link to a suggestion form. </p><p><br></p><p>Thanks for listening! </p><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>August 2, 2019 </p><p><b>Transcript </b></p><p>The idea with TalkRL Podcast is to hear from brilliant folks from across the world of Reinforcement Learning, both research and applications.  As much as possible, I want to hear from them in their own language.  I try to get to know as much as I can about their work beforehand.  </p><p><br>And I’m not here to convert anyone; I want to reach people who are already into RL.  So we won’t stop to explain what a value function is, for example.  Though we also won’t assume everyone has read the very latest papers. </p><p><br></p><p>Why am I doing this? Because it’s a great way to learn from the most inspiring people in the field!  There’s so much happening in the universe of RL, and there’s tons of interesting angles and so many fascinating minds to learn from. </p><p>Now I know there is no shortage of books, papers, and lectures, but so much goes unsaid. </p><p>I mean, I guess if you work at MILA or AMII or Vector Institute, you might be having these conversations over coffee all the time, but I live in a little village in the woods in BC, so for me, these remote interviews are a great way to have these conversations, and I hope sharing with the community makes it more worthwhile for everyone. </p><p><br></p><p>In terms of format, the first 2 episodes were interviews in longer form, around an hour long.  Going forward, some may be a lot shorter; it depends on the guest. </p><p><br></p><p>If you want to be a guest or suggest a guest, go to <a href="https://www.talkrl.com/about">talkrl.com/about</a>, where you will find a link to a suggestion form. </p><p><br></p><p>Thanks for listening! </p><p><br></p>]]>
      </content:encoded>
      <pubDate>Thu, 01 Aug 2019 12:00:00 -0700</pubDate>
      <author>Robin Ranjit Singh Chauhan</author>
      <enclosure url="https://media.transistor.fm/eb1eb0e8/e7088950.mp3" length="1634446" type="audio/mpeg"/>
      <itunes:author>Robin Ranjit Singh Chauhan</itunes:author>
      <itunes:duration>110</itunes:duration>
      <itunes:summary>Introducing TalkRL Podcast!  Also check out our website at talkRL.com</itunes:summary>
      <itunes:subtitle>Introducing TalkRL Podcast!  Also check out our website at talkRL.com</itunes:subtitle>
      <itunes:keywords>Reinforcement Learning, Machine Learning, Artificial Intelligence</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://www.robinc.net/" img="https://img.transistorcdn.com/uQaF4ejnqhaxlvKOFsaMZFK_IuVR5Y07Uvh2KpxKx_c/rs:fill:0:0:1/w:800/h:800/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9wZXJz/b24vOWI4Yjg4ZWYt/YmRmZi00ZWFkLTk4/YTgtMTRkZGFlZDFh/ZTJkLzE2Njc0MjY0/MDktaW1hZ2UuanBn.jpg">Robin Ranjit Singh Chauhan</podcast:person>
    </item>
  </channel>
</rss>
