# Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments (ICLR 2026)
TMoW is a novel framework that enables embodied agents to dynamically adapt to unseen domains at test time without requiring costly retraining. Unlike traditional Mixture-of-Experts (MoE) architectures with fixed routing functions, TMoW performs test-time training of routing functions by leveraging world models as internal simulators.
- Multi-granular Prototype-based Router: Adapts world model mixtures by comparing input observations with learned prototype representations across different levels of spatial abstraction (from local objects to global scenes)
- Test-time Prototype Refinement: Refines prototypes through weighted interpolation between existing prototypes based on their similarity to the current environment, enabling zero-shot adaptation to unseen domains
- Distilled Mixture-based Model Augmentation: Supports data-efficient creation of new world models by distilling knowledge from existing model mixtures using few-shot demonstrations
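As a rough sketch of the first idea (names and shapes here are illustrative, not the repository's API), prototype-based routing can be pictured as similarity-weighted expert selection, averaged across granularity levels:

```python
import numpy as np

def routing_weights(embeddings, prototypes, temperature=1.0):
    """Illustrative sketch: compute mixture weights by comparing per-level
    input embeddings against per-level domain prototypes.

    embeddings: list of L arrays of shape (d,)   -- one per granularity level
    prototypes: list of L arrays of shape (K, d) -- K domain prototypes per level
    Returns softmax-normalized weights over the K experts.
    """
    scores = np.zeros(prototypes[0].shape[0])
    for e, P in zip(embeddings, prototypes):
        # cosine similarity between the level-l embedding and each prototype
        sims = P @ e / (np.linalg.norm(P, axis=1) * np.linalg.norm(e) + 1e-8)
        scores += sims
    scores /= len(embeddings)                    # average over granularity levels
    z = np.exp(scores / temperature - np.max(scores / temperature))
    return z / z.sum()

# Toy usage: 2 granularity levels, 3 experts, 4-dim embeddings
rng = np.random.default_rng(0)
emb = [rng.normal(size=4) for _ in range(2)]
protos = [rng.normal(size=(3, 4)) for _ in range(2)]
w = routing_weights(emb, protos)
print(w)
```

The averaging over levels is what makes the routing "multi-granular": a domain can match on local object structure even when the global scene differs.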
```
                ┌─────────────────────────────────────┐
                │      Multi-granular Prototype       │
                │           based Router              │
                │  ┌─────┐  ┌─────┐  ┌─────┐          │
Observation ───►│  │ GCN │─►│ GCN │─►│ GCN │─►...     │
+ Instruction   │  └──┬──┘  └──┬──┘  └──┬──┘          │
                │     │        │        │             │
                │  Layer 1  Layer 2  Layer N          │
                │  Routing  Routing  Routing          │
                └─────┼────────┼────────┼─────────────┘
                      │        │        │
                      ▼        ▼        ▼
                ┌─────────────────────────────────────┐
                │   Mixture of World Models (MoW)     │
                │  ┌───────┐ ┌───────┐ ┌───────┐      │
                │  │Expert │ │Expert │ │Expert │ ...  │
                │  │   1   │ │   2   │ │   N   │      │
                │  │(LoRA) │ │(LoRA) │ │(LoRA) │      │
                │  └───────┘ └───────┘ └───────┘      │
                │        Base LLM (Frozen)            │
                └─────────────────────────────────────┘
```
```bash
# Clone the repository
git clone https://github.com/doldam0/tmow.git
cd tmow

# Install dependencies with uv (Python 3.12+)
uv sync
uv pip install -e tmow/environments/virtualhome
```

ALFWorld is automatically installed as a dependency. Make sure to set up the required data:

```bash
export ALFWORLD_DATA=/path/to/alfworld/data
```

First, train individual domain-specific expert models:

```bash
uv run tmow train expert --config configs/train_expert_config.py
```

Train the multi-granular prototype-based router:
```bash
uv run tmow train --config configs/train_mow_config.yaml
```

Evaluate on VirtualHome:

```bash
uv run tmow eval virtualhome \
    --model_path /path/to/mow/model \
    --domain_type seen \
    --task_type seen
```

Evaluate on ALFWorld:

```bash
uv run tmow eval alfworld \
    --model_path /path/to/mow/model \
    --dataset_path /path/to/eval/dataset
```

Expand the model to new domains with few-shot demonstrations:
```bash
uv run tmow expand \
    --config configs/train_mow_config.yaml \
    --datasets /path/to/new/domain/data \
    --num_samples 10 \
    --output_path /path/to/expanded/model
```

```
tmow/
├── tmow/
│   ├── cli/           # Command-line interface
│   ├── common/        # Common utilities (data, graph, trainer)
│   ├── dataset/       # Dataset builders (VirtualHome, ALFWorld, etc.)
│   ├── environments/  # Environment wrappers
│   │   └── virtualhome/  # VirtualHome simulator
│   ├── modules/       # Core model components
│   │   ├── mow.py        # Mixture of World Models
│   │   ├── routers.py    # Graph-based router
│   │   ├── gcn.py        # Graph Convolutional Network
│   │   └── mlp.py        # MLP modules
│   ├── scripts/       # Training and evaluation scripts
│   └── utils/         # Utility functions
├── configs/           # Configuration files
└── tests/             # Unit tests
```
The core model that combines multiple domain-specific world models (implemented as LoRA adapters) with a prototype-based routing mechanism. See `tmow/modules/mow.py`.
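The mixture mechanics can be sketched as a frozen base layer plus a router-weighted sum of low-rank LoRA deltas (an illustrative toy, not the repository's actual forward pass):

```python
import numpy as np

def mixture_forward(x, W_base, lora_experts, weights):
    """Illustrative sketch: frozen base linear layer plus a router-weighted
    mixture of LoRA adapter deltas.

    x:            (d_in,) input activation
    W_base:       (d_out, d_in) frozen base weight
    lora_experts: list of (A, B) pairs, A: (r, d_in), B: (d_out, r)
    weights:      (K,) routing weights over the K experts
    """
    y = W_base @ x                        # frozen base path
    for w_k, (A, B) in zip(weights, lora_experts):
        y = y + w_k * (B @ (A @ x))       # low-rank expert delta, scaled by routing weight
    return y

# Toy usage: 3 rank-2 experts on an 8 -> 6 linear layer
rng = np.random.default_rng(1)
d_in, d_out, r, K = 8, 6, 2, 3
x = rng.normal(size=d_in)
W = rng.normal(size=(d_out, d_in))
experts = [(rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r))) for _ in range(K)]
y = mixture_forward(x, W, experts, np.array([0.2, 0.5, 0.3]))
```

Because only the small `(A, B)` factors differ per domain, adding a new world model costs a tiny fraction of the base LLM's parameters.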
A multi-layer GCN-based router that computes routing scores by comparing input graph embeddings with domain prototypes at multiple granularity levels. See `tmow/modules/routers.py`.
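For readers unfamiliar with graph convolutions, one router layer can be sketched in the standard Kipf–Welling form (a generic GCN layer, not the code in `tmow/modules/gcn.py`):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer (Kipf & Welling style), illustrating how
    a scene graph of the observation can be embedded.

    A: (n, n) adjacency matrix of the scene graph
    X: (n, d_in) node features
    W: (d_in, d_out) layer weights
    """
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt     # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0.0)       # ReLU activation

# Toy scene graph: 3 nodes in a chain, one-hot node features
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = gcn_layer(A, np.eye(3), np.ones((3, 2)))
```

Stacking such layers grows each node's receptive field, which is what gives the router its progression from local-object to global-scene granularity.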
During inference, the router can be refined using the `refine_router()` method, which updates prototypes based on similarity to the current environment without requiring gradient updates.
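The gradient-free refinement step can be sketched as a similarity-weighted interpolation of the existing prototypes (function and parameter names here are hypothetical, not the repository's API):

```python
import numpy as np

def refine_prototypes(prototypes, env_embedding, temperature=0.5):
    """Illustrative sketch of gradient-free prototype refinement: form a
    prototype for the current environment as a similarity-weighted
    interpolation of existing domain prototypes.

    prototypes:    (K, d) existing domain prototypes
    env_embedding: (d,) embedding of the current (possibly unseen) environment
    """
    sims = prototypes @ env_embedding / (
        np.linalg.norm(prototypes, axis=1) * np.linalg.norm(env_embedding) + 1e-8
    )
    alpha = np.exp(sims / temperature)
    alpha /= alpha.sum()                 # interpolation coefficients (sum to 1)
    new_proto = alpha @ prototypes       # convex combination of prototypes
    return new_proto, alpha

# Toy usage: interpolate 3 prototypes toward an unseen environment
rng = np.random.default_rng(2)
P = rng.normal(size=(3, 4))
proto, alpha = refine_prototypes(P, rng.normal(size=4))
```

Because the refined prototype is a convex combination, no optimizer state or backward pass is needed at test time, which is what makes zero-shot adaptation cheap.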
TMoW demonstrates significant improvements over baselines:
- 27.21% improvement over SayCanPay in zero-shot adaptation scenarios
- 25.66% gain in few-shot expansion when constructing new world models
Evaluated on:
- VirtualHome
- ALFWorld
- RLBench
- Real-world robotic scenarios
```bibtex
@inproceedings{tmow2026,
  title={Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments},
  author={Jinwoo Jang and Minjong Yoo and Sihyung Yoon and Honguk Woo},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
