TMoW: Test-Time Mixture of World Models

Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments (ICLR 2026)

Overview

TMoW is a novel framework that enables embodied agents to dynamically adapt to unseen domains at test time without requiring costly retraining. Unlike traditional Mixture-of-Experts (MoE) architectures with fixed routing functions, TMoW performs test-time training of routing functions by leveraging world models as internal simulators.

Key Features

  • Multi-granular Prototype-based Router: Adapts world model mixtures by comparing input observations with learned prototype representations across different levels of spatial abstraction (from local objects to global scenes)
  • Test-time Prototype Refinement: Refines prototypes through weighted interpolation between existing prototypes based on their similarity to the current environment, enabling zero-shot adaptation to unseen domains
  • Distilled Mixture-based Model Augmentation: Supports data-efficient creation of new world models by distilling knowledge from existing model mixtures using few-shot demonstrations

Architecture

                    ┌─────────────────────────────────────┐
                    │       Multi-granular Prototype      │
                    │          based Router               │
                    │  ┌─────┐  ┌─────┐  ┌─────┐          │
    Observation ───►│  │ GCN │─►│ GCN │─►│ GCN │─►...     │
    + Instruction   │  └──┬──┘  └──┬──┘  └──┬──┘          │
                    │     │        │        │             │
                    │  Layer 1  Layer 2  Layer N          │
                    │  Routing  Routing  Routing          │
                    └─────┼────────┼────────┼─────────────┘
                          │        │        │
                          ▼        ▼        ▼
                    ┌─────────────────────────────────────┐
                    │    Mixture of World Models (MoW)    │
                    │  ┌───────┐ ┌───────┐ ┌───────┐      │
                    │  │Expert │ │Expert │ │Expert │ ...  │
                    │  │  1    │ │  2    │ │  N    │      │
                    │  │(LoRA) │ │(LoRA) │ │(LoRA) │      │
                    │  └───────┘ └───────┘ └───────┘      │
                    │         Base LLM (Frozen)           │
                    └─────────────────────────────────────┘

Installation

# Clone the repository
git clone https://github.com/doldam0/tmow.git
cd tmow

# Install dependencies with uv (Python 3.12+)
uv sync

Environment Setup

VirtualHome

uv pip install -e tmow/environments/virtualhome

ALFWorld

ALFWorld is automatically installed as a dependency. Make sure to set up the required data:

export ALFWORLD_DATA=/path/to/alfworld/data
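
A missing or mistyped data path is easier to diagnose before a long evaluation run starts. The following is a minimal sketch of such a check; `resolve_alfworld_data` is a hypothetical helper written for this example and is not part of the tmow package.

```python
import os
from pathlib import Path


def resolve_alfworld_data(env=None):
    """Return the ALFWORLD_DATA directory as a Path, or raise a clear error.

    Hypothetical helper for illustration; tmow may validate this differently.
    """
    env = os.environ if env is None else env
    value = env.get("ALFWORLD_DATA", "").strip()
    if not value:
        raise RuntimeError("ALFWORLD_DATA is not set; export it before evaluating")
    path = Path(value).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"ALFWORLD_DATA points to a missing directory: {path}")
    return path
```

Calling `resolve_alfworld_data()` with no arguments reads the real process environment; passing a dict makes the check easy to test.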

Usage

Training Expert World Models

First, train individual domain-specific expert models:

uv run tmow train expert --config configs/train_expert_config.py

Training the Router (MoW)

Train the multi-granular prototype-based router:

uv run tmow train --config configs/train_mow_config.yaml

Evaluation

Evaluate on VirtualHome

uv run tmow eval virtualhome \
    --model_path /path/to/mow/model \
    --domain_type seen \
    --task_type seen

Evaluate on ALFWorld

uv run tmow eval alfworld \
    --model_path /path/to/mow/model \
    --dataset_path /path/to/eval/dataset

Few-shot Expansion

Expand the model to new domains with few-shot demonstrations:

uv run tmow expand \
    --config configs/train_mow_config.yaml \
    --datasets /path/to/new/domain/data \
    --num_samples 10 \
    --output_path /path/to/expanded/model
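
Conceptually, expansion builds on the Distilled Mixture-based Model Augmentation described under Key Features: the existing mixture acts as a teacher for a new expert. The sketch below shows one generic distillation objective, a KL term toward the routing-weighted teacher distribution plus a cross-entropy term on the demonstrated action. The function names, the `alpha` trade-off, and the exact form of the loss are assumptions for illustration and are not taken from the tmow code.

```python
import math


def mix_distributions(expert_probs, weights):
    """Teacher distribution: routing-weighted average of expert distributions."""
    size = len(expert_probs[0])
    return [sum(w * p[i] for w, p in zip(weights, expert_probs)) for i in range(size)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_probs, teacher_probs, target_index, alpha=0.5):
    """alpha * KL(teacher || student) + (1 - alpha) * cross-entropy on the demo label."""
    cross_entropy = -math.log(student_probs[target_index] + 1e-12)
    kl = kl_divergence(teacher_probs, student_probs)
    return alpha * kl + (1 - alpha) * cross_entropy


# Two existing experts, mixed with equal routing weights, teach a new expert.
teacher = mix_distributions([[0.8, 0.2], [0.4, 0.6]], [0.5, 0.5])
loss = distillation_loss([0.5, 0.5], teacher, target_index=0)
```

With only a handful of demonstrations, the KL term lets the new expert borrow structure from the mixture instead of relying on the labels alone.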

Project Structure

tmow/
├── tmow/
│   ├── cli/              # Command-line interface
│   ├── common/           # Common utilities (data, graph, trainer)
│   ├── dataset/          # Dataset builders (VirtualHome, ALFWorld, etc.)
│   ├── environments/     # Environment wrappers
│   │   └── virtualhome/  # VirtualHome simulator
│   ├── modules/          # Core model components
│   │   ├── mow.py        # Mixture of World Models
│   │   ├── routers.py    # Graph-based router
│   │   ├── gcn.py        # Graph Convolutional Network
│   │   └── mlp.py        # MLP modules
│   ├── scripts/          # Training and evaluation scripts
│   └── utils/            # Utility functions
├── configs/              # Configuration files
└── tests/                # Unit tests

Key Components

MoW (Mixture of World Models)

The core model that combines multiple domain-specific world models (implemented as LoRA adapters) with a prototype-based routing mechanism. See tmow/modules/mow.py.
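
To make the mixing concrete, here is a minimal sketch of one linear layer in which a frozen base weight is combined with routing-weighted LoRA updates, y = W x + Σ_k w_k B_k A_k x. It uses plain Python lists rather than tensors, and `matvec`, `lora_delta`, and `mixture_forward` are illustrative names, not the API of tmow/modules/mow.py.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_delta(A, B, x, scale=1.0):
    """Low-rank update scale * B @ (A @ x), where A is r x d_in and B is d_out x r."""
    return [scale * v for v in matvec(B, matvec(A, x))]


def mixture_forward(W, experts, weights, x):
    """Frozen base output plus the routing-weighted sum of expert LoRA updates."""
    if len(experts) != len(weights):
        raise ValueError("need exactly one routing weight per expert")
    y = matvec(W, x)  # base model path; W is never modified
    for (A, B), w in zip(experts, weights):
        delta = lora_delta(A, B, x)
        y = [yi + w * di for yi, di in zip(y, delta)]
    return y


W = [[1.0, 0.0], [0.0, 1.0]]                  # frozen base weight
expert_1 = ([[1.0, 0.0]], [[1.0], [0.0]])     # rank-1 (A, B) pair
expert_2 = ([[0.0, 1.0]], [[0.0], [1.0]])
y = mixture_forward(W, [expert_1, expert_2], [0.5, 0.5], [1.0, 2.0])
```

Because only the routing weights change between domains, the base weights and the adapters can stay fixed at test time.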

GraphRouter

A multi-layer GCN-based router that computes routing scores by comparing input graph embeddings with domain prototypes at multiple granularity levels. See tmow/modules/routers.py.
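
The comparison step can be sketched as follows, assuming that each granularity level yields one embedding and that scores are cosine similarities turned into weights by a softmax. The similarity measure, the temperature, and the function names are assumptions for this example; in the repository the embeddings come from GCN layers, which are omitted here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(level_embeddings, level_prototypes, temperature=0.1):
    """Return one weight vector over experts for each granularity level."""
    weights = []
    for embedding, prototypes in zip(level_embeddings, level_prototypes):
        scores = [cosine(embedding, p) for p in prototypes]
        weights.append(softmax(scores, temperature))
    return weights


# Two levels, two experts: each level has its own prototype per expert.
prototypes = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
weights = route([[1.0, 0.0], [0.0, 1.0]], prototypes)
```

In this toy input the first level favors expert 1 and the second favors expert 2, which shows how different levels of abstraction can select different mixtures.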

Test-time Prototype Refinement

During inference, the router can be refined using the refine_router() method, which updates prototypes based on similarity to the current environment without requiring gradient updates.
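
A gradient-free update of this kind can be sketched as a similarity-weighted interpolation: the existing prototypes are blended into a target that reflects the current environment, and each prototype moves part of the way toward it. The function below is a hypothetical illustration, not the signature or behavior of `refine_router()`; the dot-product similarity, `rate`, and `temperature` are assumed for the example.

```python
import math


def refine_prototypes(prototypes, env_embedding, rate=0.5, temperature=0.1):
    """Move each prototype toward a similarity-weighted blend of all prototypes.

    No gradients are computed; the input list is left unmodified.
    """
    # Similarity of the current environment to each existing prototype.
    sims = [sum(e * p for e, p in zip(env_embedding, proto)) for proto in prototypes]
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Interpolated target: convex combination of the existing prototypes.
    dim = len(env_embedding)
    target = [sum(a * p[i] for a, p in zip(alphas, prototypes)) for i in range(dim)]

    # Blend each prototype with the target; rate=0 keeps prototypes unchanged.
    return [[(1 - rate) * p[i] + rate * target[i] for i in range(dim)]
            for p in prototypes]


original = [[1.0, 0.0], [0.0, 1.0]]
refined = refine_prototypes(original, env_embedding=[1.0, 0.0])
```

Here the environment resembles the first prototype, so the second prototype is pulled toward it while the first barely moves.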

Experimental Results

TMoW demonstrates significant improvements over baselines:

  • 27.21% improvement over SayCanPay in zero-shot adaptation scenarios
  • 25.66% gain in few-shot expansion when constructing new world models

Evaluated on:

  • VirtualHome
  • ALFWorld
  • RLBench
  • Real-world robotic scenarios

Citation

@inproceedings{tmow2026,
  title={Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments},
  author={Jinwoo Jang and Minjong Yoo and Sihyung Yoon and Honguk Woo},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
