Taewoong Kang*, Kinam Kim*, Dohyeon Kim*, Minho Park, Junha Hyung, and Jaegul Choo
DAVIAN Robotics, KAIST AI, SNU
arXiv 2025. (* indicates equal contribution)
Teaser video: teaser.mp4
- Release inference code
- Release model weights
- Release data preprocessing code (for inference)
- Release training code
- Release data preprocessing code (for training)
- Release user-friendly interface
- GPU: < 80 GB of memory (for inference), < 140 GB (for training)
- CUDA: 12.1 or higher
- Python: 3.10
- PyTorch: Compatible with CUDA 12.1
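After setting up the environment in the next step, a quick sanity check like the one below (a suggestion, not part of the repository) confirms that PyTorch sees a CUDA-enabled GPU and reports how much memory it has:
# check_env.py -- optional GPU sanity check (not part of the repo)
import torch

assert torch.cuda.is_available(), "CUDA is not available to PyTorch"
print("PyTorch:", torch.__version__, "| CUDA build:", torch.version.cuda)
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB")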
Create a conda environment and install dependencies:
# Create conda environment
conda create -n egox python=3.10 -y
conda activate egox
# Install PyTorch with CUDA 12.1
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu121
# Install other dependencies
pip install -r requirements.txt
Download the Wan2.1-I2V-14B model and save it to the checkpoints/pretrained_model/ folder.
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='Wan-AI/Wan2.1-I2V-14B-480P-Diffusers', local_dir='./checkpoints/pretrained_model/Wan2.1-I2V-14B-480P-Diffusers')"Download the trained EgoX LoRA weights using one of the following methods:
Option 1: Hugging Face
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='DAVIAN-Robotics/EgoX', local_dir='./checkpoints/EgoX', allow_patterns='*.safetensors')"Option 2: Google Drive
- Download from Google Drive and save to the checkpoints/EgoX/ folder.
For quick testing, the codebase includes example data in the example/ directory. You can run inference immediately:
# For in-the-wild example
bash scripts/infer_itw.sh
# For Ego4D example
bash scripts/infer_ego4d.sh
Edit the GPU ID and seed in the script if needed. Results will be saved to ./results/.
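Under the hood, these scripts call infer.py with the pretrained Wan2.1-I2V-14B-480P-Diffusers backbone and the EgoX LoRA weights (the full command is shown later in this README). The snippet below is only a rough sketch of loading that backbone and attaching the LoRA with diffusers; it is not the repository's actual inference code, which additionally handles the ego prior, geometry guidance (GGA), and camera conditioning:
# Rough sketch only: load the Wan2.1 I2V backbone and attach the EgoX LoRA.
# The real pipeline in infer.py does considerably more (ego prior, GGA, cameras).
import torch
from diffusers import WanImageToVideoPipeline

pipe = WanImageToVideoPipeline.from_pretrained(
    "./checkpoints/pretrained_model/Wan2.1-I2V-14B-480P-Diffusers",
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("./checkpoints/EgoX/pytorch_lora_weights.safetensors")
pipe.to("cuda")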
To run inference with your own data, prepare the following file structure:
your_dataset/                 # Your custom dataset folder
├── meta.json                 # Meta information for each video
├── videos/                   # Videos directory
│   └── take_name/
│       ├── ego_Prior.mp4
│       ├── exo.mp4
│       └── ...
└── depth_maps/               # Depth maps directory
    └── take_name/
        ├── frame_000.npy
        └── ...
meta.json - Meta information for each video
A JSON file containing the exocentric video path, egocentric prior video path, prompt, and camera intrinsic and extrinsic parameters for each video. The structure includes a test_datasets array with one entry per video.
Example:
{
  "test_datasets": [
    {
      "exo_path": "./example/in_the_wild/videos/joker/exo.mp4",
      "ego_prior_path": "./example/in_the_wild/videos/joker/ego_Prior.mp4",
      "prompt": "[Exo view]\n**Scene Overview:**\nThe scene is set on a str...\n\n[Ego view]\n**Scene Overview:**\nFrom the inferred first-person perspective, the environment appears chaotic and filled with sm...",
      "camera_intrinsics": [
        [634.47327, 0.0, 392.0],
        [0.0, 634.4733, 224.0],
        [0.0, 0.0, 1.0]
      ],
      "camera_extrinsics": [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]
      ],
      "ego_intrinsics": [
        [150.0, 0.0, 255.5],
        [0.0, 150.0, 255.5],
        [0.0, 0.0, 1.0]
      ],
      "ego_extrinsics": [
        [[0.6263, 0.7788, -0.0336, 0.3432],
         [-0.0557, 0.0018, -0.9984, 2.3936],
         [-0.7776, 0.6272, 0.0445, 0.1299]],
        ...
      ]
    },
    ...
  ]
}
To prepare your own dataset, follow the instructions from here.
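Before running inference on custom data, it can help to sanity-check that every entry in your meta.json contains the fields shown above and that the referenced videos exist. The short script below is only a suggestion (not part of the repository):
# validate_meta.py -- optional sanity check for a custom meta.json (not part of the repo)
import json, os, sys

REQUIRED = ["exo_path", "ego_prior_path", "prompt", "camera_intrinsics",
            "camera_extrinsics", "ego_intrinsics", "ego_extrinsics"]

with open(sys.argv[1]) as f:
    meta = json.load(f)

for i, entry in enumerate(meta["test_datasets"]):
    missing = [k for k in REQUIRED if k not in entry]
    if missing:
        print(f"entry {i}: missing keys {missing}")
    for key in ("exo_path", "ego_prior_path"):
        if key in entry and not os.path.exists(entry[key]):
            print(f"entry {i}: {key} not found at {entry[key]}")
print("done")
Usage: python validate_meta.py ./your_dataset/meta.json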
Since EgoX is trained on the Ego-Exo4D dataset, where exocentric camera poses are fixed, you must provide exocentric videos captured with a fixed camera pose as input during inference. The model is also trained at 448×448 (ego) and 448×784 (exo) resolutions with 49 frames, so please preprocess your videos to match.
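If your exocentric clips do not already match this format, the sketch below resizes frames to 448×784 (height×width) and keeps the first 49 frames. OpenCV, the fps value, and the raw_exo.mp4 filename are all assumptions for illustration; ffmpeg or any other tool works equally well:
# resize_exo.py -- sketch: resample an exo clip to 448x784 and 49 frames (not part of the repo)
import cv2

def preprocess(src, dst, height=448, width=784, num_frames=49, fps=16):
    # Naive resize; consider center-cropping first if you want to preserve aspect ratio.
    cap = cv2.VideoCapture(src)
    writer = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
    count = 0
    while count < num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        writer.write(cv2.resize(frame, (width, height)))
        count += 1
    cap.release()
    writer.release()

preprocess("./your_dataset/videos/take_name/raw_exo.mp4",
           "./your_dataset/videos/take_name/exo.mp4")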
Custom dataset initial structure
Before running the script, you need to create a custom dataset folder with the following structure:
your_dataset/                 # Your custom dataset folder
└── videos/                   # Videos directory
    └── take_name/
        └── exo.mp4
Then, use meta_init.py to create a meta.json file with the following command:
python meta_init.py --folder_path ./your_dataset --output_json ./your_dataset/meta.json --overwrite
This produces the following structure:
your_dataset/                 # Your custom dataset folder
├── meta.json                 # Meta information for each video
└── videos/                   # Videos directory
    └── take_name/
        └── exo.mp4
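For reference, the snippet below is only a rough approximation of what such an initialization step might do, assuming meta_init.py simply enumerates videos/*/exo.mp4 and writes skeleton entries; consult the actual script for the exact fields it fills in:
# Sketch only: NOT the repository's meta_init.py. Assumes it scans videos/*/exo.mp4
# and writes placeholder entries; check the real script for the exact behavior.
import glob, json, os

def init_meta(folder_path, output_json):
    entries = []
    for exo in sorted(glob.glob(os.path.join(folder_path, "videos", "*", "exo.mp4"))):
        entries.append({"exo_path": exo, "prompt": ""})  # placeholders filled in later steps
    with open(output_json, "w") as f:
        json.dump({"test_datasets": entries}, f, indent=2)

init_meta("./your_dataset", "./your_dataset/meta.json")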
Then, you can use caption.py to generate a caption for each video with this command:
python caption.py --json_file ./your_dataset/meta.json --output_json ./your_dataset/meta.json --overwrite
Make sure that your API key is properly set in caption.py.
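How the key is provided depends on the script; one common pattern (an assumption here, verify against caption.py) is to read it from an environment variable rather than hard-coding it:
# Assumption: caption.py (or your own wrapper) reads the key from the environment.
# The variable name OPENAI_API_KEY is illustrative; use whatever caption.py expects.
import os

api_key = os.environ.get("OPENAI_API_KEY")
if api_key is None:
    raise RuntimeError("Set the captioning API key before running caption.py")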
Finally, follow the instructions from here to obtain depth maps, camera intrinsics, and ego camera extrinsics for each video.
The final structure should look like this:
your_dataset/                 # Your custom dataset folder
├── meta.json                 # Meta information for each video
├── videos/                   # Videos directory
│   └── take_name/
│       ├── ego_Prior.mp4
│       ├── exo.mp4
│       └── ...
└── depth_maps/               # Depth maps directory
    └── take_name/
        ├── frame_000.npy
        └── ...
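Depth maps are stored as one .npy file per frame. A quick way to inspect them (a suggestion, not part of the repository; the exact shape and units depend on the preprocessing pipeline) is:
# Inspect a per-frame depth map; shape/units depend on your preprocessing pipeline.
import numpy as np

depth = np.load("./your_dataset/depth_maps/take_name/frame_000.npy")
print("shape:", depth.shape, "dtype:", depth.dtype)
print("min/max depth:", float(depth.min()), float(depth.max()))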
Then, modify scripts/infer_itw.sh (or create a new script) to point to your data paths:
python3 infer.py \
--meta_data_file ./example/your_dataset/meta.json \
--model_path ./checkpoints/pretrained_model/Wan2.1-I2V-14B-480P-Diffusers \
--lora_path ./checkpoints/EgoX/pytorch_lora_weights.safetensors \
--lora_rank 256 \
--out ./results \
--seed 42 \
--use_GGA \
--cos_sim_scaling_factor 3.0 \
--in_the_wild
This project is built upon the following works:
If you use this dataset or code in your research, please cite our paper:
@misc{kang2025egoxegocentricvideogeneration,
title={EgoX: Egocentric Video Generation from a Single Exocentric Video},
author={Taewoong Kang and Kinam Kim and Dohyeon Kim and Minho Park and Junha Hyung and Jaegul Choo},
year={2025},
eprint={2512.08269},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.08269},
}