ComfyUI-WanAnimatePlus

English | 中文

Multi-reference image injection and seamless video connection for ComfyUI's WanAnimate pipeline.

Overview

ComfyUI-WanAnimatePlus adds the following feature groups to the WanVideo workflow:

prefix_frames & transition_video

prefix_frames: allows passing 1–5 additional reference images for multi-reference guided generation
transition_video: allows passing the last 21 frames of the previous video segment for seamless video connection

When used together, canvas layout and frame offsets are automatically coordinated without conflicts.

Bernini

Supports Bernini models. Allows passing source video, reference video, or reference images as generation conditions. Supports v2v, rv2v, r2v, and t2v tasks.

Use cases:

Multi-shot video sequence generation
Video continuation / extension
Motion transfer with multi-reference control
Video editing with source video + reference images
Reference-to-video generation

SCAIL-2 Embeds

Adds a wrapper-native WanAnimatePlus SCAIL_2 Embeds node for SCAIL-2 models. It prepares reference image, driving pose, colored pose mask, reference mask, optional prefix/transition hard-freeze latents, and prefix-aligned colored masks for the WanAnimatePlus sampler.

Demo

prefix_frames & transition_video usage

prefix_frames demo

video_001.mp4

transition_video demo

video_002.mp4

Features

prefix_frames (Multi-Reference Injection)

Allows 1–5 additional reference images. Internally expands the canvas pixel space and encodes reference images across the front frames, with automatic frame offset coordination for control signals (pose / face).

Supports 1–5 reference images (truncated if exceeding 5)
Auto-resizes reference images to target resolution
Automatically aligns frame offsets for pose / face / bg / mask signals

transition_video (Seamless Video Connection)

Allows passing the last 21 frames of the previous video segment. Writes these pixel frames directly into the front of the generation canvas, with sampled+reversed padding for control signal offsets.

Automatically coordinates with prefix when both are used

Bernini

Generates condition latents from source video, reference video, and/or reference images via VAE encoding. Supports v2v, rv2v, r2v, and t2v; the task is auto-detected from the connected inputs.

Reference images kept at native aspect ratio
Compatible with context windows
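
As a rough illustration of keeping reference media at its native aspect ratio under a long-edge cap, the sketch below scales a (width, height) pair so that its longer side does not exceed ref_max_size (default 848, as documented for the Bernini node). The function name fit_long_edge is hypothetical, and leaving smaller media untouched and rounding to the nearest pixel are assumptions; the node's actual resize logic may differ.

```python
def fit_long_edge(width: int, height: int, ref_max_size: int = 848) -> tuple[int, int]:
    """Scale (width, height) so the long edge is at most ref_max_size, keeping aspect ratio."""
    long_edge = max(width, height)
    if long_edge <= ref_max_size:
        # Assumption: media already within the limit is left untouched.
        return width, height
    scale = ref_max_size / long_edge
    # Assumption: round to the nearest pixel; the real node may align differently.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_long_edge(1696, 954))  # long edge is halved to 848
print(fit_long_edge(640, 480))   # already within the limit, unchanged
```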

SCAIL-2

Provides SCAIL-2 ref / pose / mask conditioning through WanAnimatePlus SCAIL_2 Embeds.

Encodes ref_image, pose_images, pose_image_mask, prefix_frames, prefix_mask, bg_image, and reference_image_mask
Aligns SCAIL-2 inputs to 32-pixel multiples before VAE encoding
Supports animation and replacement modes
Supports single-frame prefix reference encoding and optional transition_video hard-freeze conditioning
By default, prefix_frames are encoded as full-resolution reference latents and do not expand the output canvas; disable single_frame_prefix_encoding to use the legacy 37 front pixel-frame prefix layout
In single-frame prefix mode, prefix_mask follows the same reference-mask path as reference_image_mask
Supports context-window sampling; non-first windows can see prepended prefix/transition context without fusing those prepended predictions
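
The 32-pixel alignment mentioned above can be pictured with a small helper. This is only a sketch: the name align_to_multiple is hypothetical, and rounding down (rather than up or to nearest) is an assumption, since the rounding direction is not stated here.

```python
def align_to_multiple(value: int, multiple: int = 32) -> int:
    """Snap a dimension to a multiple of `multiple` (assumed: round down)."""
    aligned = (value // multiple) * multiple
    # Never return 0 for very small inputs; keep at least one full block.
    return max(multiple, aligned)


width, height = align_to_multiple(850), align_to_multiple(480)
print(width, height)
```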

Installation

Place this repository into ComfyUI's custom_nodes directory:

cd ComfyUI/custom_nodes
git clone https://github.com/wuwukaka/ComfyUI-WanAnimatePlus.git

Restart ComfyUI after installation.

Important: To use prefix_frames, transition_video, Bernini, or SCAIL_2 Embeds, you must replace the full workflow chain with WanAnimatePlus nodes. Mixing WanAnimatePlus nodes with original WanVideoWrapper nodes in the same workflow will result in degraded output.

Quick Start

Start ComfyUI and confirm the WanAnimatePlus nodes appear under the WanAnimatePlus category
Replace the entire workflow chain with WanAnimatePlus counterparts: ModelLoader, VAELoader, ContextOptions, AnimateEmbeds, Sampler, Decode, and supporting nodes
Do not mix original WanVideoWrapper nodes in the same workflow
Connect prefix_frames and/or transition_video inputs as needed
Example workflows are available in the example_workflows/ directory

Nodes

WanAnimatePlus exposes a complete workflow chain to avoid cross-package object mixing with the original WanVideoWrapper nodes.

Core nodes:

WanAnimatePlus ModelLoader
WanAnimatePlus VAELoader
WanAnimatePlus TextEncodeCached
WanAnimatePlus ClipVisionEncode
WanAnimatePlus ContextOptions
WanAnimatePlus AnimateEmbeds
WanAnimatePlus Sampler / WanAnimatePlus Samplerv2
WanAnimatePlus Scheduler / WanAnimatePlus Schedulerv2
WanAnimatePlus Decode / WanAnimatePlus Encode
WanAnimatePlus LoraSelect / WanAnimatePlus LoraSelectMulti / WanAnimatePlus SetLoRAs
WanAnimatePlus BlockSwap / WanAnimatePlus SetBlockSwap
WanAnimatePlus TorchCompileSettings
WanAnimatePlus SamplerExtraArgs
WanAnimatePlus Uni3C ControlnetLoader / WanAnimatePlus Uni3C Embeds
WanAnimatePlus Bernini
WanAnimatePlus SCAIL_2 Embeds

WanAnimatePlus AnimateEmbeds

Core node. Replaces the original WanVideoAnimateEmbeds.

New inputs:

| Input | Description |
| --- | --- |
| `prefix_frames` | 1–5 additional reference images for multi-reference guided generation |
| `transition_video` | Last 21 frames of the previous video segment for seamless video connection |

Other inputs are identical to the original WanVideoAnimateEmbeds: vae, width, height, num_frames, ref_images, pose_images, face_images, bg_images, mask, start_ref_image, clip_embeds, etc.

WanAnimatePlus Bernini

Generates condition latents from source video, reference video, and/or reference images for Bernini models.

Inputs:

| Input | Description |
| --- | --- |
| `vae` | VAE model for encoding |
| `width` / `height` / `num_frames` | Output dimensions |
| `source_video` | Source video to edit/restyle (v2v/rv2v). Resized to width/height |
| `reference_video` | Moving content to composite (video insertion). Native aspect |
| `reference_images` | Reference image(s) as in-context tokens (r2v/rv2v). Native aspect |
| `ref_max_size` | Max long-edge size for reference media (default 848) |
| `force_offload` | Offload VAE after encoding to save VRAM |
| `tiled_vae` | Use tiled VAE encoding for memory savings |

The task (v2v, rv2v, r2v, t2v) is automatically inferred from which inputs are connected.
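
One way to picture the task inference is the sketch below. The function name infer_bernini_task is hypothetical, and the mapping is only a reading of the input table: source_video is tagged v2v/rv2v and reference_images r2v/rv2v. Treating reference_video the same way as reference_images is an assumption; the node may handle it differently.

```python
def infer_bernini_task(source_video=None, reference_video=None, reference_images=None) -> str:
    """Guess the Bernini task from which optional inputs are connected (None = not connected)."""
    has_source = source_video is not None
    # Assumption: a reference video counts as a reference input, like reference images.
    has_reference = reference_video is not None or reference_images is not None

    if has_source and has_reference:
        return "rv2v"  # source video plus reference media
    if has_source:
        return "v2v"   # source video only
    if has_reference:
        return "r2v"   # reference media only
    return "t2v"       # nothing connected: text-only conditioning


print(infer_bernini_task(source_video="clip", reference_images=["ref"]))
```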

WanAnimatePlus SCAIL_2 Embeds

Creates SCAIL-2 conditioning for WanAnimatePlus sampling. Use this node with SCAIL-2 checkpoints that include the pose and mask streams.

Inputs:

| Input | Description |
| --- | --- |
| `vae` | VAE model for encoding |
| `width` / `height` / `num_frames` | Target dimensions; width and height are aligned to multiples of 32 |
| `ref_image` | Reference image for SCAIL-2 conditioning |
| `bg_image` | Optional single background image for animation mode; prepended as the first `prefix_frames` item with an internal white mask and ignored in replacement mode |
| `pose_images` | Driving pose video/images, encoded at half resolution |
| `pose_image_mask` | Colored per-identity pose mask sequence |
| `prefix_mask` | Optional colored mask images matching `prefix_frames`; expanded as `1+4+4...` and written into the prefix mask frames before mask latent encoding |
| `reference_image_mask` | Colored reference mask image |
| `replacement_mode` | Enables SCAIL-2 replacement-mode RoPE and reference-mask compositing |
| `preserve_main_ref_background` | Animation mode only; keeps the main reference image background when enabled, or uses `reference_image_mask` as a black-background alpha crop when disabled. Ignored in replacement mode |
| `single_frame_prefix_encoding` | Encodes `prefix_frames` as individual full-resolution reference latents instead of expanding the canvas; enabled by default |
| `prefix_frames` | Optional prefix images. In default single-frame mode these become reference-stream latents; with single-frame mode disabled they hard-freeze the front canvas |
| `transition_video` | Optional transition frames to hard-freeze at the front of the latent sequence |
| `clip_embeds` | Optional CLIP vision features from `WanAnimatePlus ClipVisionEncode` |
| `force_offload` / `tiled_vae` | Memory controls for VAE encoding |

For short generations, context windows are optional. For long generations or low VRAM, context windows are recommended. In context-window mode, single-frame prefix references remain visible through the SCAIL-2 reference stream. Legacy canvas prefixes and transition latents are prepended for model context and removed before overlap fusion.

For SCAIL-2, the default single_frame_prefix_encoding mode does not expand or trim the output for prefix_frames. If transition_video is connected, the front canvas expands by 21 pixel frames and those 21 frames are trimmed after decoding. With single_frame_prefix_encoding disabled, prefix_frames use the legacy 37 front pixel-frame canvas layout, with transition frames placed at frames 17-36 when transition_video is also connected.
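
The canvas rules in the paragraph above can be summarized in a short sketch. The helper names front_canvas_frames and trim_front are hypothetical, and the legacy branch assumes that the 37 front frames are trimmed after decoding in the same way as the 21 transition frames, which is implied but not stated explicitly.

```python
TRANSITION_FRAMES = 21      # front frames added when transition_video is connected
LEGACY_PREFIX_FRAMES = 37   # front frames used by the legacy prefix layout


def front_canvas_frames(has_prefix: bool, has_transition: bool,
                        single_frame_prefix_encoding: bool = True) -> int:
    """Number of extra pixel frames placed at the front of the canvas."""
    if has_prefix and not single_frame_prefix_encoding:
        # Legacy layout: the transition frames, if any, live inside these 37 frames.
        return LEGACY_PREFIX_FRAMES
    if has_transition:
        return TRANSITION_FRAMES
    # Default single-frame prefix mode does not expand the canvas for prefix_frames.
    return 0


def trim_front(decoded_frames: list, front_frames: int) -> list:
    """Drop the front padding frames from a decoded sequence."""
    return decoded_frames[front_frames:]


extra = front_canvas_frames(has_prefix=True, has_transition=True)
decoded = list(range(81 + extra))       # stand-in for decoded frames
print(extra, len(trim_front(decoded, extra)))
```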

When bg_image is connected in animation mode, it consumes the first prefix slot and is handled as the first prefix image. The node internally adds a white mask for that background image only; user-provided prefix_frames and prefix_mask are then limited to four images each. In replacement mode, bg_image is ignored.
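
The slot accounting described above can be sketched as follows. The function name build_prefix_list is hypothetical, and plain strings stand in for images; the five-image limit and the rule that bg_image takes the first slot in animation mode come from this README, while everything else is illustrative.

```python
MAX_PREFIX_IMAGES = 5  # documented upper limit for prefix images


def build_prefix_list(prefix_frames, bg_image=None, replacement_mode=False) -> list:
    """Combine an optional background image with user prefix images, truncating extras."""
    # bg_image is only used in animation mode; it is ignored in replacement mode.
    use_bg = bg_image is not None and not replacement_mode
    if use_bg:
        # The background takes the first slot, leaving four for user images.
        return [bg_image] + list(prefix_frames)[:MAX_PREFIX_IMAGES - 1]
    return list(prefix_frames)[:MAX_PREFIX_IMAGES]


user_images = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(build_prefix_list(user_images, bg_image="bg"))
print(build_prefix_list(user_images, bg_image="bg", replacement_mode=True))
```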

Quantization

The WanAnimatePlus ModelLoader supports mxfp8 weight quantization for reduced VRAM usage.

| Option | Description |
| --- | --- |
| `mxfp8` | MXFP8 (Microscaling FP8) quantization format. Uses shared scaling factors across blocks of elements, providing better accuracy than per-tensor FP8 at similar memory savings. Requires compatible quantized model weights. |

Hardware requirements: Hardware-accelerated MXFP8 matmul requires an NVIDIA Blackwell GPU (compute capability >= 10.0, e.g., RTX 5090 / B100 / B200). On non-Blackwell GPUs, MXFP8 weights are automatically dequantized to BF16 at load time for normal inference (no VRAM savings, but the model runs correctly).

Auto-detection: When the quantization dropdown is set to disabled, MXFP8 weights are auto-detected from the state dict by scanning for float8_e8m0fnu block scale tensors. No manual selection is needed.
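
A minimal sketch of the detection and fallback logic described above, under stated assumptions: has_mxfp8_scales and pick_mxfp8_path are hypothetical names, the state dict is treated as any mapping whose values expose a dtype attribute, and the dtype is matched by its string name so the example runs without PyTorch. The loader's real implementation may differ.

```python
from types import SimpleNamespace


def has_mxfp8_scales(state_dict) -> bool:
    """True if any entry has the float8_e8m0fnu dtype used for MXFP8 block scales."""
    return any(
        str(getattr(tensor, "dtype", "")).endswith("float8_e8m0fnu")
        for tensor in state_dict.values()
    )


def pick_mxfp8_path(state_dict, compute_capability, has_comfy_kitchen: bool) -> str:
    """Choose between hardware MXFP8 matmul and the BF16 dequantization fallback."""
    if not has_mxfp8_scales(state_dict):
        return "not_mxfp8"
    major, _minor = compute_capability
    # Hardware acceleration needs a Blackwell GPU (compute capability >= 10.0)
    # and the optional comfy-kitchen package.
    if major >= 10 and has_comfy_kitchen:
        return "hardware_mxfp8"
    return "dequantize_to_bf16"


# Stand-in objects instead of real tensors; the key names are made up.
fake_sd = {
    "blocks.0.weight": SimpleNamespace(dtype="torch.float8_e4m3fn"),
    "blocks.0.weight_scale": SimpleNamespace(dtype="torch.float8_e8m0fnu"),
}
print(pick_mxfp8_path(fake_sd, (12, 0), has_comfy_kitchen=True))
print(pick_mxfp8_path(fake_sd, (8, 9), has_comfy_kitchen=True))
```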

Installation (optional): Hardware acceleration requires comfy-kitchen:

pip install comfy-kitchen==0.2.10

This package is not a hard dependency; it is only needed for Blackwell GPU acceleration. Without it, the model still runs via the dequantization fallback.

MXFP8 quantization support is derived from comfy-kitchen (Apache 2.0).

Project Structure

ComfyUI-WanAnimatePlus/
├─ wanvideo/                 # WanVideo core model code
├─ nodes.py                  # Core WanAnimatePlus embeds / encode / decode nodes
├─ nodes_sampler.py          # Core WanAnimatePlus sampler / scheduler nodes
├─ nodes_model_loading.py    # Core WanAnimatePlus model / VAE / LoRA / block swap nodes
├─ context_windows/          # Context-window scheduling
├─ cache_methods/            # Cache acceleration
├─ utils.py                  # Shared utilities
├─ docs/
│  └─ images/                # Documentation images
├─ example_workflows/        # Example workflows
├─ __init__.py               # Node registration entry point
├─ pyproject.toml
├─ requirements.txt
└─ LICENSE

FAQ

1. Nodes not showing after installation

Verify the repo path is ComfyUI/custom_nodes/ComfyUI-WanAnimatePlus
Ensure the original ComfyUI-WanVideoWrapper is also installed
Restart ComfyUI and search for WanAnimatePlus in the node list

2. Conflicts with original nodes?

No. All node names use the WanAnimatePlus prefix, so they do not collide with the original WanVideo-prefixed nodes. Both packages can be installed side by side.

3. How many images for prefix_frames?

Three images are recommended. Up to 5 are accepted (extras are truncated). The node also works with fewer than 3, but the coverage range will be smaller.

4. How many frames for transition_video?

Input is automatically cropped to 21 frames (padded with the first frame if insufficient).
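
The cropping and padding behavior can be illustrated with a short sketch. The function name normalize_transition is hypothetical, and two details are assumptions: that cropping keeps the last 21 frames (consistent with passing the tail of the previous segment), and that padding prepends copies of the first frame.

```python
def normalize_transition(frames, target: int = 21) -> list:
    """Return exactly `target` frames: keep the tail, or pad with the first frame."""
    frames = list(frames)
    if not frames:
        raise ValueError("transition_video must contain at least one frame")
    if len(frames) >= target:
        # Assumption: the most recent frames are the ones that matter.
        return frames[-target:]
    # Assumption: missing frames are filled by repeating the first frame at the front.
    padding = [frames[0]] * (target - len(frames))
    return padding + frames


print(len(normalize_transition(range(50))))  # longer input is cropped to 21
print(normalize_transition(["a", "b"])[:3])  # shorter input starts with repeated "a"
```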

Acknowledgments

Modified from kijai/ComfyUI-WanVideoWrapper. Deep respect to the original author for their tremendous contributions to the WanVideo ecosystem.

Contact

Bilibili: @wuwukasi
Email: wuwukawayi@gmail.com

Sponsorship

If you find this project helpful, consider supporting me! Your support is what keeps this project going and motivates me to continue improving it.

Every contribution, no matter how small, means a lot and helps me dedicate more time to development and new features. Thank you!

License

This project is an independently maintained fork / derivative project based on kijai/ComfyUI-WanVideoWrapper and is released under the Apache License, Version 2.0. Thanks again to kijai and the original contributors for their work.

Modified portions and newly added code are Copyright (c) 2026 wuwukasi/wuwukaka. See NOTICE for attribution and detailed modification notice requirements. Downstream projects that use or modify the WanAnimatePlus additions should preserve the copyright/modification notices and include a detailed notice in their README, NOTICE file, or equivalent attribution document. That notice should identify the wuwukasi/wuwukaka-derived modules, files, or feature areas and describe any downstream modifications made to those portions.

The wuwukasi/wuwukaka additions include WanAnimatePlus-specific node registration/renaming and integration code, prefix/transition video conditioning, Bernini in-context conditioning, SCAIL-2 embeds support including prefix-mask handling and sampler freeze/prepend integration, EverAnimate embeds support, sampler/context-window changes, cache/inference safeguards, and related custom-op handling.
