This repository provides FlashPortrait custom nodes for ComfyUI. It allows you to generate infinite-length portrait animations driven by a video, directly within your ComfyUI workflow.
Original Paper: FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction
- Automatic Model Downloading: No manual weight placement required. The loader fetches the Wan2.1 and FlashPortrait models from Hugging Face automatically.
- Infinite Length Support: Uses FlashPortrait's sliding window mechanism to process long driving videos.
- Flexible Inputs:
- Reference Image: Defines the identity (ID).
- Driving Video: Defines the motion and expression.
- Clone the Repository: Go to your ComfyUI `custom_nodes` folder and run:

  ```bash
  cd ComfyUI/custom_nodes
  git clone https://github.com/okdalto/ComfyUI-FlashPortrait
  ```

- Install Dependencies:

  ```bash
  cd ComfyUI/custom_nodes/ComfyUI-FlashPortrait
  pip install -r requirements.txt
  ```

  Note: Requires approx. 40 GB of VRAM for full model loading (BF16).

- Restart ComfyUI.
You can find a basic ComfyUI workflow in examples/flash_portraits_example.json.
Simply drag and drop this file into your ComfyUI window to load the graph.
Loads the heavy models (Transformer, VAE) and face alignment tools.
- precision: `bf16` (recommended), `fp16`, `fp32`.
- GPU_memory_mode: `default`, `sequential_cpu_offload`, `model_cpu_offload_and_qfloat8`.
- download_missing: Enable this to automatically download models to `ComfyUI/models/flash_portrait/` on first run.
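The `download_missing` behavior amounts to a check-then-fetch step. The sketch below is illustrative only: the helper name and on-disk layout are assumptions, and `download_fn` stands in for a real downloader such as `huggingface_hub.snapshot_download`.

```python
from pathlib import Path

def ensure_weights(models_dir, repo_id, download_fn):
    """Fetch a model snapshot into models_dir only if it is not already there.

    Hypothetical helper: download_fn stands in for a downloader such as
    huggingface_hub.snapshot_download; the directory layout is an assumption.
    """
    target = Path(models_dir) / repo_id.split("/")[-1]
    if not target.exists():
        # First run: pull the weights; later runs find the folder and skip this.
        download_fn(repo_id=repo_id, local_dir=str(target))
    return target
```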
Extracts motion and expression features from the driving video.
- images: Connect your driving video frames here (use `Load Video` or similar nodes).
- source_fps: Frame rate of the source video (default 25.0).
- context_size / context_overlap: Controls the sliding window size for feature extraction.
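The sliding window can be pictured as fixed-size chunks of frames where consecutive chunks share `context_overlap` frames. The schedule below is a minimal sketch of that idea, not FlashPortrait's exact implementation.

```python
def context_windows(num_frames, context_size, context_overlap):
    """Yield (start, end) frame-index windows that overlap by context_overlap.

    Illustrative sliding-window schedule; the real extractor's scheduling
    and blending of overlapping features may differ.
    """
    stride = context_size - context_overlap
    assert stride > 0, "context_overlap must be smaller than context_size"
    start = 0
    while True:
        end = min(start + context_size, num_frames)
        yield (start, end)
        if end == num_frames:
            break
        start += stride
```

For example, 10 frames with `context_size=4` and `context_overlap=2` produce windows `(0, 4), (2, 6), (4, 8), (6, 10)`.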
The main generation node.
- pipe: Connect from the Loader node.
- head_emo_features: Connect from the Feature Extractor node.
- image: The Reference Image (identity). Only the first frame is used.
- prompt / negative_prompt: Text guidance.
- guidance_scale / text_cfg_scale / emo_cfg_scale: Control the influence of unconditional, text, and emotion guidance.
- steps: Inference steps (default 30).
- max_size: Output resolution height (e.g., 720).
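As a rough mental model, the separate text and emotion scales can be read as a cascaded classifier-free-guidance blend. The exact formulation inside FlashPortrait may differ; the helper below is purely illustrative, using scalars as stand-ins for latent predictions.

```python
def combine_guidance(uncond, text_cond, emo_cond, text_cfg_scale, emo_cfg_scale):
    """Cascaded CFG-style blend of unconditional, text, and emotion predictions.

    Illustrative only: a common multi-condition pattern, not necessarily
    FlashPortrait's exact formula. Each scale pushes the output further
    along one conditioning direction.
    """
    return (uncond
            + text_cfg_scale * (text_cond - uncond)
            + emo_cfg_scale * (emo_cond - text_cond))
```

With both scales at 1.0 the result reduces to the emotion-conditioned prediction; raising either scale exaggerates that conditioning direction.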
This implementation is based on the amazing work by the original FlashPortrait team. Huge congratulations and thanks to: Shuyuan Tu, Yueming Pan, Yinming Huang, Xintong Han, Zhen Xing, Qi Dai, Kai Qiu, Chong Luo, Zuxuan Wu
And to the open-source projects that made this possible:
```bibtex
@article{tu2025flashportrait,
  title={FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction},
  author={Tu, Shuyuan and Pan, Yueming and Huang, Yinming and Han, Xintong and Xing, Zhen and Dai, Qi and Qiu, Kai and Luo, Chong and Wu, Zuxuan},
  journal={arXiv preprint arXiv:2512.16900},
  year={2025}
}
```