Over the last few months I have spent a lot of time sampling from this model. Some tips:
1) You can generate videos even on small GPUs: just decrease the number of frames you decode at a time, since decoding eats most of the VRAM. 14 frames (decoding one at a time) should need less than 20GB of VRAM.
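A rough sketch of what tip 1 means, in plain Python with a stand-in decoder. `decode_in_chunks` and `toy_decode` are made-up names for illustration, not part of any library; a real decoder would be the model's VAE. The point is only that peak memory follows the chunk size, not the total frame count. If you sample through the diffusers `StableVideoDiffusionPipeline`, its `decode_chunk_size` argument plays this role.

```python
from typing import Callable, List, Sequence


def decode_in_chunks(
    latents: Sequence,
    decode_batch: Callable[[Sequence], List],
    chunk_size: int = 1,
) -> List:
    """Decode latent frames a few at a time instead of all at once.

    Only `chunk_size` frames are handed to the decoder per call, so the
    decoder's peak memory depends on the chunk size rather than on the
    total number of frames.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    frames: List = []
    for start in range(0, len(latents), chunk_size):
        chunk = latents[start:start + chunk_size]
        frames.extend(decode_batch(chunk))
    return frames


# Hypothetical stand-in for a VAE decoder: it records how many frames it
# was asked to decode per call, which is a proxy for peak memory use.
batch_sizes: List[int] = []


def toy_decode(batch: Sequence) -> List[str]:
    batch_sizes.append(len(batch))
    return [f"frame<{latent}>" for latent in batch]


# 14 latent frames, decoded one at a time as in the tip above.
frames = decode_in_chunks(list(range(14)), toy_decode, chunk_size=1)
```

A larger `chunk_size` decodes faster but needs more memory per call; 1 is the slowest and most memory-friendly setting.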
Stability releases Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
model: huggingface.co/stabilityai/st…
We present Stable Video Diffusion, a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation.









