
feat(stablediffusion-ggml): LTX-2 support + LTX-2.3 GGUF gallery entries #9980

Merged
mudler merged 1 commit into master from feat/ltx-sd-cpp
May 25, 2026


@localai-bot

Collaborator

Summary

  • Wire audio_vae_path and embeddings_connectors_path through backend/go/stablediffusion-ggml/cpp/gosd.cpp so the upstream LTX-2 fields on sd_ctx_params_t (added in the currently pinned commit) are reachable from gallery entries.
  • New gallery/ltx-ggml.yaml template config matching the upstream LTX-2 CLI recipe: stablediffusion-ggml backend, sampler=euler, cfg_scale=6.0, step=30, vae_decode_only:false, diffusion_flash_attn:true, offload_params_to_cpu:true, diffusion_model flag.
  • Six new LTX-2.3 22B GGUF gallery entries based on the unsloth/LTX-2.3-GGUF repo:
    • dev: UD-Q4_K_M, Q4_K_M, Q8_0
    • distilled: UD-Q4_K_M, Q4_K_M, Q8_0
  • Each entry pulls the model GGUF + gemma-3-12b-it-qat-UD-Q4_K_XL.gguf text encoder + video VAE + audio VAE + embeddings_connectors safetensors. SHA256s fetched via the HF x-linked-etag method.
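
For readers unfamiliar with the x-linked-etag method mentioned above, here is a minimal Python sketch of the idea, not the script used for this PR. It assumes that Hugging Face answers a HEAD request on a file's `resolve` URL with an `X-Linked-ETag` header carrying the SHA256 of an LFS-backed file, and that the header is present on the redirect response itself. The helper names `sha256_from_headers` and `fetch_sha256` are hypothetical.

```python
import re
import urllib.error
import urllib.request

_SHA256_RE = re.compile(r"^[0-9a-f]{64}$")


def sha256_from_headers(headers):
    """Return the SHA256 carried in an X-Linked-ETag header, or None.

    `headers` is any object with .items() yielding (name, value) pairs;
    the header name is matched case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    etag = lowered.get("x-linked-etag")
    if etag is None:
        return None
    etag = etag.strip()
    if etag.startswith("W/"):  # drop a weak-validator prefix if present
        etag = etag[2:]
    etag = etag.strip('"').lower()  # ETag values are wrapped in quotes
    # Only accept values that look like a SHA256 hex digest.
    return etag if _SHA256_RE.match(etag) else None


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop at the first response instead of following the CDN redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def fetch_sha256(url):
    """HEAD `url` without following redirects and extract the SHA256."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request) as response:
            return sha256_from_headers(response.headers)
    except urllib.error.HTTPError as err:
        # The un-followed redirect surfaces here; its headers are still usable.
        return sha256_from_headers(err.headers)
```

`fetch_sha256` expects a URL of the form `https://huggingface.co/<repo>/resolve/<revision>/<file>`. Validating the digest shape guards against recording a non-LFS ETag (for example a 40-character git blob hash) as a SHA256.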

Upstream LTX-2 stable-diffusion.cpp doc: https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/ltx2.md

Test plan

  • Build the stablediffusion-ggml backend locally and confirm the new options are parsed (look for Found audio_vae_path / Found embeddings_connectors_path style log lines once a gallery entry is installed)
  • Install ltx-2.3-22b-distilled-ggml from the gallery (lowest-footprint variant) and run a short T2V request: width=1280, height=720, video_frames=33, fps=24
  • Repeat with an init_image to exercise the I2V path
  • Repeat with init_image + end_image for the FLF2V path
  • Confirm gallery/index.yaml round-trips through the gallery loader (994 entries; LTX-2.3 GGUF entries visible in the React UI gallery)
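
As a quick sanity check on what "short" means for the T2V request above, the clip length follows directly from the frame count and frame rate. The helper name below is hypothetical; only the `video_frames` and `fps` values come from the test plan.

```python
from fractions import Fraction


def clip_duration_seconds(video_frames, fps):
    """Playback length of a clip: frame count divided by frames per second."""
    if video_frames <= 0 or fps <= 0:
        raise ValueError("video_frames and fps must be positive")
    return Fraction(video_frames, fps)


# Values from the test plan: video_frames=33, fps=24.
duration = clip_duration_seconds(33, 24)
print(float(duration))  # 1.375
```

So the proposed request produces a clip of roughly 1.4 seconds, which keeps the first end-to-end run cheap.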

stable-diffusion.cpp gained LTX-2 video generation, which requires an
audio VAE and an embeddings_connectors safetensors in addition to the
usual diffusion model, VAE, and LLM text encoder. The pinned commit
exposes audio_vae_path and embeddings_connectors_path on
sd_ctx_params_t; wire both through the option parser so gallery entries
can point at the LTX-specific assets.
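
To illustrate the wiring described above, the following Python sketch shows the general shape of mapping `name:value` option strings onto the two new fields. It is an illustration only: the real change is C++ in gosd.cpp, and the `CtxParams` class, the `apply_options` helper, the exact log format, and the file names are assumptions rather than code from this PR.

```python
from dataclasses import dataclass


@dataclass
class CtxParams:
    """Hypothetical stand-in for the two LTX-2 fields on sd_ctx_params_t."""
    audio_vae_path: str = ""
    embeddings_connectors_path: str = ""


# Option names this sketch knows how to forward.
PATH_OPTIONS = ("audio_vae_path", "embeddings_connectors_path")


def apply_options(options, params, log=print):
    """Copy recognised `name:value` options onto `params`; ignore the rest."""
    for option in options:
        # Split on the first colon only, so values may contain colons.
        name, separator, value = option.partition(":")
        name, value = name.strip(), value.strip()
        if name in PATH_OPTIONS and separator and value:
            setattr(params, name, value)
            log(f"Found {name}: {value}")
    return params


if __name__ == "__main__":
    # Hypothetical option list, shaped like a gallery entry's options.
    apply_options(
        [
            "audio_vae_path:audio_vae.safetensors",
            "embeddings_connectors_path:connectors.safetensors",
            "vae_decode_only:false",
        ],
        CtxParams(),
    )
```

Options the sketch does not recognise are skipped rather than rejected, so unrelated settings in the same list pass through untouched.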

Ship six LTX-2.3 GGUF gallery entries (dev + distilled, UD-Q4_K_M /
Q4_K_M / Q8_0 each) backed by a new ltx-ggml.yaml template that
defaults to euler / cfg_scale 6.0 / vae_decode_only:false /
diffusion_flash_attn / offload_params_to_cpu — matching the upstream
LTX-2 CLI recipe. Each entry pulls the model GGUF plus the QAT
gemma-3-12b-it text encoder, video VAE, audio VAE, and embeddings
connectors needed for T2V / I2V / FLF2V.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude-Code]
@mudler mudler merged commit b02e3ff into master May 25, 2026
63 checks passed
@mudler mudler deleted the feat/ltx-sd-cpp branch May 25, 2026 11:00
@localai-bot localai-bot added the enhancement New feature or request label Jun 10, 2026
