
Feat anima lllite #2317

Merged
kohya-ss merged 8 commits into sd3 from feat-anima-lllite on May 7, 2026

Conversation

@kohya-ss (Owner) commented Apr 25, 2026

An experimental implementation of ControlNet-LLLite for Anima.

This feature is experimental and may change. The hyperparameters are unknown. Community contributions and research are welcome.

The experimental ComfyUI node has been released here:
https://github.com/kohya-ss/ComfyUI-Anima-LLLite

The sample weight is uploaded here:
https://huggingface.co/kohya-ss/Anima-LLLite

The architecture has been significantly updated. Previous sample weights are no longer usable.

The model was trained for 4 epochs with 2000 pairs of Anima-generated images and pseudo-lineart images. Batch size was 6, and the learning rate was 2e-4.

```shell
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 \
  anima_train_control_net_lllite.py \
  --pretrained_model_name_or_path path/to/anima-preview3-base.safetensors \
  --vae path/to/qwen_image_vae.safetensors \
  --qwen3 path/to/qwen_3_06b_base.safetensors \
  --output_dir=path/to/anima_lora \
  --mixed_precision bf16 --save_precision bf16 \
  --optimizer_type adamw8bit --learning_rate 2e-4 \
  --max_data_loader_n_workers 2 --persistent_data_loader_workers \
  --dataset_config=path/to/anima-lineart-v1.toml \
  --lllite_mlp_dim 32 --cond_emb_dim 32 \
  --output_name=anima-lllite-test-1 \
  --logging_dir ./logs --log_prefix anima-lllite-test-1- \
  --max_train_epochs 4 --highvram --vae_batch_size 8 \
  --save_every_n_epochs 1 \
  --cache_text_encoder_outputs_to_disk --cache_latents_to_disk \
  --attn_mode flash --save_model_as safetensors \
  --timestep_sampling shift --discrete_flow_shift 3.0 --seed 42
```
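The dataset config passed via `--dataset_config` is not shown in the PR. A minimal sketch, assuming the standard sd-scripts ControlNet dataset layout where `conditioning_data_dir` holds the control (lineart) images paired by filename with the training images; all paths and values here are illustrative, not the config actually used:

```toml
[general]
caption_extension = ".txt"

[[datasets]]
batch_size = 6
resolution = 1024

  [[datasets.subsets]]
  image_dir = "path/to/anima_images"          # 2000 Anima-generated images
  conditioning_data_dir = "path/to/lineart"   # matching pseudo-lineart images
```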

I used tori29umai's LoRA for FramePack to create the lineart images. Thank you very much.

kohya-ss and others added 3 commits April 25, 2026 22:12
- Implemented LLLiteModuleDiT and ControlNetLLLiteDiT classes for enhanced attention mechanisms.
- Introduced callbacks on prompt start and end in sample_images function for better control during image sampling.
- Added support for additional network multipliers in line_to_prompt_dict function.
- Created a manual test script for end-to-end verification of LLLite functionality without real data.
- Updated training script to ensure dataset strategies are set correctly during debugging.

Co-authored-by: Copilot <copilot@github.com>
@kohya-ss (Owner, Author)

It seems the prompt needs to match the content of the control image to some extent.

Control image: (attachment: lineart1)

Generated image with ComfyUI: (attachment: ComfyUI_00282_)

kohya-ss and others added 2 commits April 26, 2026 21:31
…weight keys)

Extend the Anima ControlNet-LLLite to v2 architecture and switch to a
self-describing weight key format.

Architecture (v2):
- Deeper conditioning1 trunk (Conv s=4 x2 + Conv s=1 + GN/SiLU + ResBlocks
  + final LayerNorm) for a wider receptive field.
- FiLM (gamma, beta) modulation on top of the concat-then-mid path inside
  each LLLite module, zero-initialized to start from identity.
- Per-module depth embedding (zero-init bias) added to the shared cond_emb.
- Atomic target_layers specifiers, including a new mlp_fc1_pre target that
  injects LLLite into the GPT2FeedForward fc1.
- Optional ASPP tail in conditioning1, switchable via --lllite_use_aspp.
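The zero-initialized FiLM modulation described above can be sketched as follows. This is a minimal illustration, not the repo's actual class; the module and parameter names are assumed. The key property is that with the projection zero-initialized, `x * (1 + gamma) + beta == x`, so each LLLite module starts as an identity and leaves the frozen DiT unchanged at the beginning of training:

```python
import torch
import torch.nn as nn


class FiLM(nn.Module):
    """Predicts per-channel (gamma, beta) from a conditioning embedding.

    The projection is zero-initialized, so gamma = beta = 0 at the start
    of training and the module is exactly the identity mapping.
    """

    def __init__(self, cond_dim: int, feat_dim: int):
        super().__init__()
        self.proj = nn.Linear(cond_dim, feat_dim * 2)
        nn.init.zeros_(self.proj.weight)
        nn.init.zeros_(self.proj.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.proj(cond).chunk(2, dim=-1)
        return x * (1.0 + gamma) + beta


film = FiLM(cond_dim=32, feat_dim=64)
x = torch.randn(2, 16, 64)       # (batch, tokens, features)
cond = torch.randn(2, 16, 32)    # conditioning embedding per token
out = film(x, cond)
assert torch.allclose(out, x)    # identity at initialization
```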

Weight key format (named, sd-scripts LoRA-style):
- Per-module keys are prefixed by the target Linear's full path, e.g.
  lllite_dit_blocks_0_self_attn_q_proj.down.weight, so the file alone
  identifies which Linear each tensor belongs to.
- Shared conditioning encoder is stored under lllite_conditioning1.*.
- depth_embeds is split per-module as {name}.depth_embed (cond_emb_dim,)
  for self-describability and easier per-block surgery later.
- Legacy formats (lllite_modules.{i}.* keys; v1/v2 in-development snapshots)
  are rejected on load with an explicit error.
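The named-key format and the legacy rejection can be illustrated with a small sketch. This mirrors the behavior described above but is not the repo's actual `load_lllite_weights`; the function name and error message here are hypothetical:

```python
def check_lllite_keys(keys):
    """Reject legacy index-based keys; return the per-module names
    recoverable from the self-describing named-key format."""
    legacy = [k for k in keys if k.startswith("lllite_modules.")]
    if legacy:
        raise ValueError(f"legacy LLLite weight format is not supported: {legacy[:3]}")
    # Shared conditioning encoder lives under lllite_conditioning1.*;
    # everything else is "{module_name}.{param}" where module_name encodes
    # the full path of the target Linear.
    return sorted(
        {k.split(".")[0] for k in keys if not k.startswith("lllite_conditioning1")}
    )


names = check_lllite_keys([
    "lllite_dit_blocks_0_self_attn_q_proj.down.weight",
    "lllite_dit_blocks_0_self_attn_q_proj.depth_embed",
    "lllite_conditioning1.conv1.weight",
])
# the file alone identifies which Linear each tensor belongs to
```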

Tests:
- networks/control_net_lllite_anima.py __main__ covers parse_target_layers,
  module count for all preset/atomic combinations, ASPP on/off, zero-init
  forward (3D and 5D), depth_embeds non-zero invariance, save/load
  round-trip with the new named keys, and legacy-format reject.
- tests/manual_test_anima_lllite_dryrun.py exercises build / apply_to /
  forward / backward (grads on LLLite only) / save+load round-trip on a
  stub DiT.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…te training script for debug_dataset

doc: Update Anima LLLite doc.
@kohya-ss (Owner, Author)

The architecture has been significantly updated. I have released four weights: line art, pose, depth map, and scribble.

Copilot AI (Contributor) left a comment

Pull request overview

Adds an experimental ControlNet-LLLite implementation tailored for Anima’s DiT, including training/inference entrypoints, weight save/load format, and sampling-time control-image injection.

Changes:

  • Introduces the Anima-specific ControlNet-LLLite module (v2) with configurable target-layer attachment, conditioning trunk, FiLM modulation, and depth embeddings.
  • Adds a dedicated Anima training script and a minimal inference script, plus support for per-prompt --cn (control image) and --am (multiplier) during sample generation.
  • Updates prompt parsing and sample-image utilities to support per-prompt hooks and additional network multipliers.
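The per-prompt flag handling can be sketched as a toy parser. This is an illustration of the idea only; the real `line_to_prompt_dict` in `library/train_util.py` handles many more flags, and the dictionary keys used here are assumed:

```python
import shlex


def parse_prompt_line(line: str) -> dict:
    """Split a sample-prompt line into prompt text plus the per-prompt
    options described in this PR: --cn (control image path) and
    --am (comma-separated additional network multipliers)."""
    prompt_parts, opts = [], {}
    tokens = shlex.split(line)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--cn":
            opts["controlnet_image"] = tokens[i + 1]
            i += 2
        elif tok == "--am":
            opts["network_multipliers"] = [float(v) for v in tokens[i + 1].split(",")]
            i += 2
        else:
            prompt_parts.append(tok)
            i += 1
    opts["prompt"] = " ".join(prompt_parts)
    return opts


d = parse_prompt_line("1girl, lineart --cn ./cond/lineart1.png --am 1.0,0.8")
# d["prompt"] == "1girl, lineart"
```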

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Summary per file:

| File | Description |
| --- | --- |
| networks/control_net_lllite_anima.py | Core Anima ControlNet-LLLite implementation, wrapper, and save/load helpers. |
| anima_train_control_net_lllite.py | New training script that freezes the DiT and trains only LLLite, including sample hooks for control images/multipliers. |
| anima_minimal_inference_control_net_lllite.py | New inference script that patches anima_minimal_inference to attach LLLite and apply control images. |
| library/anima_train_utils.py | Adds per-prompt start/end callbacks around sampling to inject transient state (e.g., the LLLite cond image). |
| library/train_util.py | Extends prompt-line parsing to support --am (additional network multiplier list). |
| sdxl_train_control_net_lllite.py | Ensures dataset strategies are set before debug_dataset runs. |
| docs/anima_train_control_net_lllite.md | Adds an Anima ControlNet-LLLite training/inference guide (needs alignment with the actual weight key format). |
| tests/manual_test_anima_lllite_dryrun.py | Manual CPU dry-run that validates module wiring, gradients, and save/load round-trip. |


Comment threads:
- networks/control_net_lllite_anima.py
- docs/anima_train_control_net_lllite.md
- docs/anima_train_control_net_lllite.md (outdated)
- docs/anima_train_control_net_lllite.md (outdated)
- docs/anima_train_control_net_lllite.md (outdated)
…ormat

The saved .safetensors uses a named-key format (per-module `{lllite_name}.*`
with `depth_embeds` split into per-module `{lllite_name}.depth_embed`), but
the doc described the internal state_dict layout (`lllite_modules.{i}.*` and
stacked `depth_embeds`). load_lllite_weights rejects the latter as legacy,
so the doc could mislead users into producing incompatible weight files.
@kohya-ss kohya-ss merged commit cce63a1 into sd3 May 7, 2026
3 checks passed
@kohya-ss kohya-ss deleted the feat-anima-lllite branch May 7, 2026 13:19