Merged
Conversation
[agents docs] add pipelines.md and restructure review rules

- Add .ai/pipelines.md: pipeline conventions and gotchas (config-derived values, no_grad discipline, reinventing scheduler logic, subclassing variants, `# Copied from` annotations).
- models.md: add Attention masks subsection inside Attention pattern; fold reference-implementations skim into conventions; consolidate __init__.py / _import_structure gotchas; trim gotchas covered by AGENTS.md (silent fallbacks, config serialization gap) or pipelines.md (no_grad, guider/scheduler reuse).
- review-rules.md: collapse to a short reviewer checklist that points into AGENTS / models / pipelines / modular gotchas; only LLM-specific pattern (ephemeral context) lives here directly.
- AGENTS.md: collapse defensive-code / unused-params / backwards-compat / deprecation rules into one umbrella bullet; replace inline pipeline bullet list with a pointer to pipelines.md.
- SKILL.md (model-integration): trim pre-PR self-review to a one-line pointer.

Sourced from the ACE-Step PR (#13095) review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yiyixuxu
commented
Apr 27, 2026
Co-authored-by: YiYi Xu <yixu310@gmail.com>
yiyixuxu
commented
Apr 27, 2026
> ## Common model conventions
>
> - Models use `ModelMixin` with `register_to_config` for config serialization
>
> When adding a new transformer (or reviewing one), skim `transformer_flux.py`, `transformer_flux2.py`, `transformer_qwenimage.py`, `transformer_wan.py` first to establish the pattern. Most conventions (mixin set, file structure, naming, gradient-checkpointing implementation, `_no_split_modules` / `_supports_cache_class` settings, etc.) are easiest to internalize by comparison rather than from a fixed list.
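The `register_to_config` convention above can be sketched without diffusers installed. The decorator below only mimics the spirit of diffusers' `register_to_config` (capturing every `__init__` argument into a serializable config dict); it is an illustration, not the real implementation, and `ToyTransformer` is a made-up class:

```python
import functools
import inspect


def register_to_config(init):
    """Toy stand-in for diffusers' register_to_config: record every
    __init__ argument (explicit or defaulted) into self.config."""
    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        sig = inspect.signature(init)
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        # Drop `self`; everything else becomes the serialized config.
        self.config = {k: v for k, v in bound.arguments.items() if k != "self"}
        init(self, *args, **kwargs)
    return wrapper


class ToyTransformer:
    @register_to_config
    def __init__(self, hidden_size=64, num_layers=2):
        self.hidden_size = hidden_size
        self.num_layers = num_layers


model = ToyTransformer(hidden_size=128)
print(model.config)  # {'hidden_size': 128, 'num_layers': 2}
```

The key property is that defaulted arguments are captured too, so a saved config round-trips exactly even for parameters the caller never passed.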
Collaborator
Author
I think it works better this way: we provide the list of canonical references for them and explicitly ask them to derive the common conventions themselves.
yiyixuxu
commented
Apr 27, 2026
> @@ -0,0 +1,62 @@
> # Pipeline conventions and rules
Collaborator
Author
added a pipeline doc for agents
sayakpaul
reviewed
Apr 27, 2026
> - Support `generator` parameter for reproducibility
> - Use `self.progress_bar(timesteps)` for progress tracking
> - Don't subclass an existing pipeline for a variant — DO NOT use an existing pipeline class (e.g., `FluxPipeline`) to override another pipeline (e.g., `FluxImg2ImgPipeline`) that will be part of the core codebase (`src`)
> - See [pipelines.md](pipelines.md) for pipeline conventions, patterns, and gotchas.
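The `generator` convention in the first bullet can be illustrated with the stdlib alone. Real pipelines take a `torch.Generator`; the sketch below uses `random.Random` as a hypothetical stand-in to show the same idea, namely that drawing noise from an explicitly passed generator (rather than the global RNG) makes calls reproducible without mutating global state:

```python
import random


def toy_denoise(num_steps, generator=None):
    """Toy stand-in for a pipeline call: draws its "noise" from an
    explicit generator instead of the global RNG."""
    rng = generator if generator is not None else random.Random()
    return [rng.random() for _ in range(num_steps)]


# Two calls with equally-seeded generators produce identical "latents",
# and the global random module's state is untouched.
a = toy_denoise(4, generator=random.Random(0))
b = toy_denoise(4, generator=random.Random(0))
assert a == b
```

This is why the parameter matters: seeding globally (`random.seed(0)` / `torch.manual_seed(0)`) would make reproducibility depend on everything else that consumed the global stream before the call.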
> 6. **Config serialization gaps.** Every `__init__` parameter in a `ModelMixin` subclass must be captured by `register_to_config`. If you add a new param but forget to register it, `from_pretrained` will silently use the default instead of the saved value.
>
> 7. **Forgetting to update `_import_structure` and `_lazy_modules`.** The top-level `src/diffusers/__init__.py` has both -- missing either one causes partial import failures.
>
> 4. **Capability flags without matching implementation.** Class attributes like `_supports_cache_class`, `_no_split_modules`, `_supports_gradient_checkpointing`, `_supports_quantization` advertise a contract. Setting one without the corresponding logic (e.g. `_supports_gradient_checkpointing = True` with no `_set_gradient_checkpointing` method, or no `if gradient_checkpointing` branches in `forward`) means training code silently no-ops or errors deep inside `torch.utils.checkpoint`. For `_supports_cache_class` / `_no_split_modules` specifically, wrong values cause silent correctness bugs or OOM. Copy from a similar model, verify the corresponding logic is in place, and for inference-only ports just drop the flag.
Member
_supports_quantization -- we don't have this attribute.
stevhliu
reviewed
Apr 27, 2026
Member
stevhliu
left a comment
thanks, very well thought out!
> Common mistakes are covered in the common-mistakes / gotcha sections in [AGENTS.md](AGENTS.md), [models.md](models.md), [pipelines.md](pipelines.md), and [modular.md](modular.md). Additionally, watch for the patterns below, which aren't covered there:
>
> - **Ephemeral context.** Comments, docstrings, and files that only made sense to the current PR's author or reviewer don't help a future reader/user/developer. Examples: `# per reviewer comment on PR #NNNN`, `# as discussed in review`, `# TODO from offline chat`, debug printouts. Same for files: parity harnesses, comparison scripts, anything in `scripts/` with hardcoded developer paths or imports from the reference repo. State the *reason* so the comment stands alone, or drop it.
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
fix capability-flags gotcha: drop fake attrs, tighten to real failure modes

`_supports_quantization` and `_supports_cache_class` don't exist in diffusers (sayak flagged the first; the second was also fabricated). Replaced with the two flags where the "advertised but unbacked" pattern is a real mistake: `_supports_gradient_checkpointing` (needs `if self.gradient_checkpointing:` branches in forward) and `_no_split_modules` (needs correct block class names for `device_map`). Dropped `_supports_group_offloading`: its realistic failure mode is forgetting to opt out, not opt in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
terarachang
pushed a commit
to terarachang/diffusers
that referenced
this pull request
Apr 30, 2026
ghostxsl
pushed a commit
to ghostxsl/diffusers
that referenced
this pull request
May 6, 2026