build(deps): bump the pip group across 3 directories with 19 updates #3
Closed
dependabot[bot] wants to merge 1 commit into
Updates the requirements on [datasets](https://github.com/huggingface/datasets), [setuptools](https://github.com/pypa/setuptools), [setuptools-scm](https://github.com/pypa/setuptools-scm), [pandas](https://github.com/pandas-dev/pandas), [huggingface-hub](https://github.com/huggingface/huggingface_hub), [transformers](https://github.com/huggingface/transformers), [trl](https://github.com/huggingface/trl), data-designer-engine, [pytest](https://github.com/pytest-dev/pytest), [pytest-rerunfailures](https://github.com/pytest-dev/pytest-rerunfailures), [scikit-learn](https://github.com/scikit-learn/scikit-learn), [torchao](https://github.com/pytorch/ao), [chardet](https://github.com/chardet/chardet), [faker](https://github.com/joke2k/faker), [fsspec](https://github.com/fsspec/filesystem_spec), [python-json-logger](https://github.com/nhairs/python-json-logger), [sqlfluff](https://github.com/sqlfluff/sqlfluff), [data-designer](https://github.com/NVIDIA-NeMo/DataDesigner) and data-designer-config to permit the latest version. 
Updates `datasets` to 4.5.0
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@3.4.1...4.5.0)

Updates `setuptools` from 80.9.0 to 82.0.1
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](pypa/setuptools@v80.9.0...v82.0.1)

Updates `setuptools-scm` from 9.2.0 to 9.2.2
- [Release notes](https://github.com/pypa/setuptools-scm/releases)
- [Changelog](https://github.com/pypa/setuptools-scm/blob/v9.2.2/CHANGELOG.md)
- [Commits](pypa/setuptools-scm@v9.2.0...v9.2.2)

Updates `pandas` to 3.0.2
- [Release notes](https://github.com/pandas-dev/pandas/releases)
- [Commits](pandas-dev/pandas@v2.0.0...v3.0.2)

Updates `datasets` from 4.3.0 to 4.8.4
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@3.4.1...4.5.0)

Updates `huggingface-hub` from 0.36.2 to 1.9.0
- [Release notes](https://github.com/huggingface/huggingface_hub/releases)
- [Commits](huggingface/huggingface_hub@v0.36.2...v1.9.0)

Updates `transformers` from 4.57.6 to 5.5.0
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.57.6...v5.5.0)

Updates `trl` from 0.23.1 to 1.0.0
- [Release notes](https://github.com/huggingface/trl/releases)
- [Changelog](https://github.com/huggingface/trl/blob/main/RELEASE.md)
- [Commits](huggingface/trl@v0.23.1...v1.0.0)

Updates `data-designer-engine` from 0.5.4 to 0.5.5

Updates `pandas` from 2.3.3 to 3.0.2
- [Release notes](https://github.com/pandas-dev/pandas/releases)
- [Commits](pandas-dev/pandas@v2.0.0...v3.0.2)

Updates `pytest` to 9.0.2
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@1.0.0b3...9.0.2)

Updates `pytest-rerunfailures` from 15.1 to 16.1
- [Changelog](https://github.com/pytest-dev/pytest-rerunfailures/blob/master/CHANGES.rst)
- [Commits](pytest-dev/pytest-rerunfailures@15.1...16.1)

Updates `scikit-learn` from 1.7.1 to 1.8.0
- [Release notes](https://github.com/scikit-learn/scikit-learn/releases)
- [Commits](scikit-learn/scikit-learn@1.7.1...1.8.0)

Updates `torchao` from 0.14.0 to 0.17.0
- [Release notes](https://github.com/pytorch/ao/releases)
- [Commits](https://github.com/pytorch/ao/commits/v0.17.0)

Updates `chardet` to 7.4.0.post2
- [Release notes](https://github.com/chardet/chardet/releases)
- [Changelog](https://github.com/chardet/chardet/blob/main/docs/changelog.rst)
- [Commits](chardet/chardet@3.0.2...7.4.0.post2)

Updates `faker` to 40.12.0
- [Release notes](https://github.com/joke2k/faker/releases)
- [Changelog](https://github.com/joke2k/faker/blob/master/CHANGELOG.md)
- [Commits](joke2k/faker@v20.1.0...v40.12.0)

Updates `fsspec` to 2026.3.0
- [Commits](fsspec/filesystem_spec@2025.3.0...2026.3.0)

Updates `python-json-logger` to 4.1.0
- [Release notes](https://github.com/nhairs/python-json-logger/releases)
- [Changelog](https://github.com/nhairs/python-json-logger/blob/main/docs/changelog.md)
- [Commits](nhairs/python-json-logger@v3.0.0...v4.1.0)

Updates `sqlfluff` to 4.1.0
- [Release notes](https://github.com/sqlfluff/sqlfluff/releases)
- [Changelog](https://github.com/sqlfluff/sqlfluff/blob/main/CHANGELOG.md)
- [Commits](sqlfluff/sqlfluff@3.2.0...4.1.0)

Updates `data-designer` from 0.5.4 to 0.5.5
- [Release notes](https://github.com/NVIDIA-NeMo/DataDesigner/releases)
- [Commits](NVIDIA-NeMo/DataDesigner@v0.5.4...v0.5.5)

Updates `data-designer-config` from 0.5.4 to 0.5.5

---
updated-dependencies:
- dependency-name: datasets
  dependency-version: 4.5.0
  dependency-type: direct:development
  dependency-group: pip
- dependency-name: setuptools
  dependency-version: 82.0.1
  dependency-type: direct:development
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: setuptools-scm
  dependency-version: 9.2.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: pip
- dependency-name: pandas
  dependency-version: 3.0.2
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: datasets
  dependency-version: 4.8.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: pip
- dependency-name: huggingface-hub
  dependency-version: 1.9.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: transformers
  dependency-version: 5.5.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: trl
  dependency-version: 1.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: data-designer-engine
  dependency-version: 0.5.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: pip
- dependency-name: pandas
  dependency-version: 3.0.2
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: pytest
  dependency-version: 9.0.2
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: pytest-rerunfailures
  dependency-version: '16.1'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: pip
- dependency-name: scikit-learn
  dependency-version: 1.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: pip
- dependency-name: torchao
  dependency-version: 0.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: pip
- dependency-name: chardet
  dependency-version: 7.4.0.post2
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: faker
  dependency-version: 40.12.0
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: fsspec
  dependency-version: 2026.3.0
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: python-json-logger
  dependency-version: 4.1.0
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: sqlfluff
  dependency-version: 4.1.0
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: data-designer
  dependency-version: 0.5.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: pip
- dependency-name: data-designer-config
  dependency-version: 0.5.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Author: Superseded by #21.
danielhanchen added a commit that referenced this pull request May 25, 2026

… MPS + base namespace for PR unslothai#5754

Round 12 reviewer findings.

Backend correctness (P1)
* core/inference/diffusion.py load_model: GGUF branch now handles an absolute local directory passed as repo_id by joining Path(repo_id) / gguf_filename directly instead of handing the path to hf_hub_download (which raises HFValidationError because the path is not 'namespace/repo'). Closes round 12 review #1 -- the load request advertised 'local path' support but actually only worked for Hub repo ids.

Delete guard precision (P1)
* routes/models.py /delete-finetuned + /delete-cached: diffusion guard now consults gguf_filename from status() and ALLOWS per-variant deletes that target a different quant than the one the loaded pipeline is reading. Loading 'Q4_K_S' no longer blocks deleting 'Q8_0' from the same repo / export directory (round 12 reviews #3 and #4).

Accelerator (P2)
* core/inference/diffusion.py _drain_cuda_cache: also calls torch.mps.empty_cache() when the MPS backend is the active accelerator. Apple Silicon swaps now actually return held VRAM instead of leaving it pinned in the Metal allocator (round 12 review #10).

Smart base repo (P2)
* core/inference/diffusion.py _smart_base_repo: only inspects the LAST segment of the repo id / path for the 'base' / '9b' tokens. A namespace like baseorg/FLUX.2-klein-4B-GGUF or a parent directory like /home/me/.cache/base/... no longer falsely selects the Base variant (round 12 review #9).
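The "Smart base repo" fix above amounts to looking only at the final path segment. A minimal sketch of that idea follows; the function name `leaf_has_base_token` and the tokenising rule (split on non-alphanumeric characters) are assumptions made for illustration, not the code of `_smart_base_repo`.

```python
import re

# Tokens named in the commit message; how the real helper matches them is an assumption.
BASE_TOKENS = {"base", "9b"}


def leaf_has_base_token(repo_id: str) -> bool:
    """Return True only when the LAST path segment carries a base-variant token."""
    # Normalise Windows separators and drop trailing slashes before taking the leaf.
    leaf = repo_id.replace("\\", "/").rstrip("/").rsplit("/", 1)[-1].lower()
    # Split on anything that is not a letter or digit,
    # e.g. "flux.2-klein-4b-gguf" -> flux, 2, klein, 4b, gguf.
    parts = set(re.split(r"[^a-z0-9]+", leaf))
    return bool(parts & BASE_TOKENS)
```

With this shape, a namespace such as `baseorg/...` or a parent directory named `base` never reaches the token check, because both are discarded before the leaf is split.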
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top.

P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM.

P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair.

P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated.

P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries.

P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership.

P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts.

P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``.

All 85 diffusion-relevant backend tests pass locally.
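The post-unload check described in P1 #1 can be sketched roughly as below. `FakeLlamaBackend` and `release_or_raise` are hypothetical names invented for this illustration; only the three state fields and the `unload_model` return value come from the commit message, and the sketch raises a plain `RuntimeError` rather than whatever the route layer actually returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeLlamaBackend:
    """Hypothetical stand-in for the llama backend; only the fields named in the commit."""
    is_loaded: bool = True
    is_active: bool = True
    loading_model_identifier: Optional[str] = None
    unload_result: bool = True   # what unload_model() returns
    clears_state: bool = True    # whether unload actually clears the flags

    def unload_model(self) -> bool:
        if self.clears_state:
            self.is_loaded = False
            self.is_active = False
            self.loading_model_identifier = None
        return self.unload_result


def release_or_raise(backend) -> None:
    """Unload, then prove the backend let go instead of trusting a quiet return."""
    if backend.unload_model() is False:
        raise RuntimeError("unload_model returned False")
    # Any flag that is still truthy means the backend may still own memory.
    leftovers = [
        name
        for name in ("is_loaded", "is_active", "loading_model_identifier")
        if getattr(backend, name)
    ]
    if leftovers:
        raise RuntimeError(f"still held after unload: {', '.join(leftovers)}")
```

The point of the second check is that an unload which neither raises nor returns False is still rejected when the observable state says the model, or a pending download, is still there.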
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1: ``_release_safetensors_chat_for`` now re-reads ``active_model_name`` and ``loading_models`` after each unload AND runs a final sweep against the initial owned-name set. The previous helper trusted ``unload_model() -> True`` even though the orchestrator can respond ``unloaded`` while still holding weights or a concurrent ``load`` can repopulate the tracker between calls. Per-name and global post-state mismatches now raise HTTP 503 so the caller retries.

P1 #2: same post-state guarantee inside ``_release_chat_backend_for_diffusion`` for direct backend callers. ``DiffusionBackend.load_model`` now raises RuntimeError when the safetensors tracker still owns a previously-resident name after the unload, matching the route-level helper. The route layer's existing classifier maps the new wording to HTTP 503.

P1 #3: ``DiffusionBackend.load_model`` now preflights the full diffusers repo (or explicit GGUF ``base_repo``) via ``hf_hub_download(filename="model_index.json")`` BEFORE the chat / export unload runs. The GGUF path was already covered by the existing ``hf_hub_download(gguf_filename)`` round-trip; the full-repo path used to skip validation and let a typo / private / gated repo only surface inside ``from_pretrained`` AFTER the user's chat model was already dropped. Local paths are checked structurally (must be a directory containing ``model_index.json``) so we do not network-round-trip for an on-disk miss. Error messages route through ``_display_repo_id`` so an absolute filesystem path does not leak the operator's layout.

P1 #6: ``/api/inference/unload`` (the direct chat unload endpoint) now treats ``unload_model() -> False`` AND a leftover state (``is_loaded`` / ``is_active`` / ``loading_model_identifier`` for GGUF, ``active_model_name`` / ``loading_models`` for safetensors) as 503 instead of unconditionally responding ``status="unloaded"``. The UI used to show the model as gone while the backend still owned VRAM.

P2 #7: extended the /images/load RuntimeError -> HTTPException marker list with ``still active or loading after unload`` and ``still loading after unload``. Round 18 introduced these exact phrasings on the backend side; without the extension a retryable unload failure was returning HTTP 400 to the user instead of 503.

P2 #8: removed the unused ``unsloth_backend = get_inference_backend()`` eager construction in the GGUF chat-load branch. Eager construction made the GGUF-only path needlessly fail or pay startup cost when the safetensors backend was unavailable / lazy; ``_release_safetensors_chat_for`` already handles that case as a no-op.

All 85 diffusion-relevant + 98 related backend tests pass locally.
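The marker list in P2 #7 is, in effect, a substring classifier over error messages. A minimal sketch, assuming a hypothetical function `status_for_load_error` and listing only the two phrasings quoted in the commit (the real list presumably holds more entries):

```python
# The two phrasings quoted in the commit message.
RETRYABLE_MARKERS = (
    "still active or loading after unload",
    "still loading after unload",
)


def status_for_load_error(message: str) -> int:
    """Map a backend RuntimeError message to an HTTP status code."""
    if any(marker in message for marker in RETRYABLE_MARKERS):
        return 503  # transient: the caller may retry once the unload settles
    return 400      # anything else is treated as a bad request
```

Matching on message text is brittle, which is why the commit had to extend the list when the backend wording changed; a dedicated exception type would avoid that coupling, though the commit does not say one is planned.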
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1: ``_preflight_full_diffusers_repo(effective_base, hf_token)`` now runs for every load mode, including the GGUF-with-auto-base path. Round 19 only preflighted the full repo or an explicit ``base_repo``, so an auto-picked companion that turned out to be gated / private / missing still unloaded the user's chat model before ``from_pretrained`` failed. ``effective_base`` is the same value that feeds every downstream allocation, so preflighting it unconditionally catches all three modes.

P1 #2: ``diffusers.GGUFQuantizationConfig`` (which imports the ``gguf`` package at construction time) is now built up front, inside the same try block that surfaces "Re-run Studio setup". Previously the missing-dependency exception fired AFTER ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` had already taken the chat / export models down. The downstream from_single_file call reuses the same ``quant_config`` reference.

P1 #4: ``studio/backend/requirements/studio.txt`` now lists ``diffusers>=0.37.0`` and ``gguf>=0.10.0``. These were only in the extras files, so fresh standard Studio installs failed on /images/load with the round 20 P1 #2 dependency error message.

P1 #5: ``LoadRequest``, ``UnloadRequest``, and ``ValidateModelRequest`` now apply the same control-character + embedded-HF-token validators that ``DiffusionLoadRequest`` already had. /api/inference/load, /api/inference/validate, and /api/inference/unload used to accept newline / tab / control characters in ``model_path`` (log-line smuggling) and URL-form ``https://hf_xxxxx@huggingface.co/...`` (credential leak through structured log sinks).

P2 #6: ``_collapse_local`` in the diffusion load-error scrubber now resolves relative candidates and adds the absolute form to the substring set. A relative ``exports/my-flux`` used to leak ``/mnt/disks/.../exports/my-flux/...`` via downstream library errors because the scrubber only matched the original literal. Replacement is longest-first so a leaf-only context survives.

All 85 diffusion-relevant + 35 related model-validation tests pass locally.

(P1 #3 cross-workload GPU handoff lock is deferred: deserves a focused design pass across /images/load, /chat/load (both branches), /training/start, and /export/load to pick a lock boundary that does not deadlock against the backend load locks or stall the SSE log stream.)
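The longest-first replacement in P2 #6 can be illustrated with a small sketch. `collapse_local_paths` is a hypothetical name, and resolving with `os.path.abspath` is an assumption; the commit only says relative candidates are resolved and their absolute form is added to the substring set.

```python
import os


def collapse_local_paths(message: str, candidates: list[str]) -> str:
    """Replace every known local path in `message` with its final path component."""
    replacements: dict[str, str] = {}
    for cand in candidates:
        leaf = os.path.basename(cand.rstrip("/\\")) or cand
        replacements[cand] = leaf
        # Also register the resolved absolute form, which library errors tend to print.
        replacements[os.path.abspath(cand)] = leaf
    # Longest-first: the absolute path is rewritten before the shorter relative
    # literal could match inside it and leave the parent directories behind.
    for needle in sorted(replacements, key=len, reverse=True):
        message = message.replace(needle, replacements[needle])
    return message
```

If the shorter relative literal were replaced first, it would match inside the absolute path and leave the leading directories in the message, which is exactly the leak the commit describes.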
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1 + #2: ``LoadRequest._no_embedded_hf_tokens`` and ``ValidateModelRequest._no_embedded_hf_tokens`` now cover ``gguf_variant`` in addition to ``model_path``. A caller could pass a variant like ``Q4_K_M-hf_xxxxxxxx`` that flowed into structured log sinks via the GGUF resolver path; the matching ``DiffusionLoadRequest`` validator already covered every string field, so this restores parity.

P1 #3: ``/api/inference/unload`` now also matches the llama ``loading_model_identifier`` when picking the GGUF branch. A pending GGUF download (``is_active`` still False, ``loading_model_identifier`` populated) used to fall through to the safetensors branch and respond ``status="unloaded"`` while llama-server kept downloading.

P1 #4 + #5: the final safetensors-handoff sweeps (route-level ``_release_safetensors_chat_for`` and backend ``_release_chat_backend_for_diffusion``) now check ``active_model_name`` and ``loading_models`` WITHOUT the initial ``owned_names`` filter. A concurrent ``/load`` that landed AFTER the snapshot was previously ignored, so a chat model that began loading during the unload window let training / export / GGUF chat / diffusion start anyway and race the new chat for VRAM.

P2 #6: added ``_preflight_diffusers_subfolder_config`` and invoked it for GGUF loads with a transformer class (``effective_base``, ``"transformer"``). A custom base companion that had ``model_index.json`` but lacked ``transformer/config.json`` previously passed the round 19 preflight, unloaded chat, then failed inside ``from_single_file``.

P2 #7: ``_scrub_validation_obj`` in main.py also scrubs string dict KEYS. Pydantic ``string_type`` errors surface ``input`` verbatim, and a malformed payload like ``{"repo_id": {"hf_xxxxx": "owner/repo"}}`` would otherwise leak the token through the 422 response body.

All 85 diffusion-relevant + 35 model-validation tests pass locally. Existing fakes for ``hf_hub_download`` updated to accept the new ``subfolder=`` kwarg the round 21 preflight uses.

(P1 #3 cross-workload GPU handoff lock from round 20 is still deferred; round 21's P1 #4 / #5 raised the sweep-level guarantee, which closes the most common race without the deadlock risk of holding a process-wide lock across the entire load.)
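The key-scrubbing change in P2 #7 can be sketched as a recursive walk that treats dict keys like any other string. The function name `scrub`, the `[redacted]` placeholder, and the token pattern (`hf_` followed by letters or digits) are assumptions for this illustration; the project's actual redaction pattern is not shown in the commit.

```python
import re
from typing import Any

# Assumed token shape: "hf_" followed by letters/digits.
HF_TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]+")


def scrub(obj: Any) -> Any:
    """Recursively redact HF-token-like substrings in strings, containers, and dict KEYS."""
    if isinstance(obj, str):
        return HF_TOKEN_RE.sub("[redacted]", obj)
    if isinstance(obj, dict):
        # Keys are scrubbed too; non-string keys pass through unchanged.
        return {scrub(key): scrub(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [scrub(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(scrub(item) for item in obj)
    return obj
```

Applied to the malformed payload quoted above, the nested key is redacted while the `"owner/repo"` value is left intact.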
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1: ``TrainingStartRequest.model_name`` now runs the same control-character and embedded-HF-token validators that the chat and diffusion request models gained in rounds 5 / 15 / 20 / 21. ``/api/training/start`` previously accepted newline / tab / control characters and URL-form ``hf_xxxxx`` tokens that flowed into structured-log sinks via "Loading model %s" lines.

P1 #2: ``_run_with_helper`` in ``utils/datasets/llm_assist.py`` now skips the helper GGUF when the diffusion image backend reports loaded / loading. The public chat / training / export routes already do this through ``_release_diffusion_for``, but this dataset-side helper loaded llama-server directly with no diffusion guard, so an Images-page allocation would race the helper for VRAM. New ``_diffusion_image_model_busy`` helper fails closed (treats status() failure as busy) so the resident image model is preserved instead of being overwritten.

P1 #3: same ``_diffusion_image_model_busy`` guard added to ``_run_multi_pass_advisor`` (the dataset conversion advisor), which has the same direct llama.cpp load shape.

P2 #4: the early "Could not infer a diffusion family" RuntimeError now routes ``repo_id`` through ``_display_repo_id`` before formatting. A local absolute path that did not match any known family used to leak the operator's filesystem layout via the 400 response body, last_error, and log line.

All 97 diffusion + training-validation + related tests pass locally.
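"Fails closed" in P1 #2 means an unreadable status is treated the same as a busy GPU. A minimal sketch, assuming a hypothetical `image_model_busy` function and assumed status keys `loaded` / `loading` (the commit names the states but not the exact keys):

```python
from typing import Callable, Mapping


def image_model_busy(get_status: Callable[[], Mapping[str, object]]) -> bool:
    """Report whether an image model is resident, treating an unreadable status as busy."""
    try:
        status = get_status()
    except Exception:
        # Fail closed: if we cannot prove the GPU is free, assume it is not.
        return True
    # "loaded" / "loading" are assumed key names for this sketch.
    return bool(status.get("loaded") or status.get("loading"))
```

The alternative, failing open, would let the helper load on top of an image model whenever the status call happened to error, which is the race the commit is closing.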
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1 + #2 + #6: extended the chat / diffusion / training identifier hardening to every export-side request model. ExportCommonOptions (parent of ExportMergedModelRequest / ExportBaseModelRequest / ExportLoRAAdapterRequest) now applies _no_control_chars and _reject_embedded_hf_token to repo_id and base_model_id; ExportGGUFRequest gets the same on its repo_id plus a control-char check on quantization_method; and LoadCheckpointRequest validates checkpoint_path. Previously "/api/export/*" accepted newline-smuggled identifiers and URL-form ``hf_xxxxx`` tokens that flowed into log lines.

P1 #3 + #4: ``_run_with_helper`` and ``_run_multi_pass_advisor`` now use a shared ``_gpu_workload_busy_for_helper`` that gates on diffusion (round 22 already), training, AND export. The round 22 guard only checked diffusion, so the dataset helper / advisor could still load llama-server on top of an active training run or a resident export checkpoint. Each step fails closed (unverifiable status counts as busy) so the user's primary workload is preserved.

P1 #5: PublishDatasetRequest in models/data_recipe.py also applies the identifier hardening to repo_id; the publish path previously accepted control characters and URL-form tokens.

P1 #7-10: added _validate_logged_identifier helper to routes/models.py and applied it to the path / query parameter endpoints that flow into logger.info(...) calls -- ``/config/{model_name}``, ``/check-vision/{model_name}``, ``/check-embedding/{model_name}``, ``/gguf-variants``. Mapped the validator's ValueError to HTTP 422 so the client sees the same shape as a Pydantic validation failure.

P2 #11 + #12: ``Loading diffusion model %s`` and ``Diffusion load failed for %s`` log lines route ``repo_id`` / ``effective_base`` through ``_display_repo_id`` (collapses absolute local paths to the leaf, still scrubs HF tokens) instead of plain ``_redact_hf_tokens``. The error path was already collapsed in the user-facing 400 / RuntimeError, but the structured-log lines kept the full path.

All 97 diffusion + training-validation + related tests pass locally.
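The identifier hardening described in P1 #7-10 combines two checks: no control characters and no embedded token. A rough sketch follows; `validate_logged_identifier` here is an illustrative stand-in, not the project's `_validate_logged_identifier`, and the token pattern is an assumption.

```python
import re

# Assumed token shape for the sketch; the project's real pattern is not shown in the commit.
_HF_TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]+")


def validate_logged_identifier(value: str) -> str:
    """Return `value` unchanged, or raise ValueError if it is unsafe to log."""
    # Reject C0 control characters (including newline and tab) and DEL.
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        raise ValueError("identifier contains control characters")
    if _HF_TOKEN_RE.search(value):
        raise ValueError("identifier contains an embedded HF token")
    return value
```

Raising `ValueError` keeps the helper usable both inside Pydantic validators and in route handlers, where the commit says the error is mapped to HTTP 422.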
danielhanchen added a commit that referenced this pull request May 25, 2026

P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable.

P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py).

P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle.

P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way.

P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure.

P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model.

All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally.

(P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
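The re-ordering in P1 #2 / #3 / #4 / #5 relies on a simple property: when releases run strictly in order and the first failure aborts, whatever is listed last survives an earlier failure. A minimal sketch with a hypothetical `run_releases` helper; the real routes call their release helpers directly rather than through a list.

```python
from typing import Callable, Sequence, Tuple


def run_releases(steps: Sequence[Tuple[str, Callable[[], None]]]) -> list[str]:
    """Run release steps in order; the first failure aborts and later steps never run."""
    done: list[str] = []
    for name, release in steps:
        release()          # an exception here propagates to the caller
        done.append(name)
    return done
```

Listing the chat release last means a wedged diffusion unload raises before chat is touched, so the user keeps the chat model instead of losing both workloads.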
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top. P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM. P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair. P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated. P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries. P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership. 
P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts. P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``. All 85 diffusion-relevant backend tests pass locally.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``_release_safetensors_chat_for`` now re-reads ``active_model_name`` and ``loading_models`` after each unload AND runs a final sweep against the initial owned-name set. The previous helper trusted ``unload_model() -> True`` even though the orchestrator can respond ``unloaded`` while still holding weights or a concurrent ``load`` can repopulate the tracker between calls. Per-name and global post-state mismatches now raise HTTP 503 so the caller retries. P1 #2: same post-state guarantee inside ``_release_chat_backend_for_diffusion`` for direct backend callers. ``DiffusionBackend.load_model`` now raises RuntimeError when the safetensors tracker still owns a previously-resident name after the unload, matching the route-level helper. The route layer's existing classifier maps the new wording to HTTP 503. P1 #3: ``DiffusionBackend.load_model`` now preflights the full diffusers repo (or explicit GGUF ``base_repo``) via ``hf_hub_download(filename="model_index.json")`` BEFORE the chat / export unload runs. The GGUF path was already covered by the existing ``hf_hub_download(gguf_filename)`` round-trip; the full-repo path used to skip validation and let a typo / private / gated repo only surface inside ``from_pretrained`` AFTER the user's chat model was already dropped. Local paths are checked structurally (must be a directory containing ``model_index.json``) so we do not network-round-trip for an on-disk miss. Error messages route through ``_display_repo_id`` so an absolute filesystem path does not leak the operator's layout. P1 #6: ``/api/inference/unload`` (the direct chat unload endpoint) now treats ``unload_model() -> False`` AND a leftover state (``is_loaded`` / ``is_active`` / ``loading_model_identifier`` for GGUF, ``active_model_name`` / ``loading_models`` for safetensors) as 503 instead of unconditionally responding ``status="unloaded"``. The UI used to show the model as gone while the backend still owned VRAM. 
P2 #7: extended the /images/load RuntimeError -> HTTPException marker list with ``still active or loading after unload`` and ``still loading after unload``. Round 18 introduced these exact phrasings on the backend side; without the extension a retryable unload failure was returning HTTP 400 to the user instead of 503. P2 #8: removed the unused ``unsloth_backend = get_inference_backend()`` eager construction in the GGUF chat-load branch. Eager construction made the GGUF-only path needlessly fail or pay startup cost when the safetensors backend was unavailable / lazy; ``_release_safetensors_chat_for`` already handles that case as a no-op. All 85 diffusion-relevant + 98 related backend tests pass locally.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``_preflight_full_diffusers_repo(effective_base, hf_token)`` now runs for every load mode, including the GGUF-with-auto-base path. Round 19 only preflighted the full repo or an explicit ``base_repo``, so an auto-picked companion that turned out to be gated / private / missing still unloaded the user's chat model before ``from_pretrained`` failed. ``effective_base`` is the same value that feeds every downstream allocation, so preflighting it unconditionally catches all three modes.

P1 #2: ``diffusers.GGUFQuantizationConfig`` (which imports the ``gguf`` package at construction time) is now built up front, inside the same try block that surfaces "Re-run Studio setup". Previously the missing-dependency exception fired AFTER ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` had already taken the chat / export models down. The downstream from_single_file call reuses the same ``quant_config`` reference.

P1 #4: ``studio/backend/requirements/studio.txt`` now lists ``diffusers>=0.37.0`` and ``gguf>=0.10.0``. These were only in the extras files, so fresh standard Studio installs failed on /images/load with the round 20 P1 #2 dependency error message.

P1 #5: ``LoadRequest``, ``UnloadRequest``, and ``ValidateModelRequest`` now apply the same control-character + embedded-HF-token validators that ``DiffusionLoadRequest`` already had. /api/inference/load, /api/inference/validate, and /api/inference/unload used to accept newline / tab / control characters in ``model_path`` (log-line smuggling) and URL-form ``https://hf_xxxxx@huggingface.co/...`` (credential leak through structured log sinks).

P2 #6: ``_collapse_local`` in the diffusion load-error scrubber now resolves relative candidates and adds the absolute form to the substring set. A relative ``exports/my-flux`` used to leak ``/mnt/disks/.../exports/my-flux/...`` via downstream library errors because the scrubber only matched the original literal. Replacement is longest-first so a leaf-only context survives.

All 85 diffusion-relevant + 35 related model-validation tests pass locally.

(P1 #3 cross-workload GPU handoff lock is deferred: deserves a focused design pass across /images/load, /chat/load (both branches), /training/start, and /export/load to pick a lock boundary that does not deadlock against the backend load locks or stall the SSE log stream.)
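The identifier hardening described in P1 #5 can be pictured with a minimal sketch. The two regular expressions and the function names below are assumptions made for illustration; the PR's actual validators and the exact token pattern they match are not shown in this thread.

```python
import re

# Hypothetical patterns; the real validators in the PR may differ.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")   # ASCII control characters incl. DEL
_HF_TOKEN = re.compile(r"hf_[A-Za-z0-9]{8,}")      # assumed shape of an HF access token


def no_control_chars(value: str) -> str:
    """Reject newline, tab, and other ASCII control characters."""
    if _CONTROL_CHARS.search(value):
        raise ValueError("identifier contains control characters")
    return value


def reject_embedded_hf_token(value: str) -> str:
    """Reject values that embed something shaped like an HF access token."""
    if _HF_TOKEN.search(value):
        raise ValueError("identifier appears to embed a Hugging Face token")
    return value


def validate_identifier(value: str) -> str:
    """Apply both checks; return the value unchanged when it is clean."""
    return reject_embedded_hf_token(no_control_chars(value))
```

In a Pydantic request model these checks would typically be attached as field validators, so a bad ``model_path`` is rejected with a 422 before it reaches any log line.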
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1 + #2: ``LoadRequest._no_embedded_hf_tokens`` and ``ValidateModelRequest._no_embedded_hf_tokens`` now cover ``gguf_variant`` in addition to ``model_path``. A caller could pass a variant like ``Q4_K_M-hf_xxxxxxxx`` that flowed into structured log sinks via the GGUF resolver path; the matching ``DiffusionLoadRequest`` validator already covered every string field, so this restores parity.

P1 #3: ``/api/inference/unload`` now also matches the llama ``loading_model_identifier`` when picking the GGUF branch. A pending GGUF download (``is_active`` still False, ``loading_model_identifier`` populated) used to fall through to the safetensors branch and respond ``status="unloaded"`` while llama-server kept downloading.

P1 #4 + #5: the final safetensors-handoff sweeps (route-level ``_release_safetensors_chat_for`` and backend ``_release_chat_backend_for_diffusion``) now check ``active_model_name`` and ``loading_models`` WITHOUT the initial ``owned_names`` filter. A concurrent ``/load`` that landed AFTER the snapshot was previously ignored, so a chat model that began loading during the unload window let training / export / GGUF chat / diffusion start anyway and race the new chat for VRAM.

P2 #6: added ``_preflight_diffusers_subfolder_config`` and invoked it for GGUF loads with a transformer class (``effective_base``, ``"transformer"``). A custom base companion that had ``model_index.json`` but lacked ``transformer/config.json`` previously passed the round 19 preflight, unloaded chat, then failed inside ``from_single_file``.

P2 #7: ``_scrub_validation_obj`` in main.py also scrubs string dict KEYS. Pydantic ``string_type`` errors surface ``input`` verbatim, and a malformed payload like ``{"repo_id": {"hf_xxxxx": "owner/repo"}}`` would otherwise leak the token through the 422 response body.

All 85 diffusion-relevant + 35 model-validation tests pass locally. Existing fakes for ``hf_hub_download`` updated to accept the new ``subfolder=`` kwarg the round 21 preflight uses.

(P1 #3 cross-workload GPU handoff lock from round 20 is still deferred; round 21's P1 #4 / #5 raised the sweep-level guarantee, which closes the most common race without the deadlock risk of holding a process-wide lock across the entire load.)
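The key-scrubbing change in P2 #7 can be illustrated with a short recursive walker. This is a sketch under assumptions: the function names, the redaction marker, and the token regex are hypothetical and are not the PR's ``_scrub_validation_obj``.

```python
import re
from typing import Any

# Assumed token shape and redaction marker, for illustration only.
_HF_TOKEN = re.compile(r"hf_[A-Za-z0-9]+")


def redact(text: str) -> str:
    """Replace anything shaped like an HF token with a fixed marker."""
    return _HF_TOKEN.sub("hf_***", text)


def scrub(obj: Any) -> Any:
    """Recursively redact tokens in strings, including string dict keys."""
    if isinstance(obj, str):
        return redact(obj)
    if isinstance(obj, dict):
        # Keys are scrubbed too: a token used as a key would otherwise survive.
        return {
            (redact(key) if isinstance(key, str) else key): scrub(value)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return type(obj)(scrub(item) for item in obj)
    return obj
```

Applied to the malformed payload quoted above, the nested key is redacted while the legitimate ``repo_id`` key and the ``owner/repo`` value pass through unchanged.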
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``TrainingStartRequest.model_name`` now runs the same control-character and embedded-HF-token validators that the chat and diffusion request models gained in rounds 5 / 15 / 20 / 21. ``/api/training/start`` previously accepted newline / tab / control characters and URL-form ``hf_xxxxx`` tokens that flowed into structured-log sinks via "Loading model %s" lines.

P1 #2: ``_run_with_helper`` in ``utils/datasets/llm_assist.py`` now skips the helper GGUF when the diffusion image backend reports loaded / loading. The public chat / training / export routes already do this through ``_release_diffusion_for``, but this dataset-side helper loaded llama-server directly with no diffusion guard, so an Images-page allocation would race the helper for VRAM. New ``_diffusion_image_model_busy`` helper fails closed (treats status() failure as busy) so the resident image model is preserved instead of being overwritten.

P1 #3: same ``_diffusion_image_model_busy`` guard added to ``_run_multi_pass_advisor`` (the dataset conversion advisor), which has the same direct llama.cpp load shape.

P2 #4: the early "Could not infer a diffusion family" RuntimeError now routes ``repo_id`` through ``_display_repo_id`` before formatting. A local absolute path that did not match any known family used to leak the operator's filesystem layout via the 400 response body, last_error, and log line.

All 97 diffusion + training-validation + related tests pass locally.
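The fail-closed behaviour described for ``_diffusion_image_model_busy`` can be sketched as follows. The ``get_status`` callable and the ``is_loaded`` / ``is_loading`` keys are assumptions standing in for whatever the diffusion backend's status() actually returns.

```python
from typing import Callable, Mapping


def diffusion_busy(get_status: Callable[[], Mapping[str, bool]]) -> bool:
    """Return True when the image backend is loaded, loading, or unverifiable."""
    try:
        status = get_status()
    except Exception:
        # Fail closed: if the status cannot be read, assume the GPU is in use
        # so the helper does not overwrite a resident image model.
        return True
    return bool(status.get("is_loaded") or status.get("is_loading"))
```

The design choice is that a false "busy" only costs the helper a skipped run, while a false "idle" could evict the user's model, so the error path leans toward busy.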
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1 + #2 + #6: extended the chat / diffusion / training identifier hardening to every export-side request model. ExportCommonOptions (parent of ExportMergedModelRequest / ExportBaseModelRequest / ExportLoRAAdapterRequest) now applies _no_control_chars and _reject_embedded_hf_token to repo_id and base_model_id; ExportGGUFRequest gets the same on its repo_id plus a control-char check on quantization_method; and LoadCheckpointRequest validates checkpoint_path. Previously "/api/export/*" accepted newline-smuggled identifiers and URL-form ``hf_xxxxx`` tokens that flowed into log lines.

P1 #3 + #4: ``_run_with_helper`` and ``_run_multi_pass_advisor`` now use a shared ``_gpu_workload_busy_for_helper`` that gates on diffusion (round 22 already), training, AND export. The round 22 guard only checked diffusion, so the dataset helper / advisor could still load llama-server on top of an active training run or a resident export checkpoint. Each step fails closed (unverifiable status counts as busy) so the user's primary workload is preserved.

P1 #5: PublishDatasetRequest in models/data_recipe.py also applies the identifier hardening to repo_id; the publish path previously accepted control characters and URL-form tokens.

P1 #7-10: added _validate_logged_identifier helper to routes/models.py and applied it to the path / query parameter endpoints that flow into logger.info(...) calls -- ``/config/{model_name}``, ``/check-vision/{model_name}``, ``/check-embedding/{model_name}``, ``/gguf-variants``. Mapped the validator's ValueError to HTTP 422 so the client sees the same shape as a Pydantic validation failure.

P2 #11 + #12: ``Loading diffusion model %s`` and ``Diffusion load failed for %s`` log lines route ``repo_id`` / ``effective_base`` through ``_display_repo_id`` (collapses absolute local paths to the leaf, still scrubs HF tokens) instead of plain ``_redact_hf_tokens``. The error path was already collapsed in the user-facing 400 / RuntimeError, but the structured-log lines kept the full path.

All 97 diffusion + training-validation + related tests pass locally.
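The behaviour attributed to ``_display_repo_id`` (collapse absolute local paths to the leaf, still scrub HF tokens) might look roughly like the sketch below. The function name, token regex, and redaction marker are assumptions; only the described behaviour comes from the commit message.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath

# Assumed token shape and marker, for illustration only.
_HF_TOKEN = re.compile(r"hf_[A-Za-z0-9]+")


def display_repo_id(repo_id: str) -> str:
    """Return a log-safe form: tokens redacted, absolute paths cut to the leaf."""
    redacted = _HF_TOKEN.sub("hf_***", repo_id)
    is_absolute = (
        PurePosixPath(redacted).is_absolute()
        or PureWindowsPath(redacted).is_absolute()
    )
    if is_absolute:
        # PureWindowsPath splits on both "/" and "\", so .name gives the leaf
        # for POSIX-style and Windows-style paths alike.
        leaf = PureWindowsPath(redacted).name
        return leaf or redacted
    return redacted
```

Hub ids such as ``owner/repo`` are relative under both path flavours, so they pass through with only token redaction applied.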
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable.

P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py).

P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle.

P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way.

P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure.

P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model.

All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally.

(P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
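The effect of the release re-ordering in P1 #2 / #3 / #4 / #5 can be shown with a toy simulation. Everything here is hypothetical scaffolding (a set standing in for resident workloads, a release that always fails); it is not the route code, only a sketch of why the order matters.

```python
from typing import Callable, List, Set, Tuple


def hand_off_gpu(loaded: Set[str], releases: List[Tuple[str, Callable[[], None]]]) -> None:
    """Run release steps in order; an exception aborts before later steps run."""
    for name, release in releases:
        release()
        loaded.discard(name)


def simulate(order: List[str]) -> Set[str]:
    """Return what is still resident after a handoff where diffusion is wedged."""
    loaded = {"diffusion", "chat"}

    def wedged() -> None:
        raise RuntimeError("diffusion unload wedged")

    def ok() -> None:
        return None

    steps = [(name, wedged if name == "diffusion" else ok) for name in order]
    try:
        hand_off_gpu(loaded, steps)
    except RuntimeError:
        pass  # the route would surface this as an error response
    return loaded
```

With chat released first, the wedged diffusion unload fires after chat is already gone; with diffusion first, the failure aborts the handoff while chat is still resident.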
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Twelve P1 findings from round 26 reviewer aggregate, plus the CI revert of round 25 P1 #5 to a less invasive location.

1. requirements/studio.txt + requirements/single-env/constraints.txt: revert the round 25 huggingface-hub bump (broke Studio Update CI, Mac Studio Update CI, Mac Studio UI CI, Studio UI CI all with ResolutionImpossible against transformers==4.57.6 which requires hub<1.0). Standard install path stays on the well-tested 4.57.6 + 0.36.2 + trl 0.23.1 trio.
2. requirements/no-torch-runtime.txt + pyproject.toml [huggingfacenotorch]: bump huggingface_hub floor from >=0.34.0 to >=1.3.0,<2.0 -- this is where the actual transformers 5.x + hub 0.36.2 broken combo can land because the file installs --no-deps. transformers 5.x calls hub.is_offline_mode which only exists in hub 1.x.
3. utils/datasets/llm_assist.py: revert round 25 P1 #4 (helper/advisor sharing the global llama backend) which introduced three regressions: a chat-evict load race after the busy precheck, a finally-block that could unload a user chat model, and an identifier mismatch the delete guard could not canonicalize. Go back to PRIVATE LlamaCppBackend instances and expose the active helper/advisor repos through a new thread-safe registry (helper_advisor_owns_repo / _register_helper_advisor_repo / _unregister_helper_advisor_repo) so DELETE /api/models/delete-cached can still block the rmtree.
4. routes/models.py delete_cached_model: check the new helper/advisor registry up front and 409 if a helper/advisor still owns the target repo. Closes round 26 P1 #13 and #14 (helper/advisor identifiers were prefixed and would never equal the raw repo id).
5. routes/models.py get_lora_base_model: validate lora_path with _validate_logged_identifier before it is reflected in 404 detail and error logs (round 26 P1 #12).
6. routes/inference.py /unload: round 21 P1 #3 added a "or not is_loaded" fallback that let an unload of owner/B cancel a pending llama load of owner/A. Replace it with a narrow llama_is_starting_without_identifier branch that only fires when llama-server is mid-startup with neither identifier set (round 26 P1 #5).
7. routes/inference.py /unload: poll loading_model_identifier for up to 5 s after asyncio.to_thread(unload_model) so a legitimate pending-load cancel does not 503 because the load thread has not yet observed _cancel_event in its finally (round 26 P2 #15).
8. models/training.py TrainingStartRequest: extend identifier hardening to hf_dataset, subset, train_split, eval_split. Round 22 only guarded model_name (round 26 P1 #10).
9. models/data_recipe.py SeedInspectRequest: add _no_control_chars + _reject_embedded_hf_token field_validators on dataset_name (round 26 P1 #11).

Tests: 105 targeted (diffusion + cached_gguf + llama_cpp_cache + inference_model_validation + models_get_model_config) and 1768 broader backend tests pass locally. Pre-existing test_desktop_auth.py, test_studio_api.py, and test_training_worker_flash_attn.py failures reproduce on HEAD without these changes.
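The thread-safe registry from items 3 and 4 can be approximated with a lock-guarded reference counter. The names below are shortened stand-ins for the helpers the commit mentions, and ``delete_cached_status`` is a hypothetical reduction of the route to its status code.

```python
import threading
from collections import Counter

_REGISTRY_LOCK = threading.Lock()
_OWNED_REPOS: Counter = Counter()  # repo id -> number of active helper/advisor users


def register_repo(repo_id: str) -> None:
    with _REGISTRY_LOCK:
        _OWNED_REPOS[repo_id] += 1


def unregister_repo(repo_id: str) -> None:
    with _REGISTRY_LOCK:
        if _OWNED_REPOS[repo_id] > 1:
            _OWNED_REPOS[repo_id] -= 1
        else:
            # Drop the key entirely so stale zero counts do not accumulate.
            _OWNED_REPOS.pop(repo_id, None)


def owns_repo(repo_id: str) -> bool:
    with _REGISTRY_LOCK:
        return _OWNED_REPOS.get(repo_id, 0) > 0


def delete_cached_status(repo_id: str) -> int:
    """Hypothetical route outcome: 409 while a helper/advisor owns the repo."""
    return 409 if owns_repo(repo_id) else 200
```

Keying the registry on the raw repo id, rather than on a prefixed backend identifier, is what lets the delete guard compare like with like.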
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Twelve actionable P1/P2 findings from round 28 reviewer aggregate. Skipped #3 (studio.txt huggingface-hub bump) because the empirical CI evidence in round 26 contradicts that suggestion: bumping the pin there breaks installs that apply constraints.txt (transformers==4.57.6 requires hub<1.0). The actual broken combo only happens via the --no-deps no-torch path which is already bumped in no-torch-runtime.txt and pyproject.toml huggingfacenotorch.

1. utils/datasets/llm_assist.py: split _HELPER_ADVISOR_REFCOUNT into CACHE vs GPU counters. helper_advisor_owns_repo (used by delete-cache) reads CACHE; helper_advisor_busy (used by public handoffs) reads GPU. precache_helper_gguf now registers with gpu_owner=False so a background pre-cache download does not 503 every chat / training / export / diffusion load.
2. utils/datasets/llm_assist.py: introduce _HELPER_ADVISOR_START_LOCK and wrap the busy precheck + register pair in _run_with_helper and _run_multi_pass_advisor. Two concurrent helper / advisor invocations could both pass _gpu_workload_busy_for_helper before either registered, then OOM each other.
3. utils/datasets/llm_assist.py: _gpu_workload_busy_for_helper now also returns True when another helper/advisor already holds the private LlamaCppBackend.
4. routes/inference.py: add _raise_if_helper_advisor_busy(workload) that 503s when AI Assist owns the GPU. Wire it into both chat load branches (GGUF + safetensors) BEFORE the existing _release_export_for / _release_diffusion_for calls so we do not first tear down an idle export / diffusion just to fail on the helper check.
5. routes/training.py + routes/export.py + diffusion.load_model: call the helper-busy check FIRST before any release helper fires. Mirrors the chat-load ordering.
6. routes/inference.py _release_llama_for: poll loading_model_identifier for up to 5 s after unload_model() so a cancelled pending GGUF download has time to clear its identifier. Mirrors the same wait round 26 added to the explicit /api/inference/unload route.
7. core/inference/diffusion.py _release_chat_backend_for_diffusion: same 5 s settling wait for cancelled pending GGUF downloads.
8. models/inference.py LoadRequest: validate every llama_extra_args entry through _no_control_chars + _reject_embedded_hf_token. The list was forwarded verbatim to a logged llama-server command line, so a smuggled control char or hf_... token would land in logs and subprocess args.
9. routes/models.py /gguf-download-progress: apply _validate_logged_identifier to repo_id and variant, matching the round 24 hardening on the adjacent generic /download-progress.
10. routes/inference.py diffusion-load RuntimeError classifier: treat "AI Assist ..." messages as retryable 503 instead of 400 (round 28 P2 #15). Mirrors the round 18/19 markers for chat unload failures.

Tests: 105 targeted + 1768 broader backend tests pass locally.
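Items 1 and 2 combine two ideas: separate CACHE and GPU ownership counters, and a lock that makes the busy precheck and the registration one atomic step. The sketch below is a simplified, hypothetical reduction (global integer counters instead of per-repo state, invented function names) meant only to show the shape of that check-then-register pattern.

```python
import threading
from typing import Callable

_START_LOCK = threading.Lock()
_gpu_owners = 0     # helpers/advisors currently holding VRAM
_cache_owners = 0   # helpers/advisors or pre-cache downloads using cached files


def try_start_helper(other_workload_busy: Callable[[], bool]) -> bool:
    """Atomically check for a free GPU and claim it; False means 'skip'."""
    global _gpu_owners, _cache_owners
    with _START_LOCK:
        # Precheck and register happen under one lock, so two concurrent
        # starters cannot both observe an idle GPU.
        if _gpu_owners > 0 or other_workload_busy():
            return False
        _gpu_owners += 1
        _cache_owners += 1
        return True


def finish_helper() -> None:
    global _gpu_owners, _cache_owners
    with _START_LOCK:
        _gpu_owners = max(0, _gpu_owners - 1)
        _cache_owners = max(0, _cache_owners - 1)


def register_precache() -> None:
    """A background download owns cache files but not the GPU."""
    global _cache_owners
    with _START_LOCK:
        _cache_owners += 1


def helper_busy() -> bool:
    with _START_LOCK:
        return _gpu_owners > 0


def helper_owns_cache() -> bool:
    with _START_LOCK:
        return _cache_owners > 0
```

Because public load routes would consult only the GPU counter, a pre-cache download keeps the delete guard armed without turning every load into a 503.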
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Four actionable findings from round 30. Skipped P1 #1 / #2 / #3 (huggingface-hub bump in studio.txt / single-env / colab-new) because the live B200 Studio that successfully generated FLUX.2 klein images runs the exact combo the reviewer flags as broken:

    huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1
    Flux2KleinPipeline: True (imports cleanly)

The is_offline_mode ImportError only fires with transformers 5.x, and the standard install path pins transformers==4.57.6 via constraints. The round 26 fix bumped no-torch-runtime.txt + pyproject huggingfacenotorch where the --no-deps install path can land on transformers 5.x; that remains the correct surface.

1. core/inference/diffusion.py: preflight transformers + accelerate via importlib.util.find_spec BEFORE any destructive GPU-owner unload. Diffusers can expose stub pipeline classes when transformers / accelerate are missing, so the load used to drop chat first and fail later inside from_pretrained. find_spec keeps existing tests that stub these modules passing because no real module is executed (round 30 P1 #11).
2. models/export.py ExportGGUFRequest.quantization_method: extend the embedded HF token validator to this field too. Round 23 added the control-char guard but not the token guard; the value is forwarded into worker command lines and reflected in error / success text (round 30 P1 #5).
3. models/data_recipe.py SeedInspectUploadRequest: add _no_control_chars + _reject_embedded_hf_token field_validators to filename and to each entry of file_names. Mirrors the sibling SeedInspectRequest.dataset_name hardening (round 30 P1 #6).
4. frontend/src/features/images/images-page.tsx: defer the initial refreshStatus() call via queueMicrotask so the synchronous setRefreshingStatus(true) inside it does not trip the react-hooks/set-state-in-effect lint on mount (round 30 P2 #12).

Deferred (need larger surgery / out of scope for this round):

* P1 #4 native_path_lease for diffusion local-path loads
* P1 #7-#10 helper/advisor + public-start window mutual lock symmetry

Tests: 98 targeted (diffusion + cached_gguf + inference_validation) pass locally; frontend npm run typecheck passes.
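The dependency preflight in item 1 relies on ``importlib.util.find_spec``, which locates a top-level module without executing it. A minimal sketch of that check follows; the function names and the error wording are assumptions, and only the use of ``find_spec`` for ``transformers`` and ``accelerate`` comes from the commit message.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-stubbed entries;
            # treat those as missing rather than crashing the preflight.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


def preflight(names: Iterable[str] = ("transformers", "accelerate")) -> None:
    """Raise before any destructive unload if a dependency is absent."""
    missing = missing_modules(names)
    if missing:
        raise RuntimeError("Missing dependencies: " + ", ".join(missing))
```

Calling ``preflight()`` ahead of the GPU-owner release means a missing package surfaces as an error while the user's chat model is still loaded.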
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Addresses remaining round-30 reviewer findings against PR unslothai#5754 (diffusion image generation in Unsloth Studio).

The studio.txt / constraints.txt / colab-new hub-bump items (round 30 #1-#3) are intentionally skipped: the live B200 Studio install path with huggingface_hub==0.36.2, transformers==4.57.6 and diffusers==0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation (see staging CI green on bec81b8 plus round 28-30 local validation suites). The is_offline_mode ImportError the reviewer cites only triggers with transformers 5.x against huggingface_hub 0.x; the constraints pin holds transformers at 4.x so the combo never materialises on the standard install path.

Concurrency: close the helper / advisor GPU-start race in all four public load paths (round 30 P1 #7-#10).

* Add a _PUBLIC_LOAD_PENDING_COUNT counter in utils/datasets/llm_assist.py, published under _HELPER_ADVISOR_START_LOCK by _raise_if_helper_advisor_busy and cleared by a paired _clear_public_load_window in routes/inference.py. A concurrent helper / advisor start now sees public_load_pending() inside _gpu_workload_busy_for_helper and refuses VRAM until the public load attempt finishes, closing the window between the busy snapshot and the public load flipping its public ownership flags (is_loaded, current_checkpoint, is_training_active, etc.).
* Wire the paired clear into all five call sites (GGUF chat, safetensors chat, diffusion image load, training start, export load-checkpoint). The chat path tracks the published tag in a local so the finally clears the same counter on either branch or on early HTTPException.

Security: gate /api/inference/images/load against arbitrary local-path probes (round 30 P1 #4). Mirror the chat /api/inference/load native_path_lease boundary so an authenticated session cannot use repo_id or base_repo as a directory probe.

* Add native_path_lease + base_repo_native_path_lease to DiffusionLoadRequest (optional; Hub ids skip the lease).
* Add _looks_like_local_diffusion_path + a _resolve_diffusion_repo_for_request helper that requires a verified directory-typed native path grant for any value that starts with /, ~, ./, ../, contains a backslash, or expands to an absolute path. The detector deliberately avoids Path.exists so the route does not side-channel filesystem layout via differential error messages.

Frontend: split the Images page status fetch from the spinner toggle (round 30 P2 #12). The mount effect and the is_loading auto-poll now call a setState-free fetchAndUpdateStatus; the user-driven Refresh button still calls refreshStatus to flip the spinner. Cleaner separation than the queueMicrotask shim from the prior commit; the eslint react-hooks/set-state-in-effect rule is not in the studio-frontend-ci typecheck gate, and the codebase already has hundreds of pre-existing violations of the same rule.

98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor). Frontend typecheck passes.
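The publish / paired-clear pattern from the Concurrency bullets above can be sketched with a tagged counter and a context manager. The commit describes separate publish and clear functions; wrapping them in a context manager is a design choice made for this illustration, and every name below is hypothetical.

```python
import threading
from collections import Counter
from contextlib import contextmanager
from typing import Iterator

_PENDING_LOCK = threading.Lock()
_PENDING: Counter = Counter()  # tag -> number of public loads in flight


def public_load_pending() -> bool:
    """True while any public load attempt is between publish and clear."""
    with _PENDING_LOCK:
        return sum(_PENDING.values()) > 0


@contextmanager
def public_load_window(tag: str) -> Iterator[None]:
    """Publish a pending marker and guarantee the paired clear runs."""
    with _PENDING_LOCK:
        _PENDING[tag] += 1
    try:
        yield
    finally:
        # Runs on success, on early return, and on any raised exception,
        # so a failed request cannot leave the counter stuck above zero.
        with _PENDING_LOCK:
            _PENDING[tag] -= 1
            if _PENDING[tag] <= 0:
                del _PENDING[tag]
```

A helper or advisor start would consult ``public_load_pending()`` under its own start lock and decline VRAM while the result is true.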
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Two universal-consensus round-31 reviewer findings.

Concurrency: /images/load was leaking the public-load pending counter on any pre-finally HTTPException (round 31 P1 #1, 11/12 votes). _raise_if_helper_advisor_busy("diffusion") published the counter, then _resolve_diffusion_repo_for_request ran outside the clearing try/finally. A request like repo_id="/tmp/model" with no native_path_lease returned 400 and left public_load_pending() true until process restart, permanently blocking AI Assist. Fix mirrors the training / export pattern: track diffusion_load_window_published in an outer try, publish the flag right after the helper-busy check succeeds, and clear in an outer finally that only fires when the flag is set. This also closes round 31 P1 #6: a second request's failure can no longer decrement a still-active first request's counter, because the second request has not yet flipped its own publish flag.

Security: _looks_like_local_diffusion_path missed cwd-relative directories (round 31 P1 #2, 8/12 votes). DiffusionBackend.load_model accepts repo_id="exports/my-flux" as a local directory via Path(repo_id).expanduser().is_dir(), but the detector only flagged values starting with /, ~, ./, ../, backslash, or absolute. Tightened the detector to also reject:

* weight-file suffixes (.gguf / .safetensors / .bin / .pt / .pth)
* non-2-segment values (`owner`, `a/b/c`, `owner/`, `/repo`, `//`)
* 2-segment values whose parts are `.` or `..`
* 2-segment values that actually resolve to an existing local path under backend CWD (last-resort exists() probe). The existence probe is a minor side-channel for an already-authenticated caller, accepted in exchange for closing the silent bypass of the new lease boundary.

Valid Hub ids like unsloth/FLUX.2-klein-base-4B-GGUF, microsoft/Phi-3.5-mini-instruct still pass through unchanged.

Skipped (consistent with prior rounds):

* R31 P1 #3 (Tauri / native lease enum missing `load-diffusion-model` op): architectural surface; defer until the Images page actually surfaces a local-path picker.
* R31 P1 #4-#5, #8: studio.txt / constraints.txt / pyproject hub pins. Live B200 install path with huggingface_hub==0.36.2, transformers==4.57.6, diffusers==0.37.1 imports Flux2KleinPipeline cleanly. The is_offline_mode import error only triggers when transformers 5.x is paired with hub 0.x, which the constraints pin prevents.
* R31 P1 #7 (find_spec vs real import): a full transformers import at module load breaks tests that stub huggingface_hub; find_spec is the existing tradeoff.

98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor).
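The tightened detector rules listed under Security above can be written down as a small pure function. This sketch is an assumption-laden reading of the commit message, not the PR's ``_looks_like_local_diffusion_path``; in particular it omits the last-resort exists() probe, so a cwd-relative ``exports/my-flux`` would still need that extra step.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Weight-file suffixes named in the commit message.
_WEIGHT_SUFFIXES = (".gguf", ".safetensors", ".bin", ".pt", ".pth")


def looks_like_local_path(value: str) -> bool:
    """Heuristic: True when a value should require a native path lease."""
    if value.startswith(("/", "~", "./", "../")) or "\\" in value:
        return True
    if PurePosixPath(value).is_absolute() or PureWindowsPath(value).is_absolute():
        return True
    if value.lower().endswith(_WEIGHT_SUFFIXES):
        return True
    parts = value.split("/")
    # A Hub id is exactly "owner/name" with two non-empty segments.
    if len(parts) != 2 or not all(parts):
        return True
    return any(part in (".", "..") for part in parts)
```

Everything that returns False here is shaped like ``owner/name``; telling a real Hub id apart from a same-shaped local directory is the job of the filesystem probe that this sketch leaves out.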
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Three round-32 reviewer findings, plus documentation cleanup for the local-path Tauri/FE plumbing gap.

Concurrency: direct DiffusionBackend.load_model callers now publish the helper/advisor pending marker symmetrically (round 32 P1 #3). _raise_if_helper_advisor_busy_for_diffusion gains an optional publish_pending flag; load_model passes True so the destructive unload window is gated by a "diffusion-backend" tag published under _HELPER_ADVISOR_START_LOCK. The route layer's "diffusion" tag and the backend's "diffusion-backend" tag refcount independently (sum > 0 still blocks helper starts), so neither side's clear can erase the other's still-active marker. The existing _release_chat_backend_for_diffusion(check_helper_advisor=True) path stays snapshot-only (publish_pending defaults False) so test / direct callers of that helper do not leak a counter.

Validation: export save_directory now rejects ALL ASCII control characters (round 32 P1, save_directory tab finding). The earlier CR / LF only guard missed TAB / VT / FF / DEL, which a caller could smuggle past the export worker's logged subprocess argv.

Documentation: DiffusionLoadRequest.repo_id and base_repo updated to reflect that local-path support is gated on a Tauri / frontend load-diffusion-model directory lease producer that has not shipped yet (round 32 P1 #1 from multiple reviewers). The backend lease boundary is correct; what is missing is the FE / native side that mints the matching grant. Until that lands, local paths through the Images route always 400 with "Native path grant is required", which the docstring now spells out.

Skipped (consistent with prior rounds):

* Hub-pin findings (R32 P1 #4-#6): live B200 install with huggingface_hub==0.36.2 + transformers==4.57.6 + diffusers==0.37.1 verifiably imports Flux2KleinPipeline. Empirical justification documented in R30 / R30 follow-up commit msgs.
* Tauri / native enum surgery (R32 P1 #1, 6 votes): real architectural work but out of scope for this PR's Python surface. Documented now; FE / Rust ticket to follow.

98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor).
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Two round-33 reviewer findings: hub-floor consistency and the multipart upload filename validator gap.

Dependencies: reverted the round-26 huggingface_hub>=1.3.0 floor in no-torch-runtime.txt and pyproject.toml (round 33 P1 #1-#5, 4/12 vote consensus). studio.txt forces huggingface_hub==0.36.2 to match the transformers==4.57.6 pin in extras-no-deps.txt, so the 1.3.0 floor was internally inconsistent. Reviewers reproduced the resolver conflict on a fresh install. Empirical justification (re-verified on the live B200 host before the revert): huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation. transformers 4.57.6 carries its own transformers.utils.hub.is_offline_mode and does not actually need huggingface_hub.is_offline_mode at import time. The original bump was guarding against the (never-realised) transformers 5.x path, which extras-no-deps explicitly pins away.

Validation: multipart /seed/upload-unstructured-file now applies the same _no_control_chars and _reject_embedded_hf_token checks to file.filename that SeedInspectUploadRequest.filename already applies in the JSON variant (round 33 P1 #7). The filename is reflected back to the client, persisted in the per-file meta JSON, and echoed by error responses, so the JSON-side hardening must not be asymmetric with the multipart path.

Skipped (consistent with prior rounds):

* Find_spec vs full import (R33 P1 #6): preserves test compatibility with the huggingface_hub stub fixture.
* React hooks set-state-in-effect lint (R33 P1 #8): codebase has 146 pre-existing violations of the same rule; studio-frontend-ci does not gate on lint.
* Direct DiffusionBackend.load_model bypass (R33 P1 #9): the route is the only production entry point, and the backend helper now publishes its own diffusion-backend pending tag (round 32 P1 #3). Direct-caller hardening would require duplicating the lease check into load_model itself, which is out of scope for the route-layer security boundary.
* One-segment Hub IDs (R33 P2 #10): strict 2-segment Hub id check is intentional; one-segment names are not valid Hub ids.
* Cwd-relative shadow of Hub IDs (R33 P2 #11): documented side-channel tradeoff accepted in round 31 commit msg.

97 targeted backend tests pass.
Updates the requirements on datasets, setuptools, setuptools-scm, pandas, huggingface-hub, transformers, trl, data-designer-engine, pytest, pytest-rerunfailures, scikit-learn, torchao, chardet, faker, fsspec, python-json-logger, sqlfluff, data-designer and data-designer-config to permit the latest version.
Updates
`datasets` to 4.5.0
Release notes
Sourced from datasets's releases.
Commits
- 69d773a Release: 4.5.0 (#7944)
- dc98f97 Add _generate_shards (#7943)
- 38d28bf add _OverridableIOWrapper (#7942)
- 7431153 Fix method to retrieve attributes from file object (#7938)
- 6a1bc35 fix low but large example indexerror (#7912)
- 7bdf840 Raise early for invalid `revision` in `load_dataset` (#7929)
- 06b6e02 Add lance format support (#7913)
- 0feb65d set dev version (#7908)
- 37d9615 release: 4.4.2 (#7907)
- 58dda42 Don't save original_shard_lengths by default for backward compat (#7906)

Updates
`setuptools` from 80.9.0 to 82.0.1
Changelog
Sourced from setuptools's changelog.
... (truncated)
Commits
- 5a13876 Bump version: 82.0.0 → 82.0.1
- 51ab8f1 Avoid using (deprecated) 'json.version' in tests (#5194)
- f9c37b2 Docs/CI: Fix intersphinx references (#5195)
- 8173db2 Docs: Fix intersphinx references
- 09bafbc Fix past tense on newsfragment
- 461ea56 Add news fragment
- c4ffe53 Avoid using (deprecated) 'json.version' in tests
- 749258b Cleanup `pkg_resources` dependencies and configuration (#5175)
- 2019c16 Parse `ext-module.define-macros` from `pyproject.toml` as list of tuples (#5169)
- b809c86 Sync setuptools schema with validate-pyproject (#5157)

Updates
`setuptools-scm` from 9.2.0 to 9.2.2
Changelog
Sourced from setuptools-scm's changelog.
Commits
- e56b78f Merge pull request #1232 from RonnyPfannschmidt/fix-1231-dont-warn-when-no-guess
- 4f55e95 docs: update changelog for v9.2.2 patch release
- 95a0c47 fix: don't warn about tool.setuptools.dynamic.version when only using file fi...
- 338f562 Merge pull request #1226 from RonnyPfannschmidt/prepare-release
- a893634 Prepare release v9.2.1
- ad83282 Merge pull request #1225 from pypa/pre-commit-ci-update-config
- 20a4464 [pre-commit.ci] pre-commit autoupdate
- 70f6942 Merge pull request #1219 from RonnyPfannschmidt/fix-1216-explicitly-deprecate...
- 14d85c0 Install Mercurial on Windows runners via Chocolatey
- 8c5cec9 Fix API stability check workflow to install griffe and improve reporting

Updates
`pandas` to 3.0.2
Release notes
Sourced from pandas's releases.
Commits
- ab90747 RLS: 3.0.2 (#64934)
- 6f27013 Backport PR #64931 on branch 3.0.x (DOC/BLD: temporary disable upload of docs...
- 48ddc60 Backport PR #64664 on branch 3.0.x (BUG: DataFrame.sum() crashes on empty Dat...
- 8774488 [backport 3.0.x] PERF: fix slow python loop in validation for ArrowStringArra...
- 33af6cc Backport PR #64133 on branch 3.0.x (BUG: str.find returns byte offset instead...
- 4ef49d8 [backport 3.0.x] BUG: fix convert_dtypes dropping values from sliced mixed-dt...
- 0668f34 [backport 3.0.x] BUG: Fix HDFStore.put with StringDtype columns and compressi...
- 23f2f44 [backport 3.0.x] BUG: Suppress unnecessary RuntimeWarning in to_datetime with...
- 83ba804 Backport PR #64886: BUG: Compute Variance of Complex Numbers Correctly (#64892)
- bb5ca1a Backport PR #64386 on branch 3.0.x (BUG: fix sort_index AssertionError with R...

Updates
`datasets` from 4.3.0 to 4.8.4
Release notes
Sourced from datasets's releases.
Commits
- 69d773a Release: 4.5.0 (#7944)
- dc98f97 Add _generate_shards (#7943)
- 38d28bf add _OverridableIOWrapper (#7942)
- 7431153 Fix method to retrieve attributes from file object (#7938)
- 6a1bc35 fix low but large example indexerror (#7912)
- 7bdf840 Raise early for invalid `revision` in `load_dataset` (#7929)
- 06b6e02 Add lance format support (#7913)
- 0feb65d set dev version (#7908)
- 37d9615 release: 4.4.2 (#7907)
- 58dda42 Don't save original_shard_lengths by default for backward compat (#7906)

Updates
`huggingface-hub` from 0.36.2 to 1.9.0
Release notes
Sourced from huggingface-hub's releases.
... (truncated)
Commits
- b768bb2 Release: v1.9.0
- 9d30ff2 Release: v1.9.0.rc0
- 657b8b9 chore: remove claude.yml workflow file (#4031)
- 38d48d9 [CLI] Migrate `models`, `datasets`, `spaces`, `papers` to `out` singleton (#4...
- 4e2337d [CLI] enrich CLI errors with available options and commands (#4034)
- ea1f4b7 Support volumes at repo creation and duplication (#4035)
- 993d645 [FEAT] Support skills from hf skills (#3956)
- bb7dc6e Add `HF_HUB_DISABLE_SYMLINKS` env variable to force no-symlink cache (#4032)
- 2593ff8 Do not scan CACHEDIR.TAG file in cache (#4036)
- b8d92a2 [Fix] Validate shard filenames in sharded checkpoint index files (#4033)

Updates
`transformers` from 4.57.6 to 5.5.0
Release notes
Sourced from transformers's releases.
... (truncated)
Commits
- c1c3424 update
- 20bff68 update release workflow
- 8956441 v5.5.0
- 5135e5e casually dropping the most capable open weights on the planet (#45192)
- a594e09 Internalise the NomicBERT model (#43067)
- 4932e97 Fix resized LM head weights being overwritten by post_init (#45079)
- 57e8413 [Qwen3.5 MoE] Add _tp_plan to ForConditionalGeneration (#45124)
- b10552e Fix TypeError: 'NoneType' object is not iterable in GenerationMixin.generate ...
- 423f2a3 fix(models): Fix dtype mismatch in SwitchTransformers and TimmWrapperModel (#...
- ade7a05 Generalize gemma vision mask to videos (#45185)

Updates
`trl` from 0.23.1 to 1.0.0
Release notes
Sourced from trl's releases.
... (truncated)
Commits
- f3e9ac1 Release: v1.0 (#5409)
- e8d5dfc Add second version of Qwen 3.5 chat template to chat_template_utils (#5405)
- 71ff6a2 Add HF_TOKEN environment variable to workflow files (#5397)
- 1ee3975 Add vLLM inference to the Base Self-Distillation Trainer (#5388)
- 79e6e79 Move `disable_config=True` from `generate` to `GenerationConfig` (#5384)
- 83d68dd chore: update `pr_template_check.yml` (#5393)
- 4cb7ab1 Enhance PR template check to exclude reopened PRs from first-time contributor...
- 32a40bf Enforce PR template for first-time contributors and document AI usage policy ...
- 8e69b68 Mark test_rloo[fsdp2] as xfail for transformers 5.4.0 (#5387)
- c264266 Remove deprecated `TRACKIO_SPACE_ID` env var from all scripts (#5365)

Updates
`data-designer-engine` from 0.5.4 to 0.5.5

Updates
`pandas` from 2.3.3 to 3.0.2
Release notes
Sourced from pandas's releases.
Commits
- ab90747 RLS: 3.0.2 (#64934)
- 6f27013 Backport PR #64931 on branch 3.0.x (DOC/BLD: temporary disable upload of docs...
- 48ddc60 Backport PR #64664 on branch 3.0.x (BUG: DataFrame.sum() crashes on empty Dat...
- 8774488 [backport 3.0.x] PERF: fix slow python loop in validation for ArrowStringArra...
- 33af6cc Backport PR #64133 on branch 3.0.x (BUG: str.find returns byte offset instead...
- 4ef49d8 [backport 3.0.x] BUG: fix convert_dtypes dropping values from sliced mixed-dt...
- 0668f34 [backport 3.0.x] BUG: Fix HDFStore.put with StringDtype columns and compressi...
- 23f2f44 [backport 3.0.x] BUG: Suppress unnecessary RuntimeWarning in to_datetime with...
- 83ba804 Backport PR #64886: BUG: Compute Variance of Complex Numbers Correctly (#64892)
- bb5ca1a Backport PR #64386 on branch 3.0.x (BUG: fix sort_index AssertionError with R...

Updates
`pytest` to 9.0.2
Release notes
Sourced from pytest's releases.
Commits
- 3d10b51 Prepare release version 9.0.2
- 188750b Merge pull request #14030 from pytest-dev/patchback/backports/9.0.x/1e4b01d1f...
- b7d7bef Merge pull request #14014 from bluetech/compat-note
- bd08e85 Merge pull request #14013 from pytest-dev/patchback/backports/9.0.x/922b60377...
- bc78386 Add CLI options reference documentation (#13930)
- 5a4e398 Fix docs typo (#14005) (#14008)
- d7ae6df Merge pull request #14006 from pytest-dev/maintenance/update-plugin-list-tmpl...
- 556f6a2 pre-commit: fix rst-lint after new release (#13999) (#14001)
- c60fbe6 Fix quadratic-time behavior when handling `unittest` subtests in Python 3.10 ...
- 73d9b01 Merge pull request #13995 from nicoddemus/patchback/backports/9.0.x/1b5200c0f...

Updates
`pytest-rerunfailures` from 15.1 to 16.1
Changelog
Sourced from pytest-rerunfailures's changelog.
Commits
- b015092 Preparing release 16.1
- c1666dd Prepare release.
- 8d04ad9 Fix `NotImplementedError` crash when using xdist schedulers without `mark_tes...
- cb8ede7 Add a `--force-reruns` to override rerun count globally (#307)
- 5e01132 Bump actions/setup-python from 5 to 6 in the actions group (#310)
- 88e0023 Drop support for Python 3.9. (#308)
- df47974 Change 'localhost' to '127.0.0.1' (#305)
- f149c7d Back to development: 16.1
- f97618f Preparing release 16.0.1
- c60d17d Prepare release.

Updates
`scikit-learn` from 1.7.1 to 1.8.0
Release notes
Sourced from scikit-learn's releases.
Commits
- 646da0f [cd build]
- 4f4f283 Generate changelog
- 967dcde Set version
- cb1424b DOC Release highlights for 1.8 (#32809)
- 5645b27 🔒 🤖 CI Update lock files for main CI build(s) 🔒 🤖 (#32859)
- 6b9fb11 🔒 🤖 CI Update lock files for free-threaded CI build(s) 🔒 :rob...
- a0f6d88 🔒 🤖 CI Update lock files for array-api CI build(s) 🔒 🤖 ...
- c1de8fc FIX Make `get_namespace` handle pandas dataframe input (#32838)
- 764249a Fix `_safe_indexing` with non integer arrays on array API inputs (#32840)
- eca5e0a FIX Add new default max_samples=None in Bagging estimators (#32825)

Updates
`torchao` from 0.14.0 to 0.17.0
Release notes
Sourced from torchao's releases.
... (truncated)
Commits
Updates
`chardet` to 7.4.0.post2
Changelog
Sourced from chardet's changelog.