[tests] security: default Studio host to 127.0.0.1 and prompt before auto-start#8
[tests] security: default Studio host to 127.0.0.1 and prompt before auto-start#8danielhanchen wants to merge 3 commits into
Conversation
Closes unslothai#4684. Previously Unsloth Studio bound to 0.0.0.0 (all interfaces) by default and the installer silently auto-started a server at the end of install, contradicting the documented "privacy first / 100% offline and locally" guarantee and exposing the service on the network without user consent. Changes: - studio/backend/run.py: change run_server() default host to 127.0.0.1; extract argparse setup into _make_argument_parser() for testability - unsloth_cli/commands/studio.py: change typer.Option default to 127.0.0.1 - install.sh: remove -H 0.0.0.0 from generated launch-studio.sh launcher template; replace silent auto-start with a [Y/n] prompt - install.ps1: replace silent auto-start with a Read-Host [Y/n] prompt - README.md: simplify launch examples to `unsloth studio` (no -H flag); note that -H 0.0.0.0 is available for cloud/network use Users who need all-interface binding (cloud VMs, LAN sharing) can still pass -H 0.0.0.0 explicitly. No other logic was changed: _is_port_free(), startup_banner, and health-check paths all already handle 127.0.0.1 correctly. Tests (TDD — written before implementation): - studio/backend/tests/test_host_defaults.py: AST inspection of run_server() parameter default and argparse --host default - tests/studio/test_cli_studio_defaults.py: AST inspection of typer.Option default for studio_default() --host parameter - tests/sh/test_install_host_defaults.sh: static analysis of installer scripts and README launch section https://claude.ai/code/session_012umxRmBdeDV5U7Xhm1utu6
for more information, see https://pre-commit.ci
e9df706 to
2c5bf3a
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e9df70654f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "launcher template: no hardcoded 'studio -H 0.0.0.0'" \ | ||
| "$_launcher" "studio -H 0.0.0.0" |
There was a problem hiding this comment.
Broaden insecure-host pattern to include --host form
This regression test only rejects the literal string studio -H 0.0.0.0. If install.sh, install.ps1, or README.md switches to the equivalent studio --host 0.0.0.0, these checks will pass even though the insecure all-interfaces default is still present. Because this script is meant to enforce the security default, the narrow match creates a false negative path.
Useful? React with 👍 / 👎.
| has_host_option = any( | ||
| isinstance(a, ast.Constant) and a.value in ("--host", "-H") | ||
| for a in default.args | ||
| ) | ||
| if has_host_option: |
There was a problem hiding this comment.
Validate host override behavior instead of option presence
This test is described as verifying that users can still override --host, but the implementation only checks for the existence of a call containing --host/-H. It would still pass if runtime validation later rejects values like 0.0.0.0, so it does not actually guard the behavior the docstring promises and can miss regressions in explicit override support.
Useful? React with 👍 / 👎.
| continue | ||
| args = func_node.args.args | ||
| defaults = func_node.args.defaults | ||
| offset = len(args) - len(defaults) |
| import ast | ||
| from pathlib import Path | ||
|
|
||
| import pytest |
| import ast | ||
| from pathlib import Path | ||
|
|
||
| import pytest |
2c5bf3a to
0d1d5d1
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d1d5d1e30
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| # The primary Launch example must show 'unsloth studio' without -H 0.0.0.0 | ||
| # (0.0.0.0 may appear later as an opt-in note, but not in the code block). | ||
| _readme_launch=$(awk '/#### Launch/{found=1} found{print} /^####/{if(!/#### Launch/)found=0}' "$README" | head -10) |
There was a problem hiding this comment.
Scan entire Launch section before asserting host safety
Limiting the README check to head -10 means the test can miss an insecure primary launch command if the section gains a few explanatory lines before the code block; in that case studio -H 0.0.0.0 could reappear below line 10 and this regression test would still pass. Since this script is intended to enforce a security default, truncating the section creates a false-negative path.
Useful? React with 👍 / 👎.
| echo "" | ||
| echo "=== install.sh end-of-install block ===" | ||
|
|
||
| _end=$(tail -40 "$INSTALL_SH") |
There was a problem hiding this comment.
Avoid fixed tail windows for end-of-install validation
Using tail -40 (and similarly tail -20 for PowerShell) makes the checks sensitive to file length rather than the actual install block boundaries, so adding unrelated lines at the end can move insecure launch commands out of the scanned window and let the test pass incorrectly. This weakens the regression guard for the installer’s default host behavior.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d1d5d1e30
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| assert_contains \ | ||
| "install.sh: interactive block prompts user (bash read)" \ | ||
| "$_end" "read" |
There was a problem hiding this comment.
Match
read as a command, not any substring
This check currently passes on any occurrence of the substring read, so the test can report success even when there is no interactive prompt command (for example, if comments/messages include words like already). That weakens the security regression guard this test is meant to provide. The assertion should match an actual shell read invocation (or the specific prompt text) rather than a raw substring.
Useful? React with 👍 / 👎.
| has_host_option = any( | ||
| isinstance(a, ast.Constant) and a.value in ("--host", "-H") | ||
| for a in default.args | ||
| ) | ||
| if has_host_option: |
There was a problem hiding this comment.
Verify host override is propagated, not just declared
This test claims to ensure --host can be explicitly overridden, but it only checks that a typer.Option with --host/-H exists. A regression where studio_default still defines the option but ignores the parsed value (e.g., hard-codes 127.0.0.1 when launching) would still pass here, so the intended behavior is not actually protected.
Useful? React with 👍 / 👎.
…guard for PR unslothai#5754 Round 9 reviewer flagged a pile of handoff asymmetries: every GPU-owning lifecycle change (training, export, chat, images) needed its own bespoke unload sequence and they had drifted out of sync. Some skipped llama-server is_active; some missed safetensors loading_models; export and training did not check is_export_active. Backend handoff (P1) * routes/inference.py: new _release_chat_for / _release_export_for helpers. Both treat llama-server as held when is_loaded OR is_active, safetensors as held when active_model_name OR loading_models is non-empty, and export as held when current_checkpoint OR is_export_active. Both helpers run their unloads in worker threads so async routes do not block the event loop. * routes/training.py: replaces its bespoke inline llama / safe / export unload sequence with await _release_chat_for / _release_ export_for. * routes/export.py: same swap for the chat unload chain (export still does NOT call _release_export_for on itself). * routes/inference.py GGUF + standard chat-load paths: now use _release_export_for to drop a settled export, and the standard path's llama unload now also handles is_active=True (round 9 review #8). Backend reject-on-active export (P1 #5) * routes/inference.py: new _raise_if_export_active. Symmetric with _raise_if_training_active: a long-running export is refused with HTTP 409 instead of being silently killed when /images/load or /load arrives. Diffusion / images load and both chat-load paths call it. * core/inference/diffusion.py _release_other_gpu_owners_for_ diffusion: no longer tears down an in-flight export job. Only drops a SETTLED export checkpoint (current_checkpoint populated, is_export_active False). Round 9 review #5 -- the previous behavior could terminate an in-flight export and leave a partial output artifact. Token leak via logger.exception (P1 #6) * core/inference/diffusion.py: load-failure logging now uses logger.error(..., exc_msg) with the already-scrubbed string and exc_info=False. logger.exception() with the raw Exception would expose any hf_... token that diffusers / huggingface_hub embedded in the message or traceback locals, defeating the earlier in-flight scrub. Dependency pinning (P1 #11) * pyproject.toml: huggingfacenotorch optional extra now pins diffusers>=0.37.0. Previously the floor was only set in studio/backend/requirements/no-torch-runtime.txt, so a normal pip install would resolve diffusers 0.36.0 (no Flux2KleinPipeline) and the default curated FLUX.2 klein Images model would fail at runtime. Cache-delete exact match (P1 #14) * routes/models.py /delete-cached: llama.cpp and safetensors guards now match on exact repo-id (case-insensitive) instead of prefix. Diffusion guard already does this; the chat guards were the remaining surface where loading org/model-v2 blocked deleting org/model.
P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top. P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM. P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair. P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated. P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries. P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership. P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts. P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``. All 85 diffusion-relevant backend tests pass locally.
P1 #1: ``_release_safetensors_chat_for`` now re-reads ``active_model_name`` and ``loading_models`` after each unload AND runs a final sweep against the initial owned-name set. The previous helper trusted ``unload_model() -> True`` even though the orchestrator can respond ``unloaded`` while still holding weights or a concurrent ``load`` can repopulate the tracker between calls. Per-name and global post-state mismatches now raise HTTP 503 so the caller retries. P1 #2: same post-state guarantee inside ``_release_chat_backend_for_diffusion`` for direct backend callers. ``DiffusionBackend.load_model`` now raises RuntimeError when the safetensors tracker still owns a previously-resident name after the unload, matching the route-level helper. The route layer's existing classifier maps the new wording to HTTP 503. P1 #3: ``DiffusionBackend.load_model`` now preflights the full diffusers repo (or explicit GGUF ``base_repo``) via ``hf_hub_download(filename="model_index.json")`` BEFORE the chat / export unload runs. The GGUF path was already covered by the existing ``hf_hub_download(gguf_filename)`` round-trip; the full-repo path used to skip validation and let a typo / private / gated repo only surface inside ``from_pretrained`` AFTER the user's chat model was already dropped. Local paths are checked structurally (must be a directory containing ``model_index.json``) so we do not network-round-trip for an on-disk miss. Error messages route through ``_display_repo_id`` so an absolute filesystem path does not leak the operator's layout. P1 #6: ``/api/inference/unload`` (the direct chat unload endpoint) now treats ``unload_model() -> False`` AND a leftover state (``is_loaded`` / ``is_active`` / ``loading_model_identifier`` for GGUF, ``active_model_name`` / ``loading_models`` for safetensors) as 503 instead of unconditionally responding ``status="unloaded"``. The UI used to show the model as gone while the backend still owned VRAM. P2 #7: extended the /images/load RuntimeError -> HTTPException marker list with ``still active or loading after unload`` and ``still loading after unload``. Round 18 introduced these exact phrasings on the backend side; without the extension a retryable unload failure was returning HTTP 400 to the user instead of 503. P2 #8: removed the unused ``unsloth_backend = get_inference_backend()`` eager construction in the GGUF chat-load branch. Eager construction made the GGUF-only path needlessly fail or pay startup cost when the safetensors backend was unavailable / lazy; ``_release_safetensors_chat_for`` already handles that case as a no-op. All 85 diffusion-relevant + 98 related backend tests pass locally.
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top. P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM. P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair. P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated. P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries. P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership. P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts. P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``. All 85 diffusion-relevant backend tests pass locally.
P1 #1: ``_release_safetensors_chat_for`` now re-reads ``active_model_name`` and ``loading_models`` after each unload AND runs a final sweep against the initial owned-name set. The previous helper trusted ``unload_model() -> True`` even though the orchestrator can respond ``unloaded`` while still holding weights or a concurrent ``load`` can repopulate the tracker between calls. Per-name and global post-state mismatches now raise HTTP 503 so the caller retries. P1 #2: same post-state guarantee inside ``_release_chat_backend_for_diffusion`` for direct backend callers. ``DiffusionBackend.load_model`` now raises RuntimeError when the safetensors tracker still owns a previously-resident name after the unload, matching the route-level helper. The route layer's existing classifier maps the new wording to HTTP 503. P1 #3: ``DiffusionBackend.load_model`` now preflights the full diffusers repo (or explicit GGUF ``base_repo``) via ``hf_hub_download(filename="model_index.json")`` BEFORE the chat / export unload runs. The GGUF path was already covered by the existing ``hf_hub_download(gguf_filename)`` round-trip; the full-repo path used to skip validation and let a typo / private / gated repo only surface inside ``from_pretrained`` AFTER the user's chat model was already dropped. Local paths are checked structurally (must be a directory containing ``model_index.json``) so we do not network-round-trip for an on-disk miss. Error messages route through ``_display_repo_id`` so an absolute filesystem path does not leak the operator's layout. P1 #6: ``/api/inference/unload`` (the direct chat unload endpoint) now treats ``unload_model() -> False`` AND a leftover state (``is_loaded`` / ``is_active`` / ``loading_model_identifier`` for GGUF, ``active_model_name`` / ``loading_models`` for safetensors) as 503 instead of unconditionally responding ``status="unloaded"``. The UI used to show the model as gone while the backend still owned VRAM. P2 #7: extended the /images/load RuntimeError -> HTTPException marker list with ``still active or loading after unload`` and ``still loading after unload``. Round 18 introduced these exact phrasings on the backend side; without the extension a retryable unload failure was returning HTTP 400 to the user instead of 503. P2 #8: removed the unused ``unsloth_backend = get_inference_backend()`` eager construction in the GGUF chat-load branch. Eager construction made the GGUF-only path needlessly fail or pay startup cost when the safetensors backend was unavailable / lazy; ``_release_safetensors_chat_for`` already handles that case as a no-op. All 85 diffusion-relevant + 98 related backend tests pass locally.
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
Two universal-consensus round-31 reviewer findings. Concurrency: /images/load was leaking the public-load pending counter on any pre-finally HTTPException (round 31 P1 #1, 11/12 votes). _raise_if_helper_advisor_busy("diffusion") published the counter, then _resolve_diffusion_repo_for_request ran outside the clearing try/finally. A request like repo_id="/tmp/model" with no native_path_lease returned 400 and left public_load_pending() true until process restart, permanently blocking AI Assist. Fix mirrors the training / export pattern: track diffusion_load_window_published in an outer try, publish the flag right after the helper-busy check succeeds, and clear in an outer finally that only fires when the flag is set. This also closes round 31 P1 #6: a second request's failure can no longer decrement a still-active first request's counter, because the second request has not yet flipped its own publish flag. Security: _looks_like_local_diffusion_path missed cwd-relative directories (round 31 P1 #2, 8/12 votes). DiffusionBackend. load_model accepts repo_id="exports/my-flux" as a local directory via Path(repo_id).expanduser().is_dir(), but the detector only flagged values starting with /, ~, ./, ../, backslash, or absolute. Tightened the detector to also reject: * weight-file suffixes (.gguf / .safetensors / .bin / .pt / .pth) * non-2-segment values (`owner`, `a/b/c`, `owner/`, `/repo`, `//`) * 2-segment values whose parts are `.` or `..` * 2-segment values that actually resolve to an existing local path under backend CWD (last-resort exists() probe). The existence probe is a minor side-channel for an already- authenticated caller, accepted in exchange for closing the silent bypass of the new lease boundary. Valid Hub ids like unsloth/FLUX.2-klein-base-4B-GGUF, microsoft/Phi-3.5-mini-instruct still pass through unchanged. Skipped (consistent with prior rounds): * R31 P1 #3 (Tauri / native lease enum missing `load-diffusion-model` op): architectural surface; defer until the Images page actually surfaces a local-path picker. * R31 P1 #4-#5, #8: studio.txt / constraints.txt / pyproject hub pins. Live B200 install path with huggingface_hub==0.36.2, transformers==4.57.6, diffusers==0.37.1 imports Flux2KleinPipeline cleanly. The is_offline_mode import error only triggers when transformers 5.x is paired with hub 0.x, which the constraints pin prevents. * R31 P1 #7 (find_spec vs real import): a full transformers import at module load breaks tests that stub huggingface_hub; find_spec is the existing tradeoff. 98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor).
Two round-33 reviewer findings: hub-floor consistency and the multipart upload filename validator gap. Dependencies: reverted the round-26 huggingface_hub>=1.3.0 floor in no-torch-runtime.txt and pyproject.toml (round 33 P1 #1-#5, 4/12 vote consensus). studio.txt forces huggingface_hub==0.36.2 to match the transformers==4.57.6 pin in extras-no-deps.txt, so the 1.3.0 floor was internally inconsistent. Reviewers reproduced the resolver conflict on a fresh install. Empirical justification (re-verified on the live B200 host before the revert): huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation. transformers 4.57.6 carries its own transformers.utils.hub.is_offline_mode and does not actually need huggingface_hub.is_offline_mode at import time. The original bump was guarding against the (never-realised) transformers 5.x path, which extras-no-deps explicitly pins away. Validation: multipart /seed/upload-unstructured-file now applies the same _no_control_chars and _reject_embedded_hf_token checks to file.filename that SeedInspectUploadRequest.filename already applies in the JSON variant (round 33 P1 #7). The filename is reflected back to the client, persisted in the per-file meta JSON, and echoed by error responses, so the JSON-side hardening must not be asymmetric with the multipart path. Skipped (consistent with prior rounds): * Find_spec vs full import (R33 P1 #6): preserves test compatibility with the huggingface_hub stub fixture. * React hooks set-state-in-effect lint (R33 P1 #8): codebase has 146 pre-existing violations of the same rule; studio-frontend-ci does not gate on lint. * Direct DiffusionBackend.load_model bypass (R33 P1 #9): the route is the only production entry point, and the backend helper now publishes its own diffusion-backend pending tag (round 32 P1 #3). Direct-caller hardening would require duplicating the lease check into load_model itself, which is out of scope for the route-layer security boundary. * One-segment Hub IDs (R33 P2 #10): strict 2-segment Hub id check is intentional; one-segment names are not valid Hub ids. * Cwd-relative shadow of Hub IDs (R33 P2 #11): documented side-channel tradeoff accepted in round 31 commit msg. 97 targeted backend tests pass.
Staging mirror of unslothai#4864
Original PR: unslothai#4864
Author: @Bedrovelsen
This is a staging copy for review and editing. Once finalized, changes will be pushed back to the original PR.
Original description
Closes unslothai#4684.
Previously Unsloth Studio bound to 0.0.0.0 (all interfaces) by default and the installer silently auto-started a server at the end of install, contradicting the documented "privacy first / 100% offline and locally" guarantee and exposing the service on the network without user consent.
Changes:
unsloth studio(no -H flag); note that -H 0.0.0.0 is available for cloud/network useUsers who need all-interface binding (cloud VMs, LAN sharing) can still pass -H 0.0.0.0 explicitly. No other logic was changed: _is_port_free(), startup_banner, and health-check paths all already handle 127.0.0.1 correctly.
Tests (TDD — written before implementation):
https://claude.ai/code/session_012umxRmBdeDV5U7Xhm1utu6
This PR contains test changes only (3 files). Code changes are in a separate PR.
Test files:
studio/backend/tests/test_host_defaults.pytests/sh/test_install_host_defaults.shtests/studio/test_cli_studio_defaults.py