[studio] Fix VLM detection for transformers v5#10
Conversation
|
/gemini review |
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request refactors the VLM detection logic in is_vision_model to improve robustness and fallback mechanisms. It introduces _is_vlm_config for unified configuration checking and _load_model_config_metadata to fetch raw JSON metadata from local paths or the Hugging Face Hub when standard loading fails. Additionally, _is_vision_model_subprocess was updated to return None on failure, allowing the detection logic to proceed to metadata-based checks. Review feedback suggests explicitly setting UTF-8 encoding when reading configuration files and making certain warning logs conditional to avoid noise when checking models that require newer transformer versions.
| if is_local_path(model_name): | ||
| config_path = Path(normalize_path(model_name)) / "config.json" | ||
| if config_path.is_file(): | ||
| return json.loads(config_path.read_text()) |
| filename = "config.json", | ||
| **download_kwargs, | ||
| ) | ||
| return json.loads(Path(config_path).read_text()) |
There was a problem hiding this comment.
| except Exception as e: | ||
| logger.warning( | ||
| f"Could not determine if {model_name} is vision model in-process: {e}" | ||
| ) |
There was a problem hiding this comment.
This warning will be logged every time a model that requires transformers v5 is checked, because load_model_config is expected to fail in the main process for those models. Consider lowering the log level to debug or making it conditional on not needs_t5 to avoid cluttering the logs with expected failures.
| except Exception as e: | |
| logger.warning( | |
| f"Could not determine if {model_name} is vision model in-process: {e}" | |
| ) | |
| except Exception as e: | |
| if not needs_t5: | |
| logger.warning( | |
| f"Could not determine if {model_name} is vision model in-process: {e}" | |
| ) |
|
Codex Review: Didn't find any major issues. Bravo. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
- Keep transformers-5 models out of the main process (avoids trust_remote_code execution in the backend) - Add audio-only model filter to subprocess _VISION_CHECK_SCRIPT (csm/whisper were not excluded from ForConditionalGeneration match) - Add allow_arch_suffix param to _is_vlm_config so the raw config.json fallback does not misclassify text-only seq2seq models (T5, BART, etc.) as VLMs based solely on ForConditionalGeneration suffix - Fix dict[str, Any] type hints to Dict[str, Any] for Python 3.8+ compat - Add expanduser() for local model paths in _load_model_config_metadata - Add encoding="utf-8" to read_text() calls for cross-platform safety
danielhanchen
left a comment
There was a problem hiding this comment.
Thank you for the PR! The goal of this PR is to fix VLM detection for newer models (Qwen3.5, Gemma4) that require transformers v5. As a summary, this PR refactors is_vision_model() to try in-process config loading first, then subprocess, then raw config.json metadata fallback, and also extracts VLM detection logic into a shared _is_vlm_config() helper.
Consolidated review from 10 independent reviewers (7 from reviewer.py, 3 subagents):
| Reviewers | Severity | Finding |
|---|---|---|
| 7/10 | P1 | is_vision_model() calls load_model_config(..., trust_remote_code=True) in-process for t5 models before the subprocess guard, defeating subprocess isolation |
| 5/10 | P2 | ForConditionalGeneration suffix in raw config fallback is too broad -- matches T5/BART/Pegasus seq2seq models as VLMs |
| 5/10 | P2 | Non-t5 models that fail load_model_config fall through to _load_model_config_metadata unconditionally (behavior change) |
| 3/10 | P2 | dict[str, Any] type hint requires Python 3.9+ (should use Dict[str, Any] from typing) |
| 3/10 | P2 | Subprocess _VISION_CHECK_SCRIPT missing audio-only model type filter for csm/whisper |
| 2/10 | P3 | _load_model_config_metadata missing expanduser() for ~/ local paths |
| 2/10 | P3 | Missing encoding="utf-8" on read_text() calls for cross-platform safety |
| 1/10 | P3 | _load_model_config_metadata bypasses resolve_cached_repo_id_case() |
All findings above have been addressed in commit fbc5bf5. Key fixes:
- Restored subprocess-first flow for t5 models --
load_model_configis no longer called in the main process for transformers-5 models - Added
allow_arch_suffixparam to_is_vlm_config()-- raw config.json fallback now usesallow_arch_suffix=Falseso text-only seq2seq models are not misclassified - Added audio-only model type filter to
_VISION_CHECK_SCRIPT - Fixed
dict[str, Any]toDict[str, Any]for Python 3.8+ compatibility - Added
.expanduser()andencoding="utf-8"to_load_model_config_metadata
Concrete suggestions for each finding are in the commit diff.
|
/gemini review |
|
@codex review |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Code Review
This pull request refactors the vision model detection logic to improve robustness, particularly for transformers 5.x models, by introducing subprocess checks and raw configuration metadata parsing as fallbacks. The reviewer provided several suggestions to simplify conditional logic, improve readability, and modernize type hinting and logging practices.
| if model_type in audio_only_types: | ||
| is_vlm = False | ||
| else: | ||
| is_vlm = False | ||
| archs = getattr(config, "architectures", None) | ||
| if archs: | ||
| is_vlm = any( | ||
| x.endswith(("ForConditionalGeneration", "ForVisionText2Text")) | ||
| for x in archs | ||
| ) | ||
| if not is_vlm and hasattr(config, "vision_config"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "img_processor"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "image_token_index"): | ||
| is_vlm = True | ||
| if not is_vlm and model_type: | ||
| vlm_types = {"phi3_v","llava","llava_next","llava_onevision", | ||
| "internvl_chat","cogvlm2","minicpmv"} | ||
| if model_type in vlm_types: | ||
| is_vlm = True |
There was a problem hiding this comment.
The logic for determining is_vlm can be simplified. You can initialize is_vlm to False once and then run the checks only if the model is not an audio-only type. This avoids redundant assignments and a nested else block.
Additionally, for better readability and adherence to PEP 8, I've added a space after each comma in the vlm_types set definition.
| if model_type in audio_only_types: | |
| is_vlm = False | |
| else: | |
| is_vlm = False | |
| archs = getattr(config, "architectures", None) | |
| if archs: | |
| is_vlm = any( | |
| x.endswith(("ForConditionalGeneration", "ForVisionText2Text")) | |
| for x in archs | |
| ) | |
| if not is_vlm and hasattr(config, "vision_config"): | |
| is_vlm = True | |
| if not is_vlm and hasattr(config, "img_processor"): | |
| is_vlm = True | |
| if not is_vlm and hasattr(config, "image_token_index"): | |
| is_vlm = True | |
| if not is_vlm and model_type: | |
| vlm_types = {"phi3_v","llava","llava_next","llava_onevision", | |
| "internvl_chat","cogvlm2","minicpmv"} | |
| if model_type in vlm_types: | |
| is_vlm = True | |
| is_vlm = False | |
| if model_type not in audio_only_types: | |
| archs = getattr(config, "architectures", None) | |
| if archs: | |
| is_vlm = any( | |
| x.endswith(("ForConditionalGeneration", "ForVisionText2Text")) | |
| for x in archs | |
| ) | |
| if not is_vlm and hasattr(config, "vision_config"): | |
| is_vlm = True | |
| if not is_vlm and hasattr(config, "img_processor"): | |
| is_vlm = True | |
| if not is_vlm and hasattr(config, "image_token_index"): | |
| is_vlm = True | |
| if not is_vlm and model_type: | |
| vlm_types = {"phi3_v", "llava", "llava_next", "llava_onevision", | |
| "internvl_chat", "cogvlm2", "minicpmv"} | |
| if model_type in vlm_types: | |
| is_vlm = True |
| def _is_vlm_config(config, allow_arch_suffix = True): | ||
| # type: (Any, bool) -> bool |
There was a problem hiding this comment.
For better readability and consistency with modern Python type hinting, please use standard type annotations in the function signature instead of a type comment.
| def _is_vlm_config(config, allow_arch_suffix = True): | |
| # type: (Any, bool) -> bool | |
| def _is_vlm_config(config: Any, allow_arch_suffix: bool = True) -> bool: |
| if model_type in _AUDIO_ONLY_MODEL_TYPES: | ||
| return False | ||
|
|
||
| if has_vision_config or has_img_processor or has_image_token_index: |
There was a problem hiding this comment.
|
|
||
| from huggingface_hub import hf_hub_download | ||
|
|
||
| download_kwargs = {} # type: Dict[str, Any] |
| if config_data is not None: | ||
| if _is_vlm_config(config_data, allow_arch_suffix = False): | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | ||
| "Model %s detected as VLM from raw config metadata: " | ||
| "model_type=%s architectures=%s", | ||
| model_name, | ||
| config_data.get("model_type"), | ||
| config_data.get("architectures", []), | ||
| ) | ||
| return True | ||
|
|
||
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | ||
| if hasattr(config, "vision_config"): | ||
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | ||
| return True | ||
|
|
||
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | ||
| if hasattr(config, "img_processor"): | ||
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | ||
| return True | ||
|
|
||
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | ||
| if hasattr(config, "image_token_index"): | ||
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | ||
| return True | ||
|
|
||
| # Check 5: Known VLM model_type values that may not match above checks | ||
| if hasattr(config, "model_type"): | ||
| if config.model_type in _VLM_MODEL_TYPES: | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | ||
| ) | ||
| return True | ||
|
|
||
| return False | ||
| return False |
There was a problem hiding this comment.
This block can be simplified. The return False on line 721 is redundant. You can un-indent the return False from line 720 to cover both cases where config_data is None or _is_vlm_config returns False.
Also, there's an inconsistency in the logging style. This new log message uses %-style formatting, while other parts of the file use f-strings. While structlog handles both, it's best to be consistent. Using keyword arguments or %-style formatting is often preferred with structlog for better structured data.
| if config_data is not None: | |
| if _is_vlm_config(config_data, allow_arch_suffix = False): | |
| logger.info( | |
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return True | |
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | |
| if hasattr(config, "vision_config"): | |
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | |
| return True | |
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | |
| if hasattr(config, "img_processor"): | |
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | |
| return True | |
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | |
| if hasattr(config, "image_token_index"): | |
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | |
| return True | |
| # Check 5: Known VLM model_type values that may not match above checks | |
| if hasattr(config, "model_type"): | |
| if config.model_type in _VLM_MODEL_TYPES: | |
| logger.info( | |
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | |
| ) | |
| return True | |
| return False | |
| return False | |
| if config_data is not None: | |
| if _is_vlm_config(config_data, allow_arch_suffix = False): | |
| logger.info( | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return True | |
| return False |
The allow_arch_suffix=False restriction was overly conservative and would miss VLMs like Qwen3-Omni that only advertise multimodality via architectures (ForConditionalGeneration) without vision_config. The audio-only filter (_AUDIO_ONLY_MODEL_TYPES) already handles the main false-positive case (whisper/csm). T5/BART models do not reach the fallback paths because: - t5 fallback: T5/BART do not need transformers 5 - non-t5 fallback: T5/BART succeed in load_model_config
|
/gemini review |
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request enhances VLM detection by implementing a fallback mechanism that reads raw config.json metadata when standard loading or subprocesses fail. It refactors detection logic into reusable helper functions and updates subprocess handling to support optional return values for better error management. The review feedback suggests simplifying the detection logic to avoid duplication, adopting modern type hints for consistency, and streamlining nested conditional statements to improve code readability.
| # Exclude audio-only models that share ForConditionalGeneration suffix | ||
| audio_only_types = {"csm", "whisper"} | ||
| model_type = getattr(config, "model_type", None) | ||
| if model_type in audio_only_types: | ||
| is_vlm = False | ||
| else: | ||
| is_vlm = False | ||
| archs = getattr(config, "architectures", None) | ||
| if archs: | ||
| is_vlm = any( | ||
| x.endswith(("ForConditionalGeneration", "ForVisionText2Text")) | ||
| for x in archs | ||
| ) | ||
| if not is_vlm and hasattr(config, "vision_config"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "img_processor"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "image_token_index"): | ||
| is_vlm = True | ||
| if not is_vlm and model_type: | ||
| vlm_types = {"phi3_v","llava","llava_next","llava_onevision", | ||
| "internvl_chat","cogvlm2","minicpmv"} | ||
| if model_type in vlm_types: | ||
| is_vlm = True | ||
|
|
There was a problem hiding this comment.
The logic for VLM detection in the subprocess script can be simplified by initializing is_vlm = False at the start and using a more direct conditional structure. Additionally, the hardcoded sets for audio_only_types and vlm_types are duplicates of the global constants _AUDIO_ONLY_MODEL_TYPES and _VLM_MODEL_TYPES. While injecting them into the raw string might be complex, simplifying the logic improves readability.
# Exclude audio-only models that share ForConditionalGeneration suffix
is_vlm = False
audio_only_types = {"csm", "whisper"}
model_type = getattr(config, "model_type", None)
if model_type not in audio_only_types:
archs = getattr(config, "architectures", None)
if archs:
is_vlm = any(
x.endswith(("ForConditionalGeneration", "ForVisionText2Text"))
for x in archs
)
if not is_vlm and hasattr(config, "vision_config"):
is_vlm = True
if not is_vlm and hasattr(config, "img_processor"):
is_vlm = True
if not is_vlm and hasattr(config, "image_token_index"):
is_vlm = True
if not is_vlm and model_type:
vlm_types = {"phi3_v", "llava", "llava_next", "llava_onevision",
"internvl_chat", "cogvlm2", "minicpmv"}
if model_type in vlm_types:
is_vlm = True| def _is_vlm_config(config): | ||
| # type: (Any) -> bool |
| if config_data is not None: | ||
| if _is_vlm_config(config_data): | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | ||
| "Model %s detected as VLM from raw config metadata: " | ||
| "model_type=%s architectures=%s", | ||
| model_name, | ||
| config_data.get("model_type"), | ||
| config_data.get("architectures", []), | ||
| ) | ||
| return True | ||
|
|
||
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | ||
| if hasattr(config, "vision_config"): | ||
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | ||
| return True | ||
|
|
||
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | ||
| if hasattr(config, "img_processor"): | ||
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | ||
| return True | ||
|
|
||
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | ||
| if hasattr(config, "image_token_index"): | ||
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | ||
| return True | ||
|
|
||
| # Check 5: Known VLM model_type values that may not match above checks | ||
| if hasattr(config, "model_type"): | ||
| if config.model_type in _VLM_MODEL_TYPES: | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | ||
| ) | ||
| return True | ||
|
|
||
| return False | ||
| return False |
There was a problem hiding this comment.
The nested if statements and multiple return False calls can be simplified to improve readability and reduce redundancy.
| if config_data is not None: | |
| if _is_vlm_config(config_data): | |
| logger.info( | |
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return True | |
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | |
| if hasattr(config, "vision_config"): | |
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | |
| return True | |
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | |
| if hasattr(config, "img_processor"): | |
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | |
| return True | |
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | |
| if hasattr(config, "image_token_index"): | |
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | |
| return True | |
| # Check 5: Known VLM model_type values that may not match above checks | |
| if hasattr(config, "model_type"): | |
| if config.model_type in _VLM_MODEL_TYPES: | |
| logger.info( | |
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | |
| ) | |
| return True | |
| return False | |
| return False | |
| config_data = _load_model_config_metadata(model_name, hf_token = hf_token) | |
| if config_data is not None and _is_vlm_config(config_data): | |
| logger.info( | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return True | |
| return False |
Fixes cache-casing regression: the raw config fallback now uses resolve_cached_repo_id_case() before calling hf_hub_download(), matching the pattern established in unslothai#4822 to avoid duplicate cache entries when the same model is referenced with different casing.
|
/gemini review |
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request refactors the VLM detection logic in model_config.py to improve robustness, particularly for models requiring Transformers 5.x. It introduces a fallback mechanism that attempts to load and inspect raw config.json metadata if subprocess-based detection fails or times out. Additionally, it centralizes VLM identification logic into a new _is_vlm_config helper and ensures audio-only models are correctly excluded. Reviewer feedback focuses on enhancing code readability and conciseness by simplifying conditional structures, reducing nesting, and utilizing more direct return statements in the detection paths.
| # Exclude audio-only models that share ForConditionalGeneration suffix | ||
| audio_only_types = {"csm", "whisper"} | ||
| model_type = getattr(config, "model_type", None) | ||
| if model_type in audio_only_types: | ||
| is_vlm = False | ||
| else: | ||
| is_vlm = False | ||
| archs = getattr(config, "architectures", None) | ||
| if archs: | ||
| is_vlm = any( | ||
| x.endswith(("ForConditionalGeneration", "ForVisionText2Text")) | ||
| for x in archs | ||
| ) | ||
| if not is_vlm and hasattr(config, "vision_config"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "img_processor"): | ||
| is_vlm = True | ||
| if not is_vlm and hasattr(config, "image_token_index"): | ||
| is_vlm = True | ||
| if not is_vlm and model_type: | ||
| vlm_types = {"phi3_v","llava","llava_next","llava_onevision", | ||
| "internvl_chat","cogvlm2","minicpmv"} | ||
| if model_type in vlm_types: | ||
| is_vlm = True |
There was a problem hiding this comment.
For better readability and to avoid redundant assignments, this logic can be simplified. You can initialize is_vlm to False and then perform the checks only if the model is not an audio-only type. This avoids the else block and the repeated is_vlm = False.
# Exclude audio-only models that share ForConditionalGeneration suffix
is_vlm = False
model_type = getattr(config, "model_type", None)
audio_only_types = {"csm", "whisper"}
if model_type not in audio_only_types:
archs = getattr(config, "architectures", None)
if archs:
is_vlm = any(
x.endswith(("ForConditionalGeneration", "ForVisionText2Text"))
for x in archs
)
if not is_vlm and hasattr(config, "vision_config"):
is_vlm = True
if not is_vlm and hasattr(config, "img_processor"):
is_vlm = True
if not is_vlm and hasattr(config, "image_token_index"):
is_vlm = True
if not is_vlm and model_type:
vlm_types = {
"phi3_v", "llava", "llava_next", "llava_onevision",
"internvl_chat", "cogvlm2", "minicpmv",
}
if model_type in vlm_types:
is_vlm = True| if architectures: | ||
| if any(x.endswith(_VLM_ARCH_SUFFIXES) for x in architectures): | ||
| return True | ||
|
|
||
| if has_vision_config or has_img_processor or has_image_token_index: | ||
| return True | ||
| if model_type in _VLM_MODEL_TYPES: | ||
| return True | ||
| return False |
There was a problem hiding this comment.
This series of if statements can be made more concise. You can combine the checks into fewer statements for better readability.
| if architectures: | |
| if any(x.endswith(_VLM_ARCH_SUFFIXES) for x in architectures): | |
| return True | |
| if has_vision_config or has_img_processor or has_image_token_index: | |
| return True | |
| if model_type in _VLM_MODEL_TYPES: | |
| return True | |
| return False | |
| if architectures and any(x.endswith(_VLM_ARCH_SUFFIXES) for x in architectures): | |
| return True | |
| return ( | |
| has_vision_config | |
| or has_img_processor | |
| or has_image_token_index | |
| or model_type in _VLM_MODEL_TYPES | |
| ) |
| config_data = _load_model_config_metadata(model_name, hf_token = hf_token) | ||
| if config_data is not None: | ||
| if _is_vlm_config(config_data): | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | ||
| "Model %s detected as VLM from raw config metadata: " | ||
| "model_type=%s architectures=%s", | ||
| model_name, | ||
| config_data.get("model_type"), | ||
| config_data.get("architectures", []), | ||
| ) | ||
| return True | ||
|
|
||
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | ||
| if hasattr(config, "vision_config"): | ||
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | ||
| return True | ||
|
|
||
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | ||
| if hasattr(config, "img_processor"): | ||
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | ||
| return True | ||
|
|
||
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | ||
| if hasattr(config, "image_token_index"): | ||
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | ||
| return True | ||
|
|
||
| # Check 5: Known VLM model_type values that may not match above checks | ||
| if hasattr(config, "model_type"): | ||
| if config.model_type in _VLM_MODEL_TYPES: | ||
| logger.info( | ||
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | ||
| ) | ||
| return True | ||
|
|
||
| return False | ||
| return False |
There was a problem hiding this comment.
This block can be simplified to reduce nesting and improve readability. You can check for config_data being None upfront.
| config_data = _load_model_config_metadata(model_name, hf_token = hf_token) | |
| if config_data is not None: | |
| if _is_vlm_config(config_data): | |
| logger.info( | |
| f"Model {model_name} detected as VLM: architecture {config.architectures}" | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return True | |
| # Check 2: Has vision_config (most VLMs: LLaVA, Gemma-3, Qwen2-VL, etc.) | |
| if hasattr(config, "vision_config"): | |
| logger.info(f"Model {model_name} detected as VLM: has vision_config") | |
| return True | |
| # Check 3: Has img_processor (Phi-3.5 Vision uses this instead of vision_config) | |
| if hasattr(config, "img_processor"): | |
| logger.info(f"Model {model_name} detected as VLM: has img_processor") | |
| return True | |
| # Check 4: Has image_token_index (common in VLMs for image placeholder tokens) | |
| if hasattr(config, "image_token_index"): | |
| logger.info(f"Model {model_name} detected as VLM: has image_token_index") | |
| return True | |
| # Check 5: Known VLM model_type values that may not match above checks | |
| if hasattr(config, "model_type"): | |
| if config.model_type in _VLM_MODEL_TYPES: | |
| logger.info( | |
| f"Model {model_name} detected as VLM: model_type={config.model_type}" | |
| ) | |
| return True | |
| return False | |
| return False | |
| # Subprocess failed or timed out -- fall back to raw config.json | |
| config_data = _load_model_config_metadata(model_name, hf_token = hf_token) | |
| if config_data is None: | |
| return False | |
| is_vlm = _is_vlm_config(config_data) | |
| if is_vlm: | |
| logger.info( | |
| "Model %s detected as VLM from raw config metadata: " | |
| "model_type=%s architectures=%s", | |
| model_name, | |
| config_data.get("model_type"), | |
| config_data.get("architectures", []), | |
| ) | |
| return is_vlm |
| config_data = _load_model_config_metadata(model_name, hf_token = hf_token) | ||
| if config_data is not None: | ||
| return _is_vlm_config(config_data) | ||
| return False |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6bae97f5d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return False | ||
|
|
||
| if architectures: | ||
| if any(x.endswith(_VLM_ARCH_SUFFIXES) for x in architectures): |
There was a problem hiding this comment.
Validate architecture types before suffix matching
The new _is_vlm_config helper assumes every entry in architectures is a string and unconditionally calls x.endswith(...). This function is now used on raw config.json data in the new fallback paths (is_vision_model after subprocess/in-process failures), so malformed or custom configs with non-string architectures entries will raise AttributeError and can bubble out instead of failing closed to False. Please normalize/guard architectures values before suffix checks so the fallback stays robust.
Useful? React with 👍 / 👎.
Adds isinstance(x, str) guard before calling x.endswith() on architectures entries, so malformed config.json with non-string entries (None, dict, etc.) does not raise AttributeError.
…for PR unslothai#5754 Round 7 reviewer surfaced a handful of swap-window races, fail-open guards, and seed precision mismatches. This commit closes them. Lifecycle / state (P1) * core/inference/diffusion.py: status() now emits active_repo_id, active_base_repo, pending_repo_id, pending_base_repo, and pending_gguf_filename alongside the existing UI-facing fields. During a swap (model A loaded, model B loading) the previous coalesced 'repo_id or pending_repo_id' hid the loading target from delete guards. Splitting the fields lets guards block deletion of either repo currently owned by the backend. * core/inference/diffusion.py: generate_image() now takes _generate_lock BEFORE snapshotting _pipe / _device. Snapshotting outside the lock let a concurrent unload/load clear or replace the backend between the snapshot and the forward, so the freed or swapped pipeline would still run. Symmetric handoffs (P1) * routes/export.py: training-active check now runs BEFORE the chat / inference / diffusion unload helpers, so a 409 does not leave the user's chat session torn down for nothing. Also explicitly fails CLOSED with 503 when is_training_active() raises. * routes/inference.py: _raise_if_training_active now fails closed with 503 when the training backend is importable but its status check raises. The previous best-effort log-and-continue could let chat / diffusion loads collide with unverifiable training. Delete guards (P1) * routes/models.py /delete-cached: chat guard now also blocks when llama-server is_active (i.e. mid-download) and when the inference backend's loading_models set contains the target. Round 7 review #7 flagged that the PR's diffusion-side loading guard had no chat-side parallel, so deleting a chat repo while it was downloading could still race the cache. * routes/models.py /delete-cached: diffusion guard iterates the new active_* + pending_* status fields so a delete during a swap is refused on either repo. * routes/models.py /delete-finetuned: same active_+ pending handling, plus the guard now also refuses deletes of a parent directory that contains the loaded pipeline (round 7 review #6: rm -rf /exports/flux-model/ could unlink model_index.json that the live pipeline is reading via mmap). Seed precision (P2) * models/inference.py + routes/inference.py: DiffusionGenerate- Response now carries seed_str alongside the existing numeric seed. Seeds above Number.MAX_SAFE_INTEGER are rounded by JSON.parse in the browser; seed_str ships full decimal precision for display and reproduction. * frontend/api.ts: DiffusionGenerateResponse types seed_str; images-page.tsx prefers seed_str over seed in the figure caption so the displayed value reproduces the image. * frontend/api.ts: stringifyWithBigInt no longer regex-replaces sentinel strings over the full JSON output. It pulls the seed BigInt out, JSON-serialises the remaining payload, and splices the seed's decimal digits into the resulting object literal at the known position. Avoids the round 7 #10 case where a user-supplied prompt equal to '__bigint__:123' was rewritten into a JSON integer and rejected as a non-string prompt. Custom HF repo (P2) * frontend/images-page.tsx: custom panel now exposes a 'Base diffusers repo' input that maps to DiffusionLoadRequest. base_repo. Required when a private / mirrored GGUF needs a non-default base (e.g. a 9B Klein transformer would otherwise fall back to the 4B base default).
… MPS + base namespace for PR unslothai#5754 Round 12 reviewer findings. Backend correctness (P1) * core/inference/diffusion.py load_model: GGUF branch now handles an absolute local directory passed as repo_id by joining Path(repo_id) / gguf_filename directly instead of handing the path to hf_hub_download (which raises HFValidationError because the path is not 'namespace/repo'). Closes round 12 review #1 -- the load request advertised 'local path' support but actually only worked for Hub repo ids. Delete guard precision (P1) * routes/models.py /delete-finetuned + /delete-cached: diffusion guard now consults gguf_filename from status() and ALLOWS per-variant deletes that target a different quant than the one the loaded pipeline is reading. Loading 'Q4_K_S' no longer blocks deleting 'Q8_0' from the same repo / export directory (round 12 reviews #3 and #4). Accelerator (P2) * core/inference/diffusion.py _drain_cuda_cache: also calls torch.mps.empty_cache() when the MPS backend is the active accelerator. Apple Silicon swaps now actually return held VRAM instead of leaving it pinned in the Metal allocator (round 12 review #10). Smart base repo (P2) * core/inference/diffusion.py _smart_base_repo: only inspects the LAST segment of the repo id / path for the 'base' / '9b' tokens. A namespace like baseorg/FLUX.2-klein-4B-GGUF or a parent directory like /home/me/.cache/base/... no longer falsely selects the Base variant (round 12 review #9).
P1: route-layer chat/diffusion/export releases were still asymmetric. Training start and export load called ``diff_backend.unload_model`` inside a best-effort try/except so a wedged diffusion backend let the next workload allocate over the top of the resident pipeline and OOM. Both now use the strict ``_release_diffusion_for`` helper from routes.inference, which raises HTTPException 503 on status/unload failure or post-check mismatch. P2 #9: diffusion load exceptions can include the absolute local repo / base / gguf path verbatim (FileNotFoundError, OSError from diffusers / safetensors). The path flows into ``_last_error``, which ``status()`` returns to every authenticated session. Collapse the known repo_id / effective_base / gguf_filename paths to their leaf name before storing the error, mirroring the ``_display_repo_id`` convention used for the public repo label. P2 #10: when ``repo_id`` is an absolute local path, ``detect_family`` matched _FAMILY_EXCLUDE deny lists against the full path, so models stored under a parent directory containing ``qwen-image-edit`` or ``3.5`` were misclassified as None. Reduce the family-detection needle to the leaf directory when the input looks like a filesystem path; Hub-style ``owner/repo`` ids continue to use the original needle so existing detection rules keep working. P2 #12: ``gguf_filename`` was missing from the ``_reject_embedded_hf_token`` validator. A URL-form quant path like ``https://hf_xxxxx@huggingface.co/.../flux.gguf`` would be stored on ``DiffusionBackend._gguf_filename`` and surface in status() / log lines. Extend the validator to gguf_filename so the token is dropped before it can leak. All 85 diffusion-relevant backend tests pass locally.
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
Twelve P1 findings from round 26 reviewer aggregate, plus the CI revert of round 25 P1 #5 to a less invasive location. 1. requirements/studio.txt + requirements/single-env/constraints.txt: revert the round 25 huggingface-hub bump (broke Studio Update CI, Mac Studio Update CI, Mac Studio UI CI, Studio UI CI all with ResolutionImpossible against transformers==4.57.6 which requires hub<1.0). Standard install path stays on the well-tested 4.57.6 + 0.36.2 + trl 0.23.1 trio. 2. requirements/no-torch-runtime.txt + pyproject.toml [huggingfacenotorch]: bump huggingface_hub floor from >=0.34.0 to >=1.3.0,<2.0 -- this is where the actual transformers 5.x + hub 0.36.2 broken combo can land because the file installs --no-deps. transformers 5.x calls hub.is_offline_mode which only exists in hub 1.x. 3. utils/datasets/llm_assist.py: revert round 25 P1 #4 (helper/advisor sharing the global llama backend) which introduced three regressions: a chat-evict load race after the busy precheck, a finally-block that could unload a user chat model, and an identifier mismatch the delete guard could not canonicalize. Go back to PRIVATE LlamaCppBackend instances and expose the active helper/advisor repos through a new thread-safe registry (helper_advisor_owns_repo / _register_helper_advisor_repo / _unregister_helper_advisor_repo) so DELETE /api/models/delete-cached can still block the rmtree. 4. routes/models.py delete_cached_model: check the new helper/advisor registry up front and 409 if a helper/advisor still owns the target repo. Closes round 26 P1 #13 and #14 (helper/advisor identifiers were prefixed and would never equal the raw repo id). 5. routes/models.py get_lora_base_model: validate lora_path with _validate_logged_identifier before it is reflected in 404 detail and error logs (round 26 P1 #12). 6. routes/inference.py /unload: round 21 P1 #3 added a "or not is_loaded" fallback that let an unload of owner/B cancel a pending llama load of owner/A. Replace it with a narrow llama_is_starting_without_identifier branch that only fires when llama-server is mid-startup with neither identifier set (round 26 P1 #5). 7. routes/inference.py /unload: poll loading_model_identifier for up to 5 s after asyncio.to_thread(unload_model) so a legitimate pending-load cancel does not 503 because the load thread has not yet observed _cancel_event in its finally (round 26 P2 #15). 8. models/training.py TrainingStartRequest: extend identifier hardening to hf_dataset, subset, train_split, eval_split. Round 22 only guarded model_name (round 26 P1 #10). 9. models/data_recipe.py SeedInspectRequest: add _no_control_chars + _reject_embedded_hf_token field_validators on dataset_name (round 26 P1 #11). Tests: 105 targeted (diffusion + cached_gguf + llama_cpp_cache + inference_model_validation + models_get_model_config) and 1768 broader backend tests pass locally. Pre-existing test_desktop_auth.py, test_studio_api.py, and test_training_worker_flash_attn.py failures reproduce on HEAD without these changes.
Four actionable findings from round 30. Skipped P1 #1 / #2 / #3 (huggingface-hub bump in studio.txt / single-env / colab-new) because the live B200 Studio that successfully generated FLUX.2 klein images runs the exact combo the reviewer flags as broken: huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 Flux2KleinPipeline: True (imports cleanly) The is_offline_mode ImportError only fires with transformers 5.x, and the standard install path pins transformers==4.57.6 via constraints. The round 26 fix bumped no-torch-runtime.txt + pyproject huggingfacenotorch where the --no-deps install path can land on transformers 5.x; that remains the correct surface. 1. core/inference/diffusion.py: preflight transformers + accelerate via importlib.util.find_spec BEFORE any destructive GPU-owner unload. Diffusers can expose stub pipeline classes when transformers / accelerate are missing, so the load used to drop chat first and fail later inside from_pretrained. find_spec keeps existing tests that stub these modules passing because no real module is executed (round 30 P1 #11). 2. models/export.py ExportGGUFRequest.quantization_method: extend the embedded HF token validator to this field too. Round 23 added the control-char guard but not the token guard; the value is forwarded into worker command lines and reflected in error / success text (round 30 P1 #5). 3. models/data_recipe.py SeedInspectUploadRequest: add _no_control_chars + _reject_embedded_hf_token field_validators to filename and to each entry of file_names. Mirrors the sibling SeedInspectRequest.dataset_name hardening (round 30 P1 #6). 4. frontend/src/features/images/images-page.tsx: defer the initial refreshStatus() call via queueMicrotask so the synchronous setRefreshingStatus(true) inside it does not trip the react-hooks/set-state-in-effect lint on mount (round 30 P2 #12). Deferred (need larger surgery / out of scope for this round): P1 #4 native_path_lease for diffusion local-path loads P1 #7-#10 helper/advisor + public-start window mutual lock symmetry Tests: 98 targeted (diffusion + cached_gguf + inference_validation) pass locally; frontend npm run typecheck passes.
Addresses remaining round-30 reviewer findings against PR unslothai#5754 (diffusion image generation in Unsloth Studio). The studio.txt / constraints.txt / colab-new hub-bump items (round 30 #1-#3) are intentionally skipped: the live B200 Studio install path with huggingface_hub==0.36.2, transformers==4.57.6 and diffusers==0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation (see staging CI green on bec81b8 plus round 28-30 local validation suites). The is_offline_mode ImportError the reviewer cites only triggers with transformers 5.x against huggingface_hub 0.x; the constraints pin holds transformers at 4.x so the combo never materialises on the standard install path. Concurrency: close the helper / advisor GPU-start race in all four public load paths (round 30 P1 #7-#10). * Add a _PUBLIC_LOAD_PENDING_COUNT counter in utils/datasets/llm_assist.py, published under _HELPER_ADVISOR_START_LOCK by _raise_if_helper_advisor_busy and cleared by a paired _clear_public_load_window in routes/inference.py. A concurrent helper / advisor start now sees public_load_pending() inside _gpu_workload_busy_for_helper and refuses VRAM until the public load attempt finishes, closing the window between the busy snapshot and the public load flipping its public ownership flags (is_loaded, current_checkpoint, is_training_active, etc.). * Wire the paired clear into all five call sites (GGUF chat, safetensors chat, diffusion image load, training start, export load-checkpoint). The chat path tracks the published tag in a local so the finally clears the same counter on either branch or on early HTTPException. Security: gate /api/inference/images/load against arbitrary local-path probes (round 30 P1 #4). Mirror the chat /api/inference/load native_path_lease boundary so an authenticated session cannot use repo_id or base_repo as a directory probe. * Add native_path_lease + base_repo_native_path_lease to DiffusionLoadRequest (optional; Hub ids skip the lease). * Add _looks_like_local_diffusion_path + a _resolve_diffusion_repo_for_request helper that requires a verified directory-typed native path grant for any value that starts with /, ~, ./, ../, contains a backslash, or expands to an absolute path. The detector deliberately avoids Path.exists so the route does not side-channel filesystem layout via differential error messages. Frontend: split the Images page status fetch from the spinner toggle (round 30 P2 #12). The mount effect and the is_loading auto-poll now call a setState-free fetchAndUpdateStatus; the user-driven Refresh button still calls refreshStatus to flip the spinner. Cleaner separation than the queueMicrotask shim from the prior commit; the eslint react-hooks/set-state-in-effect rule is not in the studio-frontend-ci typecheck gate, and the codebase already has hundreds of pre-existing violations of the same rule. 98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor). Frontend typecheck passes.
Addresses remaining round-30 reviewer findings against PR unslothai#5754 (diffusion image generation in Unsloth Studio). The studio.txt / constraints.txt / colab-new hub-bump items (round 30 #1-#3) are intentionally skipped: the live B200 Studio install path with huggingface_hub==0.36.2, transformers==4.57.6 and diffusers==0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation (see staging CI green on bec81b8 plus round 28-30 local validation suites). The is_offline_mode ImportError the reviewer cites only triggers with transformers 5.x against huggingface_hub 0.x; the constraints pin holds transformers at 4.x so the combo never materialises on the standard install path. Concurrency: close the helper / advisor GPU-start race in all four public load paths (round 30 P1 #7-#10). * Add a _PUBLIC_LOAD_PENDING_COUNT counter in utils/datasets/llm_assist.py, published under _HELPER_ADVISOR_START_LOCK by _raise_if_helper_advisor_busy and cleared by a paired _clear_public_load_window in routes/inference.py. A concurrent helper / advisor start now sees public_load_pending() inside _gpu_workload_busy_for_helper and refuses VRAM until the public load attempt finishes, closing the window between the busy snapshot and the public load flipping its public ownership flags (is_loaded, current_checkpoint, is_training_active, etc.). * Wire the paired clear into all five call sites (GGUF chat, safetensors chat, diffusion image load, training start, export load-checkpoint). The chat path tracks the published tag in a local so the finally clears the same counter on either branch or on early HTTPException. Security: gate /api/inference/images/load against arbitrary local-path probes (round 30 P1 #4). Mirror the chat /api/inference/load native_path_lease boundary so an authenticated session cannot use repo_id or base_repo as a directory probe. * Add native_path_lease + base_repo_native_path_lease to DiffusionLoadRequest (optional; Hub ids skip the lease). * Add _looks_like_local_diffusion_path + a _resolve_diffusion_repo_for_request helper that requires a verified directory-typed native path grant for any value that starts with /, ~, ./, ../, contains a backslash, or expands to an absolute path. The detector deliberately avoids Path.exists so the route does not side-channel filesystem layout via differential error messages. Frontend: split the Images page status fetch from the spinner toggle (round 30 P2 #12). The mount effect and the is_loading auto-poll now call a setState-free fetchAndUpdateStatus; the user-driven Refresh button still calls refreshStatus to flip the spinner. Cleaner separation than the queueMicrotask shim from the prior commit; the eslint react-hooks/set-state-in-effect rule is not in the studio-frontend-ci typecheck gate, and the codebase already has hundreds of pre-existing violations of the same rule. 98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor). Frontend typecheck passes.
Two round-33 reviewer findings: hub-floor consistency and the multipart upload filename validator gap. Dependencies: reverted the round-26 huggingface_hub>=1.3.0 floor in no-torch-runtime.txt and pyproject.toml (round 33 P1 #1-#5, 4/12 vote consensus). studio.txt forces huggingface_hub==0.36.2 to match the transformers==4.57.6 pin in extras-no-deps.txt, so the 1.3.0 floor was internally inconsistent. Reviewers reproduced the resolver conflict on a fresh install. Empirical justification (re-verified on the live B200 host before the revert): huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation. transformers 4.57.6 carries its own transformers.utils.hub.is_offline_mode and does not actually need huggingface_hub.is_offline_mode at import time. The original bump was guarding against the (never-realised) transformers 5.x path, which extras-no-deps explicitly pins away. Validation: multipart /seed/upload-unstructured-file now applies the same _no_control_chars and _reject_embedded_hf_token checks to file.filename that SeedInspectUploadRequest.filename already applies in the JSON variant (round 33 P1 #7). The filename is reflected back to the client, persisted in the per-file meta JSON, and echoed by error responses, so the JSON-side hardening must not be asymmetric with the multipart path. Skipped (consistent with prior rounds): * Find_spec vs full import (R33 P1 #6): preserves test compatibility with the huggingface_hub stub fixture. * React hooks set-state-in-effect lint (R33 P1 #8): codebase has 146 pre-existing violations of the same rule; studio-frontend-ci does not gate on lint. * Direct DiffusionBackend.load_model bypass (R33 P1 #9): the route is the only production entry point, and the backend helper now publishes its own diffusion-backend pending tag (round 32 P1 #3). Direct-caller hardening would require duplicating the lease check into load_model itself, which is out of scope for the route-layer security boundary. * One-segment Hub IDs (R33 P2 #10): strict 2-segment Hub id check is intentional; one-segment names are not valid Hub ids. * Cwd-relative shadow of Hub IDs (R33 P2 #11): documented side-channel tradeoff accepted in round 31 commit msg. 97 targeted backend tests pass.
Staging mirror of unslothai#4868
Original PR: unslothai#4868
Author: Datta0
This is a staging copy for review and editing. Once finalized, changes will be pushed back to the original PR.
Original description
Fixes: unslothai#4859
This PR contains code changes only (1 files). Test changes are in a separate PR.
Changed files:
studio/backend/utils/models/model_config.py