[registry] Add a strict mode to model registration#14933
fzyzcjy merged 3 commits into sgl-project:main
| @@ -19,8 +19,10 @@ class _ModelRegistry: | |||
| # Keyed by model_arch | |||
| models: Dict[str, Union[Type[nn.Module], str]] = field(default_factory=dict) | |||
| def register(self, package_name: str, overwrite: bool = False): | |||
| new_models = import_model_classes(package_name) | |||
| def register( | |||
optional nit: shall we enable strict mode via an env variable added in environ.py?
Oh, I actually don't know what sglang folks prefer :)
To be honest, I don't know either :) Personally I think an env var may be a little better, because then people can use strict mode without modifying the register code.
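For reference, a minimal sketch of what the env-var route could look like. The variable name `SGLANG_STRICT_MODEL_REGISTRATION` and the helper `env_flag` are hypothetical, made up for illustration; this does not use whatever helpers environ.py actually provides.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        return default
    # Treat a few common spellings as "true"; everything else is "false".
    return value.strip().lower() in ("1", "true", "yes", "on")


# Hypothetical variable name, not an existing sglang setting.
STRICT_ENV_VAR = "SGLANG_STRICT_MODEL_REGISTRATION"

os.environ.pop(STRICT_ENV_VAR, None)
strict_when_unset = env_flag(STRICT_ENV_VAR)

os.environ[STRICT_ENV_VAR] = "true"
strict_when_set = env_flag(STRICT_ENV_VAR)
```

The resulting value could then be passed as `strict=...` to `register`, so the default stays non-strict unless a user opts in.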
| @@ -100,6 +102,8 @@ def import_model_classes(package_name: str): | |||
| try: | |||
| module = importlib.import_module(name) | |||
| except Exception as e: | |||
| if strict: | |||
| raise e | |||
tiny optional nit
| raise e | |
| raise |
(iirc raise e will lose trace)
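A quick way to check this nit, as a sketch with stand-in functions (none of these names come from the PR). As far as I can tell, on Python 3 the original failing frame survives in both forms; `raise e` may just add an extra entry for the re-raise line, which is why the bare `raise` is usually considered the tidier choice.

```python
import traceback


def failing_import():
    # Stand-in for importlib.import_module failing on a model module.
    raise ImportError("missing dependency")


def reraise_bare():
    try:
        failing_import()
    except Exception:
        raise  # re-raise the active exception unchanged


def reraise_named():
    try:
        failing_import()
    except Exception as e:
        raise e  # re-raise through the bound name


def frames_of(fn):
    """Return the function names recorded in the traceback raised by fn."""
    try:
        fn()
    except ImportError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []


bare_frames = frames_of(reraise_bare)
named_frames = frames_of(reraise_named)
print(bare_frames)
print(named_frames)
```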
| def register( | ||
| self, package_name: str, overwrite: bool = False, strict: bool = False | ||
| ): | ||
| new_models = import_model_classes(package_name, strict) |
optional tiny nit
| new_models = import_model_classes(package_name, strict) | |
| new_models = import_model_classes(package_name, strict=strict) |
btw ping me on Slack if I do not reply on GitHub - I get too many messages everywhere and can miss things :(
Motivation
By default, model registration only logs a warning when importing a model module fails. This is sometimes inconvenient: the downstream error can be obscure, and the original import warning gets buried in a wall of logs. This PR adds a strict mode that fails fast by re-raising the import error. The default behavior is unchanged.
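The difference between the two modes can be sketched as follows. This is a simplified illustration, not the code in the PR: `import_modules` is a hypothetical helper that takes an explicit list of module names instead of walking a package, and the missing module name is made up.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def import_modules(names, strict=False):
    """Import each module by name; warn on failure, or fail fast if strict."""
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except Exception:
            if strict:
                # Strict mode: surface the import error immediately.
                raise
            # Default mode: log a warning and keep going.
            logger.warning("Ignoring import error when loading %s", name, exc_info=True)
    return loaded


# "hypothetical_missing_model_xyz" is a made-up name that should not exist.
names = ["json", "hypothetical_missing_model_xyz"]

lenient = import_modules(names)  # warns, still returns the modules that loaded

try:
    import_modules(names, strict=True)
    strict_raised = False
except ImportError:
    strict_raised = True
```

In the default mode the failure only shows up as a log line, so a later "model not found" style error is the first visible symptom; in strict mode the import error itself stops the run.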
Modifications