refactor: remove ModelProviderRegistry.default and require ModelConfig.provider#625
Conversation
|
MkDocs preview: https://95f6d2d7.dd-docs-preview.pages.dev Fern preview: https://nvidia-preview-pr-625.docs.buildwithfern.com/nemo/datadesigner
|
Review: PR #625 —
|
Greptile SummaryThis PR completes the removal of the deprecated
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/model_provider.py | Removes ModelProviderRegistry.default, all related validators, and get_default_provider_name(); get_provider now requires a concrete name; resolve_model_provider_registry drops default_provider_name parameter; clean removal with correct pydantic extra-ignore docstring added. |
| packages/data-designer-config/src/data_designer/config/models.py | Makes ModelConfig.provider a required str; removes _warn_on_implicit_provider validator and the deprecation-warning import; docstring updated to reflect required status. |
| packages/data-designer/src/data_designer/cli/repositories/model_repository.py | Adds LegacyModelConfigMigrationError and a pre-validation raw-dict scan (_aliases_missing_provider) that raises a descriptive error when on-disk entries omit provider; file-load failure path correctly split from validation failure path. |
| packages/data-designer/src/data_designer/interface/data_designer.py | Removes YAML-default consulted path; resolve_model_provider_registry called without default_provider_name; all warnings.catch_warnings suppression blocks removed; simplified provider init to a single line. |
| packages/data-designer/src/data_designer/cli/utils/agent_introspection.py | Removes default_provider, configured_provider, and effective_provider fields from introspection state; now reports only provider per item; adds LegacyModelConfigMigrationError handling in _load_registry to surface migration errors as a structured AgentIntrospectionError. |
| packages/data-designer/src/data_designer/cli/forms/model_builder.py | Replaces silent provider=None fallback with two distinct ValueError raises — one for no providers configured, one for multiple providers without a selection — correctly addressing the previously reported misleading error message. |
| packages/data-designer/src/data_designer/cli/services/provider_service.py | Removes set_default, get_default, and the registry.default tracking in update/delete; remaining logic is clean and correct. |
| packages/data-designer/src/data_designer/cli/repositories/provider_repository.py | Drops default field from CLI-layer ModelProviderRegistry; removes deprecation-warning emission; relies on pydantic v2 extra=ignore to silently drop legacy default: key, with that invariant documented and tested. |
| packages/data-designer/src/data_designer/cli/utils/agent_text_formatter.py | Removes default_provider: header line from formatted model-alias output; _format_table simplified by removing column_labels parameter now that provider is the direct column key. |
| packages/data-designer/tests/interface/test_data_designer.py | Removes ~175 lines of deprecation-path tests; replaces with a single clean YAML-fallback test; all deprecated-provider construction paths correctly removed. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph Before["Before (deprecated path)"]
A1["ModelConfig(alias=..., model=...)"] -->|"provider=None"| B1["ModelProviderRegistry.get_provider(None)"]
B1 -->|"fallback to registry.default or providers[0]"| C1["ModelProvider resolved"]
end
subgraph After["After (this PR)"]
A2["ModelConfig(alias=..., model=..., provider='openai')"] -->|"provider required"| B2["ModelProviderRegistry.get_provider('openai')"]
B2 -->|"exact name lookup"| C2["ModelProvider resolved"]
B2 -->|"name not found"| E2["UnknownProviderError"]
end
subgraph CLI["CLI model_repository.load()"]
F["load_config_file()"] --> G{{"provider field present & non-null?"}}
G -->|"No"| H["LegacyModelConfigMigrationError (descriptive message)"]
G -->|"Yes"| I["ModelConfigRegistry.model_validate()"]
I -->|"OK"| J["Registry returned"]
I -->|"Other error"| K["return None (silent)"]
end
subgraph Compat["Legacy YAML Compatibility"]
L["model_providers.yaml with 'default: foo'"] -->|"pydantic extra=ignore"| M["ModelProviderRegistry loaded, default key dropped"]
end
Reviews (6): Last reviewed commit: "fix: block model config writes when lega..." | Re-trigger Greptile
| else: | ||
| provider = None | ||
| raise ValueError( | ||
| "Cannot build ModelConfig: no provider specified in form data and no " | ||
| "available providers configured. Add a provider first via " | ||
| "`data-designer config providers`." | ||
| ) |
There was a problem hiding this comment.
The error message incorrectly states "no available providers configured" in a branch that also fires when multiple providers ARE configured but none was included in
form_data. When len(self.available_providers) > 1 and no "provider" key is present, the user sees a misleading diagnostic telling them to add a provider when providers already exist.
| else: | |
| provider = None | |
| raise ValueError( | |
| "Cannot build ModelConfig: no provider specified in form data and no " | |
| "available providers configured. Add a provider first via " | |
| "`data-designer config providers`." | |
| ) | |
| else: | |
| if self.available_providers: | |
| raise ValueError( | |
| "Cannot build ModelConfig: multiple providers are configured but none " | |
| "was specified in form data. Specify provider= explicitly on each ModelConfig." | |
| ) | |
| raise ValueError( | |
| "Cannot build ModelConfig: no provider specified in form data and no " | |
| "available providers configured. Add a provider first via " | |
| "`data-designer config providers`." | |
| ) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/data-designer/src/data_designer/cli/forms/model_builder.py
Line: 294-299
Comment:
The error message incorrectly states "no available providers configured" in a branch that also fires when multiple providers ARE configured but none was included in `form_data`. When `len(self.available_providers) > 1` and no `"provider"` key is present, the user sees a misleading diagnostic telling them to add a provider when providers already exist.
```suggestion
else:
if self.available_providers:
raise ValueError(
"Cannot build ModelConfig: multiple providers are configured but none "
"was specified in form data. Specify provider= explicitly on each ModelConfig."
)
raise ValueError(
"Cannot build ModelConfig: no provider specified in form data and no "
"available providers configured. Add a provider first via "
"`data-designer config providers`."
)
```
How can I resolve this? If you propose a fix, please make it concise.| model: str | ||
| inference_parameters: InferenceParamsT = Field(default_factory=ChatCompletionInferenceParams) | ||
| provider: str | None = None | ||
| provider: str |
There was a problem hiding this comment.
Claude and Codex both caught a migration edge here: once provider is required, an existing model_configs.yaml with one provider-less entry makes ModelRepository.load() return None, so CLI and agent flows can treat the whole model file as missing. Could we add a migration/error path in ModelRepository.load() so users see "add provider to alias X" instead of silently losing the registry?
|
|
Looks like the release gate in the PR body is still active: latest GitHub release is |
andreatgretel
left a comment
There was a problem hiding this comment.
Requesting changes based on the migration issue and stale docs noted above. The release gate in the PR body also still looks active.
f97431c to
eab4026
Compare
…g.provider (#590) Final cleanup after the #589 deprecation cycle. Providers are now a named list referenced by name end-to-end — no hidden registry default, no YAML default leaking into DataDesigner construction (#588), and no ambiguity about which provider wins. ModelConfig.provider becomes a required str. ModelProviderRegistry.default and the associated validators, get_default_provider_name() helper, CLI "Change default provider" workflow, and the "Default" column in `data-designer config list` are removed. resolve_model_provider_registry() drops its default_provider_name kwarg, and DataDesigner.__init__ no longer threads a YAML default through registry construction. Pydantic v2's extra="ignore" default lets existing on-disk YAMLs with default: keys load cleanly, so users on the deprecation cycle migrate without breakage.
eab4026 to
09eeae7
Compare
|
Thanks for the careful review. I pushed an updated branch and believe the comments are now addressed:
The branch is rebased onto current |
|
hey, sorry for the slow re-review here, I was out on PTO and this slipped. One migration edge still looks open: with an older |
Raise LegacyModelConfigMigrationError from ModelRepository.load() when on-disk model_configs.yaml entries lack a required provider field, so CLI and agent flows fail fast instead of treating the file as empty and overwriting legacy aliases on add.
|
Thanks for catching that — pushed a fix in 5615847.
Call sites that load the registry surface it directly:
Coverage added in |
Summary
Final cleanup after the #589 deprecation cycle. Providers are now a named list referenced explicitly by each
ModelConfig: no hidden registry default, no YAMLdefault:flowing intoDataDesigner.__init__, and no ambiguity about which provider wins.The original release gate is now satisfied:
v0.6.0shipped on 2026-05-13 and includes #594, so users have had a release containing the deprecation warnings before this removal lands.Related Issue
Closes #590. Depends on #588 (merged as #591) and #589 (merged as #594, released in
v0.6.0).Changes
Removed
ModelProviderRegistry.default, default-provider validators, andget_default_provider_name()from the engine registry.get_default_provider_name()helper.set_default/get_defaultservice methods, default-tracking in update/delete, the "Change default provider" menu option, and the "Default" column fromdata-designer config list.default_providerand per-itemconfigured_provider/effective_providerfields from agent introspection output in favor of a single explicit per-itemproviderfield.Changed
ModelConfig.providerrequired.ModelProviderRegistry.get_provider(name)to require a concrete provider name.resolve_model_provider_registry(providers)to dropdefault_provider_name.DataDesigner.__init__so it resolves providers without consulting YAML defaults.provideras required, and call out the upgrade impact for stalemodel_configs.yamlentries and agent context consumers.Compatibility
model_providers.yamlfiles with a legacydefault:key still load cleanly because bothModelProviderRegistryclasses inherit directly frompydantic.BaseModel, preserving pydantic v2's defaultextra="ignore"behavior. This invariant is now documented on the classes and covered by tests.model_configs.yamlentries that omitprovideror set it tonullmust be updated with an explicit provider name.Attention Areas
Reviewers should pay special attention to:
packages/data-designer-config/src/data_designer/config/models.py: public API change,ModelConfig.provideris now required.packages/data-designer-engine/src/data_designer/engine/model_provider.py: registry-level default routing is removed.packages/data-designer/src/data_designer/cli/utils/agent_introspection.py: agent context schema now reports onlyproviderper model alias item.Testing
PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer/tests/cli/forms/test_model_builder.py packages/data-designer-engine/tests/engine/test_model_provider.py packages/data-designer/tests/cli/repositories/test_provider_repository.py packages/data-designer/tests/interface/test_data_designer.py packages/data-designer-config/tests/config/test_models.py packages/data-designer-config/tests/config/test_default_model_settings.py -q(162 passed before the final docs-wording amend)PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev ruff check ...PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev ruff format --check ...fern check: not run locally because thefernCLI is not installed in this environmentChecklist