agent: Improve test discovery for refactored callsites with different error semantics #1817

@nathanjmcdougall

What

When the agent refactored _file/pyproject_toml/deps.py to use validated_get, which returns a default for both missing keys and invalid types, it changed the error semantics: the original code raised PyprojectTOMLDepsError on invalid types and returned a default only on missing keys. The agent ran only the module's own tests (tests/usethis/_file/) and the wrapper tests, so it missed the 21 failures in the tests of downstream callers (tests/usethis/test_deps.py, tests/usethis/test_init.py).
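The difference in error semantics can be illustrated with a minimal sketch. The two helper functions below are hypothetical stand-ins written for this example; they are not the real validate_or_raise or validated_get from usethis, whose signatures are not shown in this issue. The exception class is likewise a local stand-in that only shares its name with the project's error type.

```python
from typing import Any


class PyprojectTOMLDepsError(Exception):
    """Local stand-in for the project's error type (assumption for this sketch)."""


_MISSING = object()  # sentinel so that a stored None is not mistaken for "missing"


def get_raising_on_invalid(data: dict[str, Any], key: str, expected: type, default: Any) -> Any:
    """Hypothetical 'original' style: default on missing key, raise on invalid type."""
    value = data.get(key, _MISSING)
    if value is _MISSING:
        return default
    if not isinstance(value, expected):
        msg = f"{key!r} must be {expected.__name__}, got {type(value).__name__}"
        raise PyprojectTOMLDepsError(msg)
    return value


def get_default_on_invalid(data: dict[str, Any], key: str, expected: type, default: Any) -> Any:
    """Hypothetical 'collapsed' style: default on missing key AND on invalid type."""
    value = data.get(key, _MISSING)
    if value is _MISSING or not isinstance(value, expected):
        return default
    return value


# Both styles agree when the key is missing or the value is valid...
missing: dict[str, Any] = {}
valid = {"dependencies": ["requests"]}
# ...and only differ when the value has the wrong type.
invalid = {"dependencies": "requests"}
```

With these definitions, tests that only cover the missing-key and valid-value cases pass under both styles. Only a test that feeds the invalid input and expects PyprojectTOMLDepsError can tell them apart, and such a test may live in a caller's test file rather than the module's own.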

Where

  • src/usethis/_file/pyproject_toml/deps.py: get_project_deps() and get_dep_groups() were converted from validate_or_raise (raises on invalid type) to validated_get (returns default on invalid type), silently changing behaviour.
  • tests/usethis/test_deps.py: 21 tests that exercise the error-raising path through _deps.py into _file/pyproject_toml/deps.py.
  • tests/usethis/test_init.py: additional tests that depend on correct backend dispatch through these paths.

Why it matters

The agent validated changes by running only the directly-related test files, not the transitive callers. When a refactoring changes error semantics (e.g. swallowing errors that were previously raised), the breakage manifests in downstream tests, not in the module's own tests.

Lesson

When replacing a function that has different behaviour for missing-key vs invalid-type (return default vs raise error) with a wrapper that treats both the same way (validated_get), the agent must:

  1. Identify all transitive callers of the changed function (e.g. _deps.py calls _file/pyproject_toml/deps.py).
  2. Run the tests for those transitive callers, not just the direct module tests.
  3. Specifically check whether the original code distinguished between 'missing' and 'invalid' error paths before collapsing them into a single default-returning wrapper.
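Steps 1 and 2 above could be approximated mechanically by walking the import graph in reverse. The following is a rough sketch under stated assumptions, not how usethis-python-test-affected-find works: the module names (pkg.low, pkg.mid, tests.test_mid and so on) are toy placeholders, sources are passed in as strings rather than read from disk, and only absolute static imports are followed, so relative or dynamic imports would be missed.

```python
import ast
from collections import defaultdict, deque


def imported_modules(source: str) -> set[str]:
    """Collect candidate module names from absolute import statements."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
            # 'from pkg import mid' may name a submodule, so record that too.
            found.update(f"{node.module}.{alias.name}" for alias in node.names)
    return found


def build_reverse_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each known module to the set of known modules that import it."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for importer, source in sources.items():
        for imported in imported_modules(source):
            if imported in sources:
                reverse[imported].add(importer)
    return reverse


def transitive_importers(target: str, reverse: dict[str, set[str]]) -> set[str]:
    """Breadth-first walk over the reverse graph, starting from the changed module."""
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for importer in reverse.get(current, ()):
            if importer not in seen:
                seen.add(importer)
                queue.append(importer)
    return seen


# Toy layout: pkg.mid wraps pkg.low, and each has its own test module.
sources = {
    "pkg.low": "def get_deps():\n    return []\n",
    "pkg.mid": "from pkg.low import get_deps\n",
    "tests.test_low": "import pkg.low\n",
    "tests.test_mid": "from pkg import mid\n",
    "tests.test_other": "import json\n",
}
affected = transitive_importers("pkg.low", build_reverse_graph(sources))
affected_tests = sorted(name for name in affected if name.startswith("tests."))
```

In this toy layout, affected_tests includes tests.test_mid even though that module never imports pkg.low directly, which is the kind of downstream test file the agent skipped. Import-based discovery over-approximates and cannot decide step 3; whether the original code distinguished 'missing' from 'invalid' still has to be checked by reading it.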

Suggested improvement

Improve the usethis-python-test-affected-find skill to emphasize how to discover all potentially affected tests whenever a function's error semantics change, rather than relying on the agent's manual judgement about which test files are relevant.
