Skip to content

[ty] Initial implementation of declaration and definition providers.#19371

Merged
UnboundVariable merged 10 commits intoastral-sh:mainfrom
UnboundVariable:goto_definition
Jul 16, 2025
Merged

[ty] Initial implementation of declaration and definition providers.#19371
UnboundVariable merged 10 commits intoastral-sh:mainfrom
UnboundVariable:goto_definition

Conversation

@UnboundVariable
Copy link
Collaborator

@UnboundVariable UnboundVariable commented Jul 15, 2025

This PR implements "go to definition" and "go to declaration" functionality for name nodes only. Future PRs will add support for attributes, module names in import statements, keyword argument names, etc.

This PR:

  • Registers a declaration and definition request handler for the language server.
  • Splits out the goto_type_definition into its own module. The goto module contains functionality that is common to goto_type_definition, goto_declaration and goto_definition.
  • Roughs in a new module stub_mapping that is not yet implemented. It will be responsible for mapping a definition in a stub file to its corresponding definition(s) in an implementation (source) file.
  • Adds a new IDE support function definitions_for_name that collects all of the definitions associated with a name and resolves any imports (recursively) to find the original definitions associated with that name.
  • Adds a new VisibleAncestorsIter stuct that iterates up the scope hierarchy but skips scopes that are not visible to starting scope.

UnboundVariable added 2 commits July 15, 2025 11:40
…finition

* 'main' of https://github.com/astral-sh/ruff: (39 commits)
  [ty] Sync vendored typeshed stubs (astral-sh#19368)
  Fix typeshed-sync workflow (astral-sh#19367)
  Rework typeshed-sync workflow to also add docstrings for Windows- and MacOS-specific APIs (astral-sh#19360)
  [ty] Allow `-qq` for silent output mode (astral-sh#19366)
  [ty] Allow `-q` short alias for `--quiet` (astral-sh#19364)
  Add shellcheck to pre-commit (astral-sh#19361)
  distinguish references from definitions in `infer_nonlocal`
  [`pycodestyle`] Handle brace escapes for t-strings in logical lines (astral-sh#19358)
  [ty] Combine CallArguments and CallArgumentTypes (astral-sh#19337)
  Move Pylint rendering to `ruff_db` (astral-sh#19340)
  [`pylint`] Extend invalid string character rules to include t-strings (astral-sh#19355)
  Make TC010 docs example more realistic (astral-sh#19356)
  Move RDJSON rendering to `ruff_db` (astral-sh#19293)
  [`flake8-use-pathlib`] Skip single dots for `invalid-pathlib-with-suffix` (`PTH210`) on versions >= 3.14 (astral-sh#19331)
  [`ruff`] Allow `strict` kwarg when checking for `starmap-zip` (`RUF058`) in Python 3.14+ (astral-sh#19333)
  [ty] Reduce false positives for `TypedDict` types (astral-sh#19354)
  [ty] Remove `ConnectionInitializer` (astral-sh#19353)
  [ty] Use `Type::string_literal()` more (astral-sh#19352)
  [ty] Add ecosystem-report workflow (astral-sh#19349)
  [ty] Make use of salsa `Lookup` when interning values (astral-sh#19347)
  ...

# Conflicts:
#	crates/ty_ide/src/goto.rs
#	crates/ty_server/src/server.rs
@github-actions
Copy link
Contributor

github-actions bot commented Jul 15, 2025

mypy_primer results

No ecosystem changes detected ✅
No memory usage changes detected ✅

@codspeed-hq
Copy link

codspeed-hq bot commented Jul 15, 2025

CodSpeed Instrumentation Performance Report

Merging #19371 will not alter performance

Comparing UnboundVariable:goto_definition (c258942) with main (cbe94b0)

Summary

✅ 41 untouched benchmarks

@AlexWaygood
Copy link
Member

Merging #19371 will degrade performances by 5.26%

I think that's just a race condition due to #19328 being merged at exactly the wrong time -- if you rebase this PR, it should go away

@AlexWaygood AlexWaygood added server Related to the LSP server ty Multi-file analysis & type inference labels Jul 15, 2025
@UnboundVariable
Copy link
Collaborator Author

To potential reviewers... Most of the changes in this PR are straightforward and probably not worth your time reviewing in any great detail. Here are the specific areas where I would most appreciate review and feedback:

  1. In semantic_index.rs, I added a VisibleAncestorsIter struct to walk up the scope hierarchy but skip the scopes that are not visible to the starting scope. I would have expected to see functionality like this already, but I couldn't find it. Let me know if I should be using something else here.
  2. I added a function definitions_for_name in ide_support.rs. Does this look like the right abstraction and separation of concern? I tried a couple of different options, but some of my other attempts required exposing way more public structs and methods from ty_python_semantic to the ty_ide crate.
  3. I added a new module resolve_definition that is used by definitions_for_name. It contains code that "resolves" import declarations -- following them recursively to locate the original source declaration. Does this look like the right approach?

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

At a high level, my main question mark is the extent to which this adds a new and totally separate reimplementation of logic to model Python semantics which already exists in a different form in our type inference. In this PR, that's mostly definitions_for_name, which sort-of-duplicates TypeInferenceBuilder::infer_name_load. But as we flesh out e.g. attribute access, I suspect there will be even more of this duplication.

On the one hand, this duplication will likely (and already does, in this PR) lead to some subtleties being modeled correctly in one case and not in another case, leading to inconsistencies between the goto feature and type inference.

On the other hand, that may not be a big deal, and this may still be the best tradeoff. The LSP goto features can ignore some complexities (narrowing constraints) entirely, and (at least currently) make some other major simplifications (always considering all reachable bindings and declarations in every considered scope, even if some could not reach the selected name expression). And the clearest implementation alternative (packaging up the full set of discovered Definitions with every Place load in type inference, so that we can reuse the type-inference code) could have significant performance implications for type inference, even when goto features aren't being used.

Regarding your 3 reviewer questions, I think I answered (1) in an inline comment, (2) is discussed both inline (regarding specific limitations of the new function) and above (regarding the higher-level question of whether we re-implement or re-use the type-inference logic), and re (3) I think the new ResolvedDefinition machinery looks reasonable.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank. I don't have much to add other than I agree with @carljm's summary: It be curious to hear your thoughts on duplicating some of the semantic behavior for the LSP vs trying to enrhich ty_python_semantic itself to avoid the code duplication.

@carljm
Copy link
Contributor

carljm commented Jul 16, 2025

@UnboundVariable and I discussed this today in person and felt that it makes sense to continue with the current approach, even though it involves some of what looks like duplication, because in fact the duplication is "shallower" than it appears; there are a lot of behavior differences between type inference and collecting definitions for goto-definition. (One upcoming example with attributes is that goto-definition actually shouldn't run the descriptor protocol, which is a big part of our type-inference attribute-access code.) Having to integrate all of these differences into the type-inference code as optional flags will increase complexity and slow down implementing the LSP features, as well as making it harder to make future changes to the behavior. In other words, we feel this is a case where it makes sense to accept some duplication in exchange for greater flexibility.

@UnboundVariable UnboundVariable merged commit fae0b5c into astral-sh:main Jul 16, 2025
37 checks passed
@UnboundVariable UnboundVariable deleted the goto_definition branch July 16, 2025 22:07
break;
}

// If marked as nonlocal, skip current scope and search in ancestor scopes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to skip current scope here? I realize that there's a broader TODO above about respecting nonlocal/global writes from arbitrary other scopes (which I think is doable with some non-trivial changes to semantic indexing), but it seems like we could easily (and should) include definitions from the same scope in which it is declared nonlocal? (Same question applies re global handling above)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With language server functionality like "go to declaration", it's not clear what behavior is "correct". It's somewhat subjective, and we will undoubtedly need to iterate on all of the language server features in response to user feedback. As you and I discussed in person, my recommended approach is to use Microsoft's language server (pylance) as a starting point to inform the initial behavior. Pylance has already gone through several years of iteration and user feedback, so it's probably a good starting point for ty.

Another general principle with "go to declaration" and "go to definition" is that it's probably not a good idea to return different results for different usages of a symbol. In other words, if you "go to definition" from any use of a symbol, it should display the same list of definitions.

Based on these principles, the "correct" solution here is to address the TODO that I left in the code. Once we do that, we will want to skip the current scope here. In other words, I wrote the code assuming that we will eventually fix the TODO. If you have strong opinions about how this should work in the interim — or prefer that I prioritize fixing the TODO now, let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. Not sure how to weight "prefer always returning the same set of definitions" vs "don't fail to return a local definition that is highly likely to be relevant" in the short term, but I agree the right answer is to fix the TODO and consider all definitions. I think the main thing blocking this is to adjust semantic indexing so it does a "breadth-first" rather than "depth-first" walk (defers walking function bodies), so that we can be confident of the target scope for a nonlocal declaration (because we know we've seen all possible bindings in enclosing scopes). But we have plenty to do, so I'm not going to suggest this definitely should be the next priority. It may also make sense for someone more familiar with the existing semantic indexing to tackle that.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great.

My only concern with duplicating the logic is that it will be very easy to only make the change in one implementation and not in the other. Is there a way that we could run the new logic as part of our mdtests so that at least a test is failing if I only update one implementation but not the other (it would also give us a lot of test coverage to start from and track the remaining TODOs)

UnboundVariable added a commit that referenced this pull request Jul 18, 2025
…keyword arguments (#19417)

This PR builds upon #19371. It addresses a few additional code review
suggestions and adds support for attribute accesses (expressions of the
form `x.y`) and keyword arguments within call expressions.

---------

Co-authored-by: UnboundVariable <unbound@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

server Related to the LSP server ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants