Skip to content

[ty] Split ScopedPlaceId into ScopedSymbolId and ScopedMemberId#19497

Merged
MichaReiser merged 17 commits intomainfrom
micha/place-refactor
Jul 25, 2025
Merged

[ty] Split ScopedPlaceId into ScopedSymbolId and ScopedMemberId#19497
MichaReiser merged 17 commits intomainfrom
micha/place-refactor

Conversation

@MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jul 22, 2025

Summary

This PR changes the place structs in the semantic index module to enums over a symbol or a member:

  • ScopedPlaceId: Is now an enum over ScopedSymbolId and ScopedMemberId
  • PlaceExpr: Is now an enum over Symbol and Member
  • PlaceExprRef: Is now an enum over &Symbol and &Member
  • PlaceTable: now separately tracks symbols and members

This idea was initially mentioned in #19470. The hope was that splitting Symbol and Member would help to reduce memory usage because Symbol is only 32 bytes (compared to PlaceExpr, which is 40 bytes).

However, the memory usage is roughly unchanged after running ty on a large repository (the saving must be less than 100MB if there's even any). Splitting symbol and members also introduces some extra memory overhead because all IndexVec over ScopedPlaceId now needs to be two separate vectors.

Performance also shows to be mostly neutral.

The main benefit I now see in this refactor is that it clarifies some usages of places. Before, it was often unclear whether some code iterates over symbols or members or both. In fact, I spent the majority of my time during this refactor on trying to understand which of the two it is. The split also helped me better understand which flags are only supported on flags/members and which flags are supported by both.

Now, whether this distinction makes sense long term heavily depends on whether we expect that many places that currently are only over symbols will need to handle members the same way in the future.

Overall: I don't have a strong opinion on this. I do think it helps clarify some code but it does come at a cost.

Code increase

This PR adds a fair amount of new code. However, most of it is just the boilerplate from having separate Symbol/Member, SymbolTable/MemberTable/PlaceTable and SymbolTableBuilder/MemberTableBuilder/PlaceTableBuilder. Most of that code is fairly simple (many getters)

@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Jul 22, 2025
@MichaReiser MichaReiser force-pushed the micha/place-refactor branch 2 times, most recently from 4d1afec to 3616804 Compare July 22, 2025 21:23
@MichaReiser MichaReiser force-pushed the micha/place-refactor branch from a610975 to 2ed78c7 Compare July 24, 2025 05:44
@github-actions
Copy link
Contributor

github-actions bot commented Jul 24, 2025

mypy_primer results

No ecosystem changes detected ✅
No memory usage changes detected ✅

@MichaReiser

This comment was marked as resolved.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 24, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Copy link
Member Author

@MichaReiser MichaReiser Jul 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write some more docs if we decide to go forward with this PR. The current docs are carried over from PlaceExpr


/// Reference to a node that introduces a new scope.
#[derive(Copy, Clone, Debug)]
pub(crate) enum NodeWithScopeRef<'a> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved everything Scope to scope.rs. That code doesn't need to belong into place.

@MichaReiser MichaReiser marked this pull request as ready for review July 24, 2025 09:55
@AlexWaygood AlexWaygood removed their request for review July 24, 2025 10:01
Comment on lines +144 to +145
symbol_name: Name,
segments: SmallVec<[MemberSegment; 1]>,
Copy link
Member Author

@MichaReiser MichaReiser Jul 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can optimize this further by:

  • Storing a single Name which is the entire path of the member
  • In the SmallVec: Store the kind of each segment and where it starts as a u32.

This doesn't change the size of Member, but it reduces the size of the stored segments because it doesn't require a length and capacity for each segment.

This should be sufficient because all we need is a unique key to hash, knowing if it has the form a.b, and a Display implementation

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a strong improvement in clarity; if it's neutral in performance, I'm +1 to go ahead with it. Thank you!

Comment on lines +2127 to +2244
if self.is_method_of_class().is_some() {
if let PlaceExpr::Member(member) = &mut place_expr {
if member.is_instance_attribute_candidate() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it might be useful to reduce the nesting here by chaining the calls

Copy link
Member Author

@MichaReiser MichaReiser Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me how I would do this without let chains

@dhruvmanila dhruvmanila added the internal An internal refactor or improvement label Jul 25, 2025
@MichaReiser MichaReiser force-pushed the micha/place-refactor branch from cb7bc75 to a56bb11 Compare July 25, 2025 08:37
Copy link
Contributor

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

A minor naming quibble: "member" refers to something like self.x (which seems consistent with how we use that term elsewhere, as an attribute on something), but also to a subscript expression like xs[0]. But no need to change anything.

@MichaReiser
Copy link
Member Author

A minor naming quibble: "member" refers to something like self.x (which seems consistent with how we use that term elsewhere, as an attribute on something), but also to a subscript expression like xs[0]. But no need to change anything.

That's fair. Not sure what else to use. I considered Attribute but that is confusing because only x.a is an attribute access. That's why I landed on Member because, to me, that includes all sort of member access (but maybe that's just me coming from the JS ecosystem where all those expressions are called member expression)

@MichaReiser MichaReiser force-pushed the micha/place-refactor branch from 4abc97b to f14697f Compare July 25, 2025 11:32
@MichaReiser MichaReiser force-pushed the micha/place-refactor branch from e391e8b to 452d7a2 Compare July 25, 2025 11:38
@MichaReiser
Copy link
Member Author

A possible alternative is Subscript. It might be confusing too, but x.a is just a special way of writing the subscript. I'm happy to make that rename if we decide for that name in a separate PR

@MichaReiser MichaReiser merged commit b033fb6 into main Jul 25, 2025
37 checks passed
@MichaReiser MichaReiser deleted the micha/place-refactor branch July 25, 2025 11:54
@carljm
Copy link
Contributor

carljm commented Jul 25, 2025

Yes, the naming issue occurred to me in review also, but I didn't mention it because I didn't have any concrete suggestion that I liked. I don't know of a term that clearly means "either attribute or subscript", and "member" (borrowed from JS) seems not unreasonable (though I would also initially assume it to mean just an attribute).

The clearest options would be the verbose ones: AttributeOrSubscript

@dcreager
Copy link
Member

either attribute or subscript

subscribute 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal An internal refactor or improvement ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants