Skip to content

Store helpful info needed by incremental-generation on syntax-tree itself#62551

Merged
CyrusNajmabadi merged 90 commits intodotnet:mainfrom
CyrusNajmabadi:syntaxTreeInfo
Jul 13, 2022
Merged

Store helpful info needed by incremental-generation on syntax-tree itself#62551
CyrusNajmabadi merged 90 commits intodotnet:mainfrom
CyrusNajmabadi:syntaxTreeInfo

Conversation

@CyrusNajmabadi
Copy link
Contributor

Followup to #62483. Extracted out as potentially more contentious.

This stores the booleans about whether a syntax-tree contains attributes/global-usings on the SyntaxTree object itself (computed lazily of course). This avoids an intermediary CWT which itself has sizable impact (both the complexity of it, as well as the need to hash). Overall time goes down from 6.6s to 4.1 (a 38% drop). Memory usage also went down in this PR (from 2.88gb to 2.63 gb), though that's due to smarter allocation of an array we need.

CWT drop is from teh removal of hte following items:

image

CyrusNajmabadi and others added 30 commits June 30, 2022 12:52
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review July 12, 2022 20:23
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner July 12, 2022 20:23
@CyrusNajmabadi
Copy link
Contributor Author

@jcouv @RikkiGibson @chsienki this is ready for review. thanks!

Copy link
Member

@RikkiGibson RikkiGibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Curious whether it is worthwhile to try and optimize further.

@CyrusNajmabadi
Copy link
Contributor Author

LGTM. Curious whether it is worthwhile to try and optimize further.

Do you mean overall? or just this particular area?

NotComputedYet,
None = 1 << 0,
ContainsGlobalAliases = 1 << 1,
ContainsAttributeList = 1 << 2,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RikkiGibson switched to simple enum.

@RikkiGibson
Copy link
Member

LGTM. Curious whether it is worthwhile to try and optimize further.

Do you mean overall? or just this particular area?

I meant in this specific area (checks that need to be performed on syntax trees in each incremental generation pass).

@CyrusNajmabadi
Copy link
Contributor Author

I meant in this specific area (checks that need to be performed on syntax trees in each incremental generation pass).

Ah... so there is one additional check i could do, but i was wary about it. Specifically, we could pre-walk the trees (that have attribute lists in them) and see if the file passes the client's passed in 'predicate' function. Then we would never have to check trees again that we know could never match their predicate.

However, i didn't do this for two reasons:

  1. this will force us to realize the red tree, which i'd like to avoid for the fast-path of this attribute never being found.
  2. in practice the predicates taht i'm creating (in the SDK source generators) are all basically trivial things like "is this is a ClassDeclSyntax" or "is this a MethodDeclSyntax". Those are so common that every file will effectively return 'true', making it mostly pointless to even bother.

Filtering like that is really good when it's either extremely cheap, or if it's expensive, but greatly limits the set of items you need to look at. Otherwise, it's not htat helpful.

That said, other generators may behave differently, so if we have some good examples that could benefit from anotehr filtering pass, we could consider this in the future.

@CyrusNajmabadi
Copy link
Contributor Author

@jcouv ptal :-)

@jcouv jcouv self-assigned this Jul 13, 2022
@CyrusNajmabadi
Copy link
Contributor Author

@333fred @chsienki ptal. thanks!

@CyrusNajmabadi CyrusNajmabadi requested a review from jaredpar July 13, 2022 18:51
/// generation framework to compute and cache data once against a tree so it does not have to go back to source
/// for untouched trees when other trees in the compilation are modified.
/// </summary>
private SourceGeneratorSyntaxTreeInfo _sourceGeneratorInfo = SourceGeneratorSyntaxTreeInfo.NotComputedYet;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually I don't have an issue with what we're doing here. It makes sense to cache hot lookup items like this. Putting it on the SyntaxTree itself doesn't really bother me at all. In fact it seems like we could even expand on this to maintain this field across incremental parses (maybe).

My brain is mildly upset at the terminology of source generator though. It feels too specific. Yes that's the primary usage today but the attribute lookup is could be more general purpose. It would allow us to do othere optimizations in the code base.

Did you consider giving it aless specific name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider giving it aless specific name?

I def did, but fell back into that flip-flopping of: "this is only for source-generators today", vs "maybe other things in the future could benefit...."

What would honestly make a ton of sense would be if we had SyntaxTree cache the DeclTable entry for it, and then allow the decl table entry to have these additional bits, as well as the other bits it holds onto today. However, this would require making things common between C# and VB and would def be larger in scope.

So this was basically a way to solve this problem in a very tiny, scoped fashion. I def think tehre may be more cohesive approaches we coudl take, but they would also be much larger in scope and i figured would also take ages to get into the compiler :) I'd like to put SG work to bed (this is literally my last PR for this area), so i went with narrow and targetted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact it seems like we could even expand on this to maintain this field across incremental parses (maybe).

we def could (though it's a bit non-trivial). The general idea would be to do something similar to how we store the bits for 'ContainsDiagnostics' and 'ContainsAnnotations'. Nodes could have a bitset provided to them of interesting bits. As you build the parent nodes, they union those bits together, and you then have all that information available in the root node.

This then makes incremental fall out trivially. If we make a new AttributeList/GlobalUsing node we set that bit, and pass it upwards.

I explored this, but it was much larger in scope and it's terrifying from a perf perspective to touch the green nodes at all. We squeezed perf heavily there, and it's a space where you could change a line and easily cause +- large perf percentages. So i def didn't want to go that far when i'm really ok just storing two bits here.

@CyrusNajmabadi CyrusNajmabadi merged commit e8dcc65 into dotnet:main Jul 13, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the syntaxTreeInfo branch July 13, 2022 23:08
@ghost ghost added this to the Next milestone Jul 13, 2022
333fred added a commit to 333fred/roslyn that referenced this pull request Jul 14, 2022
* upstream/main:
  Bind IterationErrorVariable to a natural type for error recovery (dotnet#62601)
  [main] Update dependencies from dotnet/source-build-externals (dotnet#61173)
  Cleanup: remove VisualStudioWorkspaceImpl.TryGetContainedDocument (dotnet#62616)
  EnC: A couple of small fixes (dotnet#62454)
  Report error for ref field declared in type other than ref struct (dotnet#62615)
  Store helpful info needed by incremental-generation on syntax-tree itself (dotnet#62551)
adamperlin pushed a commit to adamperlin/roslyn that referenced this pull request Jul 19, 2022
…self (dotnet#62551)

* Cache the hash in compilation options

* Remove

* Update src/Compilers/Core/Portable/InternalUtilities/Hash.cs

* Update src/Compilers/Core/Portable/InternalUtilities/Hash.cs

* Update src/Compilers/Core/Portable/InternalUtilities/Hash.cs

* Fix api

* Walk green-nodes in incremental-generator attribute-finding path

* Use green nodes

* REmove

* Fix tests

* Avoid allocating builder if not needed

* Simplify

* Only loop once

* Do not look for load/reference directives when there can't be any

* REmove use of Lazy in DeclarationTable for rarely used values.

* Remove lazy

* In progress

* Change the data-flow path for searching for attributes to prevent fewer intermediary tables.

* Remove unused type

* Simplify

* Update tests

* Make internal

* Update

* Update src/Compilers/Core/Portable/SourceGeneration/Nodes/NodeStateTable.cs

* Update comment

* Filter large tables before processing htem.

* Update tests

* Sort

* Fix grammar

* Fix assert

* Add back in

* Revert

* Simplify

* remove

* Simplify

* Remove

* Simplify global aliases

* Prefilter to only trees that have attributes or globalaliases in them

* INcrease

* Only check the top level for global usings.

* Remove comment

* Simplify

* Add docs

* Revert

* Make static

* Extract common code

* PR feedback

* Use helper

* update run counts

* Chain compilations

* Fix grammar

* Don't use implicit construction

* Fix

* Tweak benchmark

* Simplify walk

* Cache syntax-tree-info on the syntax-tree itself.

* no clean

* Fast path detecting what sort of using construct we have.

* REvert

* revert

* Cleanup

* Simplify
@allisonchou allisonchou modified the milestones: Next, 17.4 P1 Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants