Skip to content

Allow BatchNodes to reuse prior computed arrays if they would generate hte same value#62169

Merged
CyrusNajmabadi merged 21 commits intodotnet:mainfrom
CyrusNajmabadi:pooling3
Jun 30, 2022
Merged

Allow BatchNodes to reuse prior computed arrays if they would generate hte same value#62169
CyrusNajmabadi merged 21 commits intodotnet:mainfrom
CyrusNajmabadi:pooling3

Conversation

@CyrusNajmabadi
Copy link
Contributor

@CyrusNajmabadi CyrusNajmabadi commented Jun 27, 2022

Followup to #62100

This drops memory allocations by around 500 MB:

image

to

image

The reason for this is that we're constantly producing the exact same GlobalAlias[] array as the result of our .Collect call in the incremental generator. The change here lets us peek back at the prior array and then reuse it if we would produce the same.

Here we literally go from 5000 allocations to a single one that is shared from that point onwards. Only if the global aliases actually change would we need to perform a new allocation.

@ghost ghost added the Area-Compilers label Jun 27, 2022
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review June 28, 2022 00:06
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner June 28, 2022 00:06
sourceValuesBuilder.Builder.SequenceEqual(previousTable.Single().item, EqualityComparer<TInput>.Default))
{
return (previousTable.Single().item, sourceInputs);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part is the new logic that reuses from the previousTable if possible.

@CyrusNajmabadi CyrusNajmabadi requested a review from 333fred June 28, 2022 00:08
@CyrusNajmabadi CyrusNajmabadi requested a review from 333fred June 28, 2022 00:28
@CyrusNajmabadi
Copy link
Contributor Author

@jcouv @dotnet/roslyn-compiler for eyes. Thanks!

@CyrusNajmabadi
Copy link
Contributor Author

@chsienki for eyes. Thanks!

@CyrusNajmabadi CyrusNajmabadi requested review from 333fred and cston June 29, 2022 22:40
@CyrusNajmabadi
Copy link
Contributor Author

@cston have taken your suggested approach. will comment inline with what's going on. thanks!


if (entry.State != EntryState.Removed)
entryCount++;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first pass determiens how many entries we have. note: this will allocate an IEnumerable. If we worry about the cost of that, we could have a dedicated method on SourceTable (called something like "CountOfNonRemovedEntries").


return (result, sourceInputs);

ImmutableArray<TInput> tryReusePreviousTableValues(int entryCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does non-allocating pass looking to see if we can reuse the prior array. it checks the lengths are the same then checks all non-removed items against the prior array. if a match, we can reuse.

}

return newTable.ToImmutableAndFree();
ImmutableArray<TInput> computeCurrentTableValues(int entryCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally, we do a pass where we allocate a builder. however, importantly, we know the exact amount of items we need, so the builder doesn't create a scratch array that has the chance of not being pooled if too large.

NodeStateTable<ImmutableArray<TInput>>.Builder newTable)
{
// Do an initial pass to both get the steps, and determine how many entries we'll have.
var sourceInputsBuilder = newTable.TrackIncrementalSteps ? ArrayBuilder<(IncrementalGeneratorRunStep InputStep, int OutputIndex)>.GetInstance() : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we initialize this with sourceTable.Count as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. that would be a reasonable approximation. The only reasons i didn't were that:

  1. it's not exactly correct. Tables have entries, and entries have steps. So it's quite possible to have way more steps tahn .Count returns.
  2. TrackingIncrementalSteps is done in tests, so being that efficient is not a pressing concern on my part. It doesn't show up in my benchmarks either as i'm not turning on that capability :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

Copy link
Member

@333fred 333fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love how much simpler this is. Just one question.

@CyrusNajmabadi CyrusNajmabadi enabled auto-merge June 30, 2022 00:57
@CyrusNajmabadi CyrusNajmabadi merged commit 67d572d into dotnet:main Jun 30, 2022
@ghost ghost added this to the Next milestone Jun 30, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the pooling3 branch June 30, 2022 02:36
@allisonchou allisonchou modified the milestones: Next, 17.4 P1 Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants