Collection expressions: avoid intermediate List<T> if spread elements have known length by cston · Pull Request #69875 · dotnet/roslyn

cston · 2023-09-10T18:50:21Z

If all spread elements are countable and the collection satisfies certain heuristics, calculate the expected length of the resulting collection and:

If the target is T[], Span<T>, or ReadOnlySpan<T>, allocate the array or span at the expected length to avoid an intermediate buffer.
If the target is List<T>, set the capacity to avoid resizing as items are added.
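
As a rough illustration of the array case, the hand-written C# below approximates what a known-length lowering of `int[] r = [..a, x, ..b]` could look like. The class and method names are hypothetical, and the code the compiler actually emits may differ. The point is only that the array is allocated once at `a.Length + 1 + b.Length` rather than built through an intermediate `List<int>`.

```csharp
using System;

public static class KnownLengthSketch
{
    // Hypothetical hand-written equivalent of: int[] r = [..a, x, ..b];
    public static int[] Concat(int[] a, int x, int[] b)
    {
        // Both spreads expose Length, so the total size is known up front.
        int[] result = new int[a.Length + 1 + b.Length];
        int index = 0;
        foreach (int item in a) result[index++] = item;
        result[index++] = x;
        foreach (int item in b) result[index++] = item;
        return result;
    }

    public static void Main()
    {
        int[] r = Concat(new[] { 1, 2 }, 3, new[] { 4, 5 });
        Console.WriteLine(string.Join(", ", r));
    }
}
```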


src/Compilers/CSharp/Portable/BoundTree/BoundCollectionExpression.cs

333fred

Done review pass. Haven't looked at the tests in depth yet.

333fred · 2023-09-12T23:19:50Z

src/Compilers/CSharp/Portable/Binder/Binder_Conversions.cs

            }
            else if ((collectionTypeKind == CollectionExpressionTypeKind.ListInterface && isListInterfaceThatRequiresList(targetType)) ||
-                elements.Any(e => e is BoundCollectionExpressionSpreadElement)) // https://github.com/dotnet/roslyn/issues/68785: Avoid intermediate List<T> if all spread elements have Length property.
+                node.GetKnownLength(out _) is null)


Consider naming this parameter for clarity. #Resolved

333fred · 2023-09-12T23:22:34Z

src/Compilers/CSharp/Portable/BoundTree/BoundNodes.xml

    <Field Name="EnumeratorInfoOpt" Type="ForEachEnumeratorInfo?"/>
+    <!-- Collection Length or Count property value. -->
+    <Field Name="LengthOrCount" Type="BoundExpression?" SkipInVisitor="true" Null="allow"/>
+    <!-- Collection element placeholder. -->


It's unclear to me from reading these comments what the difference between this and ExpressionPlaceholder is. Consider elaborating why we have both and what the difference between them is. #Resolved

ExpressionPlaceholder represents the entire collection and ElementPlaceholder represents the current item from the collection. Added a comment where ElementPlaceholder is used.

333fred · 2023-09-12T23:23:11Z

src/Compilers/CSharp/Portable/FlowAnalysis/NullableWalker.cs

+        public override BoundNode? VisitCollectionExpressionSpreadElement(BoundCollectionExpressionSpreadElement node)
+        {
+            base.VisitCollectionExpressionSpreadElement(node);
+            SetResultType(node, default);


Why would this be default? Can it ever actually be null? #ByDesign

Can it ever actually be null?

Is the question: Can BoundCollectionExpressionSpreadElement.Type be null? With this PR, the type of this expression is always null since the type is not used.

It looks like we use the same SetResultType(node, default) for BoundThrowExpression, which seems like a similar case where the BoundExpression.Type is not meaningful.

333fred · 2023-09-12T23:28:24Z

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs

+                var initialization = new BoundArrayInitialization(
+                        syntax,
+                        isInferred: false,
+                        elements.SelectAsArray(e => VisitExpression(e)));


Consider making this static, and passing through this as a parameter, so the lambda can be cached. #Resolved
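
A minimal sketch of the pattern being suggested, using stand-in types rather than the Roslyn ones: `RewriterSketch`, `Visit`, and `SelectWithArg` are hypothetical names. A lambda that calls an instance method captures `this`, which typically means a new delegate per call, whereas a `static` lambda that receives the instance as an argument captures nothing and can be cached by the compiler.

```csharp
using System;

public sealed class RewriterSketch
{
    private readonly int _offset;

    public RewriterSketch(int offset) { _offset = offset; }

    // Stand-in for an instance method such as VisitExpression.
    private int Visit(int element) => element + _offset;

    // Hypothetical helper that threads an extra argument through to the delegate.
    public static TResult[] SelectWithArg<TItem, TArg, TResult>(
        TItem[] items, Func<TItem, TArg, TResult> map, TArg arg)
    {
        var result = new TResult[items.Length];
        for (int i = 0; i < items.Length; i++)
        {
            result[i] = map(items[i], arg);
        }
        return result;
    }

    // Captures 'this', so the delegate cannot be shared across instances.
    public int[] RewriteCapturing(int[] elements) =>
        Array.ConvertAll(elements, e => Visit(e));

    // Captures nothing: 'this' is passed as an argument (static lambdas need C# 9 or later).
    public int[] RewriteStatic(int[] elements) =>
        SelectWithArg(elements, static (e, self) => self.Visit(e), this);
}
```

Both methods produce the same result; only the delegate allocation behavior differs.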

RikkiGibson

The PR LGTM but it sounds like LDM may need to discuss the exact strategy in use here before we merge.

RikkiGibson · 2023-09-15T19:41:11Z

src/Compilers/CSharp/Portable/BoundTree/BoundNodes.xml

+    <!-- Type is not significant for this node type; always null -->
+    <Field Name="Type" Type="TypeSymbol?" Override="true" Null="always"/>
+    <!-- Collection being spread. -->
    <Field Name="Expression" Type="BoundExpression"/>


"Operand" might be a good name here

RikkiGibson · 2023-09-15T19:47:19Z

src/Compilers/CSharp/Portable/Binder/Binder_Expressions.cs

@@ -4753,30 +4753,39 @@ BoundExpression bindSpreadElement(SpreadElementSyntax syntax, BindingDiagnosticB
                    return new BoundCollectionExpressionSpreadElement(


I am beginning to feel that "spread element" is too ambiguous with the "iterator element" of the spread, the "collection element" of the collection expression, etc. It is making it more difficult to spec and implement the feature. No need to change in this PR, but I'd like to re-examine whether we can make things easier for ourselves by adopting a term like "spread operator" here.

RikkiGibson · 2023-09-15T19:54:54Z

src/Compilers/CSharp/Portable/Binder/Binder_Expressions.cs

                diagnostics.Add(syntax.Expression, useSiteInfo);
-                expression = ConvertForEachCollection(expression, conversion, collectionType, diagnostics);
-                var elementPlaceholder = new BoundValuePlaceholder(syntax.Expression, enumeratorInfo.ElementType);
+                var convertedExpression = ConvertForEachCollection(expressionPlaceholder, conversion, collectionType, diagnostics);


This converts the spread operand to the collection type we found in its EnumeratorInfo. In most cases this really converts the spread operand to the same type, except with arrays, where some wrapping is occurring (reference conv to IEnumerable). Possibly in MQ we could simplify some of this.

RikkiGibson · 2023-09-15T20:04:32Z

src/Compilers/CSharp/Portable/Binder/Binder_Expressions.cs

            {
                return element.Update(
                    BindToNaturalType(element.Expression, BindingDiagnosticBag.Discarded, reportNoTargetType: false),
+                    expressionPlaceholder: element.ExpressionPlaceholder,


I expected the placeholder would always be null in this path. Consider passing null explicitly to make that clear.

src/Compilers/CSharp/Portable/BoundTree/BoundCollectionExpression.cs

RikkiGibson · 2023-09-15T20:09:47Z

src/Compilers/CSharp/Portable/BoundTree/BoundNodes.xml


+  <Node Name="BoundCollectionExpressionSpreadExpressionPlaceholder" Base="BoundValuePlaceholderBase"/>
+
  <Node Name="BoundCollectionExpressionSpreadElement" Base="BoundExpression">


"BoundCollectionExpressionSpreadOperator" might be a better name here

RikkiGibson · 2023-09-15T20:13:31Z

src/Compilers/CSharp/Portable/BoundTree/BoundNodes.xml

-    <Field Name="AddElementPlaceholder" Type="BoundValuePlaceholder?" SkipInVisitor="true" Null="allow"/>
-    <Field Name="AddMethodInvocation" Type="BoundStatement?" SkipInVisitor="true" Null="allow"/>
+    <!-- Statement executed for each collection element. -->
+    <Field Name="IteratorBody" Type="BoundStatement?" SkipInVisitor="true" Null="allow"/>


I personally found it slightly surprising that some of these constructs are synthesized in binding. I figured we would just get symbols for Add method, Length/Count methods, etc., and include them on this node, but actually constructing the statements which add elements to the resulting collection expression would be done in lowering. However, I don't feel strongly enough about the decision to suggest any change here.

RikkiGibson · 2023-09-15T20:15:02Z

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs

            BoundExpression array;
-
-            switch (node.GetKnownLength())
+            if (node.GetKnownLength(hasSpreadElements: out _) is null)


Is this factoring causing us to have to get the known length multiple times? Does that matter?

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter.cs

RikkiGibson · 2023-09-15T21:31:46Z

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs


-            RemovePlaceholderReplacement(addElementPlaceholder);
+            // Rewrite expressions into temporaries.
+            foreach (var element in elements)


Not sure how I feel about this. Given we are creating an array I am not sure it is fine to create potentially very large numbers of temporaries just because a spread element (with runtime-known length) is present. #Resolved

Basically, for this release, I would be absolutely fine with having an intermediate List for these cases. I could easily see a code generator wanting to write something like

public static readonly string[] arr1 = ["bunch", "of", "values", "like", "thousands"];
public static readonly string[] arr2 = [..arr1, "even", "more", "values"];

In this specific case, maybe blowing the stack on a static constructor is less likely, but still maybe possible.

Also, I think in some cases compiler can probably determine that a spread operator like ..arr1 is non-side-effecting to evaluate, e.g. we can read Length before evaluating the elements in-order without a problem, and optimize accordingly. I would def feel comfortable saying that if your Length property itself is side effecting, then bad things may happen, sorry.

Updated to only create temporaries up to the last spread element, and with a small maximum number of temporaries. That will still cover cases such as [..s, e1, e2, <more>] and [..s1, ..s2] in particular.
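
One way to read this change, sketched by hand under stated assumptions: for `[e0, ..s, e1, e2]`, the array cannot be allocated until `s.Length` is known, yet `e0` must still be evaluated before `s` to preserve left-to-right order. So `e0` and `s` go into temporaries, while `e1` and `e2` (after the last spread) can be written directly into the array. The names `TemporariesSketch`, `Trace`, and `Build` are hypothetical, and the real lowering may differ.

```csharp
using System;
using System.Collections.Generic;

public static class TemporariesSketch
{
    // Records the order in which element expressions are evaluated.
    public static readonly List<string> Log = new List<string>();

    private static int Trace(string name, int value)
    {
        Log.Add(name);
        return value;
    }

    // Hypothetical equivalent of:
    //   int[] r = [Trace("e0", 0), ..s, Trace("e1", 1), Trace("e2", 2)];
    public static int[] Build(int[] s)
    {
        // Elements up to and including the last spread are stored in temporaries.
        int temp0 = Trace("e0", 0);
        int[] tempS = s;

        // The length is now known: 1 leading element + spread + 2 trailing elements.
        int[] result = new int[1 + tempS.Length + 2];
        int index = 0;
        result[index++] = temp0;
        foreach (int item in tempS) result[index++] = item;

        // Elements after the last spread need no temporaries.
        result[index++] = Trace("e1", 1);
        result[index++] = Trace("e2", 2);
        return result;
    }
}
```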


cston · 2023-09-27T20:42:26Z

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs

+            {
+                var rewrittenExpression = RewriteCollectionExpressionElementExpression(elements[i]);
+                BoundAssignmentOperator assignmentToTemp;
+                BoundLocal temp = _factory.StoreToTemp(rewrittenExpression, out assignmentToTemp, isKnownToReferToTempIfReferenceType: true);


Ideally, we shouldn't require a temporary for an element that has a constant value.

333fred

Overall looking good. Would like some investigation of the TryBindLengthOrCount question before signing off though.

333fred · 2023-09-27T22:56:37Z

src/Compilers/CSharp/Portable/Binder/Binder_Attributes.cs

+                {
+                    Binder.Error(diagnostics, ErrorCode.ERR_BadAttributeArgument, node.Syntax);
+                    attrHasErrors = true;
+                    return new TypedConstant(spread.Expression.Type, TypedConstantKind.Error, null);


Consider naming the last argument. #Resolved

333fred · 2023-09-27T23:16:02Z

src/Compilers/CSharp/Portable/Binder/Binder_Expressions.cs

+                BoundExpression? lengthOrCount;
+                if (!TryBindLengthOrCount(syntax.Expression, expressionPlaceholder, out lengthOrCount, diagnostics))
+                {
+                    lengthOrCount = null;


We probably need a temp diagnostics bag that we add to diagnostics only if we're in the true path here. It looks like this will add use-site diagnostics from Length or Count, if any are found, even if it then returns false for some other reason. #Resolved

Good catch. I've added a test, SpreadElement_LengthUseSiteError, and I think we probably want to report the use-site errors even if TryBindLengthOrCount returns false because the reason it returned false might be due to the same issue that resulted in the use-site error. This is consistent with how TryBindLengthOrCount is used for patterns.

333fred · 2023-09-27T23:17:50Z

src/Compilers/CSharp/Portable/BoundTree/BoundCollectionExpression.cs

+    internal partial class BoundCollectionExpressionBase
    {
-        internal int? GetKnownLength()
+        internal bool HasSpreadElements(out int numberIncludingLastSpread, out bool hasKnownLength)


I think a doc comment with an example of what IncludingLastSpread looks like would be helpful for future maintainability here. #Resolved

cston · 2023-09-29T19:54:43Z

@333fred, @RikkiGibson, please review the latest commit which uses List<T>..ctor(int capacity), thanks.

CyrusNajmabadi · 2023-09-29T19:56:41Z

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs

+                collectionType.OriginalDefinition.Equals(_compilation.GetWellKnownType(WellKnownType.System_Collections_Generic_List_T)))
+            {
+                // List<ElementType> list = new(N + s1.Length + ...);
+                var constructor = ((MethodSymbol)_factory.WellKnownMember(WellKnownMember.System_Collections_Generic_List_T__ctorInt32)).AsMember((NamedTypeSymbol)collectionType);


do you need to look to see if this constructor exists? and only use it if so? or is this ok to just do as is?

From the documentation, this constructor is available on .NET Framework 2.0 and later.

If the compiler doesn't crash when this member is missing, I'm satisfied

We're testing the missing constructor case in KnownLength_List_MissingConstructor().
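
For context, a hand-written sketch of the effect the `List<T>` capacity constructor has, following the `new(N + s1.Length + ...)` comment in the diff above. `ListCapacitySketch` and `Build` are hypothetical names, and this is not the compiler's actual output.

```csharp
using System;
using System.Collections.Generic;

public static class ListCapacitySketch
{
    // Hypothetical equivalent of: List<int> list = [first, ..s1];
    public static List<int> Build(int first, int[] s1)
    {
        // One explicit element plus the spread length: N + s1.Length.
        var list = new List<int>(1 + s1.Length);
        list.Add(first);
        foreach (int item in s1)
        {
            list.Add(item);
        }
        return list;
    }
}
```

Because the capacity matches the final count, the backing array is not resized while items are added.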

cston · 2023-10-02T19:49:21Z

@333fred, @RikkiGibson, please review the latest commit which uses List<T>..ctor(int capacity), thanks.

Never mind, I've moved the latest commit to the next PR #70197 instead.

RikkiGibson · 2023-10-04T17:30:21Z

Is this PR ready to merge? @cston @jaredpar

cston · 2023-10-04T17:32:51Z

Is this PR ready to merge? @cston @jaredpar

The changes in the PR have two approvals.

jaredpar · 2023-10-04T17:33:07Z

No, have to go through QB mode on these. Waiting for them to all be double reviewed before taking them through QB.

jcouv · 2023-10-05T18:50:10Z

Already merged elsewhere

cston · 2023-10-05T18:57:41Z

Merged as part of #70197.

ghost added the Area-Compilers and untriaged labels Sep 10, 2023

cston mentioned this pull request Aug 24, 2023

Implement optimizations for constructing collections from collection literals #68785

Open

17 tasks

cston added the Feature - Collection Expressions label Sep 10, 2023

cston force-pushed the known-length branch 3 times, most recently from ca89062 to 2a7a073 Compare September 11, 2023 22:04

cston marked this pull request as ready for review September 11, 2023 22:09

cston requested a review from a team as a code owner September 11, 2023 22:09

cston requested review from 333fred, CyrusNajmabadi and RikkiGibson September 11, 2023 22:09

Collection expressions: avoid intermediate List<T> if spread elements have known length

fa59b04

cston force-pushed the known-length branch from 2a7a073 to fa59b04 Compare September 11, 2023 22:20

cston added 2 commits September 12, 2023 12:10

Merge remote-tracking branch 'upstream/main' into known-length

e2cf69b

Updates following merge

904e3d4

333fred reviewed Sep 12, 2023


Fix formatting

e0a0737

333fred reviewed Sep 12, 2023


cston added 2 commits September 13, 2023 15:40

Address feedback

696f50d

Merge remote-tracking branch 'upstream/main' into known-length

039a0c1

cston requested a review from a team September 15, 2023 15:35

RikkiGibson self-assigned this Sep 15, 2023

RikkiGibson approved these changes Sep 15, 2023


cston mentioned this pull request Sep 18, 2023

Analyzer suggestion IDE0305 ("Collection initialization can be simplified") leads to performance regressions #69988

Closed

cston changed the base branch from main to release/dev17.8 September 26, 2023 04:16

cston added 4 commits September 26, 2023 23:26

Construct internal List<T> instances using well-known members

c2f78db

Limit number of temporaries

df3c63f

Merge remote-tracking branch 'upstream/release/dev17.8' into known-length

dee3f44

Fix attribute test

9815dd4

cston commented Sep 27, 2023


Test struct Length property with side effects

4590e93

333fred reviewed Sep 27, 2023


Address feedback

2f4cd81

333fred approved these changes Sep 28, 2023


RikkiGibson approved these changes Sep 28, 2023


cston mentioned this pull request Sep 28, 2023

Collection expressions: use applicable EnsureCapacity() when length is known #70181

Draft

CyrusNajmabadi reviewed Sep 29, 2023


cston mentioned this pull request Sep 30, 2023

Collection expressions: optimize List<T> construction #70197

Merged

jaredpar added this to the 17.8 milestone Oct 2, 2023

cston force-pushed the known-length branch from 2564a6f to 2f4cd81 Compare October 2, 2023 19:48

cston changed the base branch from release/dev17.8 to features/CollectionLiterals October 5, 2023 16:50

jcouv closed this Oct 5, 2023
