Conversation
mavasani
left a comment
There was a problem hiding this comment.
We already have a very similar analyzer recommendating “Use Any when applicable over Count”. See https://github.com/dotnet/roslyn-analyzers/blob/master/src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/DoNotUseCountWhenAnyCanBeUsed.cs and https://docs.microsoft.com/en-us/visualstudio/code-quality/ca1827?view=vs-2019 and https://docs.microsoft.com/en-us/visualstudio/code-quality/ca1828?view=vs-2019.
It would be much simpler to extend it to add the new rules instead of creating a separate analyzer/fixer that duplicates majority of the code.
Codecov Report
@@ Coverage Diff @@
## master #3584 +/- ##
==========================================
- Coverage 94.67% 94.65% -0.03%
==========================================
Files 1074 1075 +1
Lines 244939 245459 +520
Branches 14787 14847 +60
==========================================
+ Hits 231899 232339 +440
- Misses 11006 11072 +66
- Partials 2034 2048 +14 |
|
Few things to note:
|
|
Thanks @mavasani, I appreciate the feedback and recommendations. We had previously looked at the "Use Any when applicable over Count" analyzer, but our assessment at that time was that it was targeting different code and using different heuristics, but from your analysis here, I see how there is overlap in the experience. Generalizing this group of analyzers into a single analyzer is intriguing. I think we'd need to do some more design work to figure out the right path forward on that though.
|
|
I was scratching my head trying to see how we could merge the 4 rules all together in order to diagnose one rule at most as you mention and this is what I got. First, I identified the rules and the types affected by them; the next table tries to illustrate that.
Second, I tried to sketch a routine that could allow us to funnel into returning one rule following this principles:
And this is what I got. The comments below each condition says what are the possible rules that can be returned from that code branch. @mavasani @jeffhandley Would you consider this good enough? @mavasani Should we re-arrange the sequence of Rule IDs so the 4 rules (CA1827-CA1829 & CA1836) show as consecutive numbers since they are somewhat related?
I have the concern that this might keep happening even if we merge the rules. List<string> list = new List<string>();
bool isEmpty = list.Count() == 0;As of today, this snippet triggers both rules CA1827 and CA1829 but I think that is because there are two offending expressions which are two separate analysis, one for Is there a way to query the current diagnosis results so we can either avoid emitting rule CA1829 if the expression is within the expression that triggered CA1827 OR remove diagnosis of CA1829 if CA1827 encompasses it? |
|
Thanks @jozkee - that seems to match pretty much what I had envisioned. You can ensure that we do not report duplicate diagnostics by writing separate analysis callbacks for method invocation and property reference as follows. NOTE: I think CA1829 should be relaxed so semantics are equivalent to CA1836, i.e. flags Count() invocation if there is any valid property named Length/Count on containing type. |
That is tricky, as we have already released an FxCop analyzers package with CA1827-CA1829. Users might have added source suppressions or ruleset/editorconfig entries to turn off the diagnostic, and we don't want them to be broken by us changing IDs of shipped rules. We have faced similar situations in the past, and come to terms with the fact that having non-sequential IDs is better then changing existing rule IDs from an end user perspective as it avoids breaking change for users. |
|
@jozkee Let me know if you have any other questions, we can discuss offline as well. Feel free to ping me anytime on teams. |
|
@mavasani thanks for replying! Few more questions:
If we relax it, then the rule wouldn't be proper to As per the documentation (Section 4.ii):
However, since the rule already shipped, it wouldn't be wise to just move it to Also, since we are planning on merging API specific rules (CA1827 and CA1828) with non-API specific (CA1829 if relaxed, and CA1836), in which project should those rules exist? |
I would not worrying about the sub-package/folder that the rule belongs to - feel free to move the analyzer(s) around so all analysis for these related rules is in a single analyzer. End users will either install FxCop analyzers package, which includes both |
I think it makes sense to keep them in |
|
I am almost finishing up with merging the analyses, and I found a couple of bugs in the current analyzer implementation for rule CA1827/28.
I will fix those two cases in the upcoming commits, I just wanted to point them out in this thread to keep track in case is needed, and also ask if there is a tracking issue for this that we may close after merging this, I doubt it since those look like very uncommon cases. |
|
@jozkee Thanks - awesome! |
I don't think these are known issues. |
|
Recap:
All tests are passing with the exception of a few ones related to relaxation of rule 1829, (will call those out individually as a review comment). |
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/MicrosoftNetCoreAnalyzersResources.resx
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/MicrosoftNetCoreAnalyzersResources.resx
Outdated
Show resolved
Hide resolved
|
@mavasani Can you grant me permission to add more reviewers or add below names yourself? |
mavasani
left a comment
There was a problem hiding this comment.
Two primary concerns in the analyzer:
- When user code has
Count(), why are we not preferring CA1829 (Use property Count/Length) over CA1827 (Use Any())? - Why are we recommending CA1827 (Use Any()) for user code which has efficient property access, say
a.Count != 0ora.Length != 0?
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
| if (parentOperation is IBinaryOperation parentBinaryOperation) | ||
| { | ||
| AnalyzeParentBinaryOperation(context, propertyReferenceOperation, parentBinaryOperation, | ||
| isAsync: false, methodName: null); | ||
| } | ||
| // Analize invocation operation, potentially obj.Count.Equals(0). | ||
| else if (parentOperation is IInvocationOperation parentInvocationOperation) | ||
| { | ||
| AnalyzeParentInvocationOperation(context, propertyReferenceOperation, parentInvocationOperation, | ||
| isInstance: true, isAsync: false, methodName: null); | ||
| } | ||
| // Analize argument operation, potentially 0.Equals(obj.Count). | ||
| else if (parentOperation is IArgumentOperation argumentOperation) | ||
| { | ||
| AnalyzeParentInvocationOperation(context, propertyReferenceOperation, (IInvocationOperation)argumentOperation.Parent, | ||
| isInstance: false, isAsync: false, methodName: null); | ||
| } |
There was a problem hiding this comment.
Is it possible to share this code between this method and prior method?
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
| { | ||
| ReportCA1836(context, useRightSideExpression, shouldNegateIsEmpty, parent); | ||
| } | ||
| else |
There was a problem hiding this comment.
Why are we not checking if (TypeContainsVisibleProperty(context, type, Count, SpecialType.System_Integer)) and if (TypeContainsVisibleProperty(context, type, Length, SpecialType.System_Integer)) when in context of Count() callback and recommending use of Count or Length versus directly moving to recommending Any()? Isn't the former more efficient?
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/UseCountProperly.cs
Outdated
Show resolved
Hide resolved
| return true; | ||
| } | ||
|
|
||
| private static bool TypeContainsVsibileProperty(OperationAnalysisContext context, ITypeSymbol type, string propertyName, SpecialType propertyType) |
There was a problem hiding this comment.
This seems like a very good candidate to become an extension method on ITypeSymbol:
public static bool TypeContainsVisibleProperty(this ITypeSymbol type, string propertyName, SpecialType propertyType, INamedTypeSymbol contextSymbol)
I was thinking that the right thing to do was to replace the code with its most semantically correct version ( I do understand your point that a property check is better that an |
I think given this is a performance rule, property access would be a better suggestion. I am also fine with having a user configurable option here to allow users to choose if they prefer performant code or readable code in this space. I will let you, @jeffhandley and the runtime team decide the preferred default semantics here. |
If there's an IsEmpty, it's almost certainly better. We should add Any() => !IsEmpty and !Any() => IsEmpty into the list of things to correct. (LINQ Any() is better than LINQ Count(), may be worse than implemented Count() and is almost definitely worse than property Count/Length/IsEmpty) |
New rule for IsEmpty takes priority over the rule for Any() so that should be already addressed in this PR.
Better in the sense of performance? @bartonjs so you would suggest to place rule "Use Property Count/Length instead of LINQ Count()" above "Use LINQ Any() instead of LINQ Count()"? |
Yeah. Any() has to spin up an iterator, try to pump it, then throw it away. A property defined on the type is (assuming it follows guidelines) not going to do anywhere near that much work, or have GC impact.
Yep. I think the best-to-worst order for "has stuff or not" is
The Count() instance method could arguably float up one, but it's probably just trading one kind of complication with another vs LINQ Any() (otherwise it would have been a property). |
|
OK, I did the proper changes to prefer CA1829 over CA1827, except when the offending method contains a predicate e.g. |
| } | ||
| else | ||
| { | ||
| if (TypeContainsVisibileProperty(context, type, IsEmpty, SpecialType.System_Boolean)) |
There was a problem hiding this comment.
Typo
| if (TypeContainsVisibileProperty(context, type, IsEmpty, SpecialType.System_Boolean)) | |
| if (TypeContainsVisibleProperty(context, type, IsEmpty, SpecialType.System_Boolean)) |
|
|
||
| private static void AnalyzeCountInvocationOperation(OperationAnalysisContext context, IInvocationOperation invocationOperation) | ||
| { | ||
| ITypeSymbol? type = invocationOperation.GetInstanceType(); |
There was a problem hiding this comment.
Should we just bail out here if type is null?
There was a problem hiding this comment.
That's a good point, I don't think so, I think we should find some other way to get type when there is no instance, but as I said on #3584 (comment), I haven't been able to get null here, even with static members.
There was a problem hiding this comment.
OK, I was able to get null from GetInstanceType with the following snippet:
public class Test
{
public static int Count {get;}
public bool IsEmpty;
public bool Foo => Count > 0;
}So, we would be bailing just with static properties and static methods, static extension methods does still have an instance so those are fine.
Is it OK to not report on that case?
There was a problem hiding this comment.
I think it seems fine to bail out on static cases.
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/PreferIsEmptyOverCount.Fixer.cs
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/PreferIsEmptyOverCount.Fixer.cs
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/PreferIsEmptyOverCount.Fixer.cs
Outdated
Show resolved
Hide resolved
src/NetAnalyzers/Core/Microsoft.NetCore.Analyzers/Performance/PreferIsEmptyOverCount.Fixer.cs
Show resolved
Hide resolved
mavasani
left a comment
There was a problem hiding this comment.
Thanks! Looks great to me. Thanks a lot for taking the trouble of cleaning up and accumulating all the existing rules in this space. It is truly appreciated!
We must report only CA1836 for PropertyReference Add condition to avoid report when a Property is not named Count or Length Address feedback
|
@mavasani Thanks, just a few more changes to fix some bugs that I found addressing your feedback.
|
| [InlineData("Me.Count > (0)", "Not IsEmpty")] | ||
| [InlineData("(Me.Count) > (0)", "Not IsEmpty")] | ||
| // TODO: Reduce suggested fix to avoid special casing here. | ||
| [InlineData("((Me).Count) > (0)", "Not (Me).IsEmpty")] |
There was a problem hiding this comment.
For VB only, I had to manually specify the "expected fixed code" since depending on the parenthesis and on having the Me keyword, the result may change.
Why is this not a problem for C#? If you see above test CSharpTestFixOnParentheses, I always get the most "reduced" version when verifying the code fix.
…nto prefer_isempty
…nto prefer_isempty
| var propertyReferenceOperation = (IPropertyReferenceOperation)context.Operation; | ||
|
|
||
| string propertyName = propertyReferenceOperation.Member.Name; | ||
| if (propertyName != Count && propertyName != Length) |
…nto prefer_isempty
Fixes dotnet/runtime#33801
This PR adds a rule to identify access to members called "Count" of type int/uint/long/ulong on a valid logical binary operation when compared to 0 or 1 and whose type also contains a visible boolean "IsEmpty" property.