Add compiler-generated code dataflow analysis#2842
Conversation
- Add comments - Inline defaultValue - Treat all fields on compiler-generated types as hoisted locals
Also add more tests to cover these cases, and hoisted method parameters in state machine methods.
|
@vitek-karas I think this is ready - PTAL! @jtschuster I would also appreciate your review since you have experience with the locals tracking logic. |
- Fix formatting - Remove invalid assert - Remove unused code - Inline unneeded local variables - Add another test showing imprecise warnings
|
Regarding the potential holes with reflection access to closure fields:
|
| HashSet<MethodDefinition> scannedMethods = new HashSet<MethodDefinition> (); | ||
| // Note that the default value of a hoisted local will be MultiValueLattice.Top, not UnknownValue.Instance. | ||
| // This ensures that there are no warnings for the "unassigned state" of a parameter. | ||
| // Definite assignment should ensure that there is no way for this to be an analysis hole. |
There was a problem hiding this comment.
This might hold for the analyzer, but doesn't hold for linker (there are other compilers producing IL which may not implement definite assignment).
I'm fine with the current behavior for now, but we should probably file an issue on this at least.
That said it's probably not a big problem:
Runtime behavior of such case (unless we get something wrong) is that the field will have its default value (so typically null) - which typically means the analysis will ignore/skip it anyway.
- Add comment on HoistedLocalKey - Track method bodies in interprocedural state - Make interprocedural state lattice a static proprety - Add type hierarchy warnings for compiler-generated fields - Optimize warning logic to avoid extra annotation checks
- Track state machine methods in interprocedural state, not in scan.
src/linker/Linker.Steps/MarkStep.cs
Outdated
| if (member is MethodDefinition method && ShouldWarnForReflectionAccessToCompilerGeneratedCode (method, warnsOnRUC, warnsOnReflection)) { | ||
| var id = reportOnMember ? DiagnosticId.DynamicallyAccessedMembersOnTypeReferencesCompilerGeneratedMember : DiagnosticId.DynamicallyAccessedMembersOnTypeReferencesCompilerGeneratedMemberOnBase; | ||
| Context.LogWarning (origin, id, type.GetDisplayName (), method.GetDisplayName ()); | ||
| } |
There was a problem hiding this comment.
This needs another comment which basically says that the skipWarningsForOverride=true can only happen on MoveNext like methods in compiler generated code, that it will never happen for local functions and alike. Meaning that it's never going to happen that we decided to not warn above and would also skip warning here.
Reason: Technically the logic here is wrong, the warnsOnReflection is true regardless if we produced warning above or not (due to override auto-suppression) and that will suppress producing the compiler-generated warning here. But it will never actually matter. So I'm fine keeping the code as-is, but I think this needs good explanation.
There was a problem hiding this comment.
I added a comment and also tried to make this more clear by renaming the variables to isReflectionAccessCoveredByRUC/DAM. There's still a naming issue because the skipWarningsForOverride check isn't part of ShouldWarnWhenAccessedForReflection (so ShouldWarn says true, but then we still might not warn) - but I don't want to clean all of that up here.
vitek-karas
left a comment
There was a problem hiding this comment.
Can't wait for all of this light up :-)
- Add more detaled comments about virtual override logic for type hierarchy marking and interactions with the compiler-generated code warnings - Clean up related code to make it slightly clearer
This treats fields of display classes and state machines as hoisted locals. We track all assignments to hoisted locals within a method group (the set of compiler-generated code reachable from a given user method). The analysis is technically flow-insensitive, because it assumes that any assignment to a hoisted local can reach any read of the same local. This will produce extra warnings in some cases, but it prevents holes: - State will "flow" out of nested functions. That is, writes (to hoisted locals) within nested functions will reach reads in the enclosing user method - Lambdas are analyzed at the point of delegate conversion, but with all possible states of captured variables. So effectively, writes after the lambda declaration will reach reads within the lambda. Previously, hoisted locals were treated as unannotated, so they would produce dataflow warnings if they reached an annotated location. Now that we analyze hoisted locals, cases where the value satisfies requirements at the point of consumption won't warn. This means that accessing these fields (representing hoisted locals) via reflection is problematic, since it could mutate the values of these fields and invalidate the correctness analysis. For this reason we now warn on reflection access to compiler-generated fields. To prevent noise, we only warn for reflection access to compiler-generated fields that represent types which may be annotated - so Type, string, etc. - but not int. This is technically a hole because ints also participate in dataflow analysis but we are choosing this tradeoff to avoid excess warnings for integers. This also includes some cleanup of the type hierarchy logic and extra comments to make it more clear how this interacts with warnings for reflection access to compiler-generated code. Commit migrated from dotnet/linker@bc46e44
Still a work in progress, but feel free to take an early look!edit: After some discussion I changed the approach. See old notes below for the earlier approach.
This treats fields of display classes and state machines as hoisted locals. We track all assignments to hoisted locals within a method group (the set of compiler-generated code reachable from a given user method). The analysis is technically flow-insensitive, because it assumes that any assignment to a hoisted local can reach any read of the same local. This will produce extra warnings in some cases, but it closes the holes in the original approach:
Previously, hoisted locals were treated as unannotated, so they would produce dataflow warnings if they reached an annotated location. Now that we analyze hoisted locals, cases where the value satisfies requirements at the point of consumption won't warn. This means that accessing these fields (representing hoisted locals) via reflection is problematic, since it could mutate the values of these fields and invalidate the correctness analysis. For this reason we now warn on reflection access to compiler-generated fields.
To prevent noise, we only warn for reflection access to compiler-generated fields that represent types which may be annotated - so
Type,string, etc. - but notint. This is technically a hole because ints also participate in dataflow analysis but per discussion with @vitek-karas I think we agree that this is an ok tradeoff.Old notes:
This flows state through hoisted locals in state machine methods, and into nested functions (lambdas/local functions). Some caveats to be aware of:
@vitek-karas @agocke @MichalStrehovsky I'm curious to hear your opinion on whether we should try to plug those holes, or if this approach is good enough.