Log assembly loads during task run#8316
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
89c89d3 to
b32cdde
Compare
|
Don't forget to increment the file format version: I'd say don't worry about the out of proc task node - it's a black box to us anyway, let's consider it a "tool" that we don't control. Make sure AppDomain stuff works on Core (Core doesn't have AppDomains) or gracefully degrades. I'm torn on whether we should include the built-in assembly loads. On one hand, we want to know where our own stuff is loaded from. On the other hand, we can already infer since we know the path to MSBuild.exe, and maybe not worth having the same redundant messages at the beginning of every binlog. I'd say let's record a couple of medium-sized real-world binlogs when the PR is ready, add viewer support, look at the logs in the viewer and then tweak to achieve the desired behavior. Also remember that we should submit the binlog viewer support for the new file format version first, test it with the test binlogs, I will publish the new viewer version, then we merge this PR. So that by the time early adopters start recording these binlogs the viewer already supports them. See a sample PR by @edvilme that incremented the binlog version: and the corresponding viewer PR: |
|
Forgot to say, the general direction looks good to me! |
rainersigwald
left a comment
There was a problem hiding this comment.
I think my big question is whether there's any noticeable perf cost. If there is, is it worth the additional information in general, or should we build an opt-in?
We shouldn't be loading assemblies all that much, so I'm hopeful we can just have it on . . .
|
|
||
| if (e.BuildEventContext == null && e is AssemblyLoadBuildEventArgs) | ||
| { | ||
| return; |
There was a problem hiding this comment.
Should we not log this in the text log, too? It's generally less helpful than binlogs but still has some place, and is (sadly) still popular.
There was a problem hiding this comment.
tl;dr;: I believe it's out of scope currently.
It logs with exception of assembly loading for custom loggers. Those do not have any BuildContext that ParallelConsoleLogger relies upon quite a lot. It would take some refactoring of ParallelConsoleLogger, which is definitely doable but feels beyond scope of this change.
If it would be missed - it can definitely be added later on.
Thoughts?
src/Build/BackEnd/Components/RequestBuilder/AssemblyLoadsTracker.cs
Outdated
Show resolved
Hide resolved
| namespace Microsoft.Build.BackEnd.Components.RequestBuilder | ||
| { | ||
| internal class AssemblyLoadsTracker : MarshalByRefObject, IDisposable | ||
| { |
There was a problem hiding this comment.
Does this type run on both core and framework? Heavy use of AppDomain makse me think framework but not seeing #if to remove it from core builds.
There was a problem hiding this comment.
It's core and framework tested.
That being said - the dictionary with app domains is really needed just on FW - so putting this behing #ifdef
Rest I believe needs to stay as is - but it might be my lack of knowledge - for tracking assembly loads I need AppDomain (even if it's just the default one on core) to subscribe - correct?
src/Build/BackEnd/Components/RequestBuilder/AssemblyLoadsTracker.cs
Outdated
Show resolved
Hide resolved
| { | ||
| s_instances.AddOrUpdate(appDomain, tracker, (_, loadsTracker) => loadsTracker); | ||
| } | ||
| tracker.StartTracking(); |
There was a problem hiding this comment.
Note: consider the case where the same AppDomain is registered twice (the crux of my comment above). This code actually produces a subtle race condition. There is a period of time inside the call to StartTracking that assembly loads are not monitored and that means we could end up missing an entry in the log.
There was a problem hiding this comment.
This convo was marked as resolved but I can't see how. The race condition still exists.
There was a problem hiding this comment.
Thanks for reiterating this one!
Since the StartTracking instance method was private and was invoked only after construction - there was no chance for double subscription case - so the unsubscribe-subscribe was superfluous and I removed it.
But it's a good point for future cases as well
| /// </summary> | ||
| private async Task<WorkUnitResult> InitializeAndExecuteTask(TaskLoggingContext taskLoggingContext, ItemBucket bucket, IDictionary<string, string> taskIdentityParameters, TaskHost taskHost, TaskExecutionMode howToExecuteTask) | ||
| { | ||
| using var assemblyLoadsTracker = AssemblyLoadsTracker.StartTracking(taskLoggingContext, AssemblyLoadingContext.TaskRun); |
There was a problem hiding this comment.
Why not be specific about AppDomain.CurrentDomain here to make it explicit it's part of the tracking?
There was a problem hiding this comment.
I have adjusted the code not to store and transfer AppDomain name and id for default app domain (so never on Core and only for explict app domains on Framework).
So I'd probably still prefer to keep the majority use case simpler (being able to specify the AppDomain explicitly but not being forced to).
Thoughts?
|
Coresponding Binlog viewer PR: KirillOsenkov/MSBuildStructuredLog#653 |
| var loggers = new List<ILogger>(); | ||
|
|
||
| var outVerbosity = verbosity; | ||
| ProcessBinaryLogger(binaryLoggerParameters, loggers, ref outVerbosity); |
There was a problem hiding this comment.
I reordered loggers initialization to have binlog on top - to ensure it receives all the events from initialiation of custom loggers.
@rainersigwald - is there any hidden risk associated to reordering the creation and initialiation of the loggers
|
I'm wondering if we could simplify this. When do we expect to load assemblies? This PR seems to just say that when we load assemblies, we should log it but does that with broad strokes. Could we have something that targets the places where we load an assembly and log that an assembly was loaded at those places? So for tasks, for instance, you could put things here and here. |
|
@Forgind the .NET assembly loader(s) load assemblies just-in-time to provide definitions for methods that can be called in a method as it is JITted before its first run. Because of that, we cannot control when assemblies will be loaded beyond the plugin entrypoints we explicitly control. I originally was thinking of logging only the resolved paths of plugin entrypoints, but this approach is more comprehensive and enables debugging collisions between transitive dependencies of plugins, so it seems superior to me. |
|
This change is mostly about runtime induced (implicit) loading. For this reason (closer explained by @rainersigwald) we unfortunately need to mount observing for the entire execution time of the component in question. For loggers and sdkresolvers, the typeloader should actually be the additional point in code hooked (I choosed not to hook, as for those two components we hopefully should know where we are loading from and this way I do not need to complicately distinguish which type loading is of interest and which to fileter out). Additionaly for the loggers the code obvserves only the initialization section - in hopes all the assemblies will be touched there. But I can imagine esoteric loggers that will use additional assemblies only during handling of specific events or even shutdown - this is intentionally not supported (as it opens a can of worm of distniguishing what caused assembly load at specific point of time - e.g. when custom task is logging to custom logger). tl;dr;: please let me know if you think the assembly loads caused by loading the actual types implementing the custom loggers or sdkresolvers should be logged as well (in addition to the current hooked usages of the components.) |
I think this is important. One way things can go wrong is a build that has references to two versions of a build task with the same identity. If they're loaded in separate worker nodes they can both work, but when loaded in the same process the first one "wins" the load and might cause behavior differences. |
The task case is hooked withing The One another question that pops-up during testing this: can we somehow distinguish builtin tasks? (while possibly special case keep handling some that are used as surrogate for custom tasks - like |
|
Can we just check which assembly it's from? If it's in a fixed list, then it's us. I think we have a "fixed" list somewhere in MSBuildLocator |
|
For tasks I think it's literally just |
|
I'm slowly peeling the onion here :-) Currently I'm waist down in the code and experimentation - trying to have it ready for PR review meeting |
src/Build/BackEnd/Components/RequestBuilder/AssemblyLoadsTracker.cs
Outdated
Show resolved
Hide resolved
src/Build/BackEnd/Components/RequestBuilder/AssemblyLoadsTracker.cs
Outdated
Show resolved
Hide resolved
| { | ||
| s_instances.AddOrUpdate(appDomain, tracker, (_, loadsTracker) => loadsTracker); | ||
| } | ||
| tracker.StartTracking(); |
There was a problem hiding this comment.
This convo was marked as resolved but I can't see how. The race condition still exists.
src/Build/Evaluation/Evaluator.cs
Outdated
| // Combine SDK path with the "project" relative path | ||
| try | ||
| { | ||
| // Is the sdk resolver a custom type? |
There was a problem hiding this comment.
Aren't SDK resolvers in different assemblies? We have our built-in one, but NuGet owns theirs, C++ owns theirs, etc.
There was a problem hiding this comment.
That's a good point - I believe that if it's not built-in, than the default tracking has a sense, as it might not be obvious from where it was loaded. (similarly to this comment about roslyn tasks: #8316 (comment)).
What do you think?
Btw. this code was accidentaly performin the filtering twice - I removed the superfluous call wrapping the StartTracking, as unified (and bit more comprehensive) filtering is performed within StartTracking.
It filters as well based on the typename - https://github.com/dotnet/msbuild/pull/8316/files/f6c696399c13667bc7742a8e94c0cfc3c79e46aa#diff-7f217b4ccb76592a2ab78dd7f8332509dfa2e2216d8a6032f956cbdf48df8d03R86-R96
There was a problem hiding this comment.
I’d rather err on the side of logging known assemblies than filtering something out too aggressively.
There was a problem hiding this comment.
There is a opt-in trait (via MSBUILDLOGALLASSEMBLYLOADS env var) to stop any filtering.
So if for any reason any component of interest would not be hooked (e.g. due to being placed to 'Microsoft.Build' namespace), it can allways be opted-in.
|
I've published the viewer version that supports this, would be nice to test that it's able to open binlogs produced with this change. |
|
@KirillOsenkov I confirm I can open my test logs (new console, orchard, etc.) with the latest official version 👍
Btw. a dumb question - Were there any thoughts to make the auto-update experience more friendly or understandable? I observed that the first open attempt led to error due to an old version, without simple info about update option/process. The Help->About menu presented a clear message about a new version and need to restart the app. Can this be perceived as confusing? |
|
Yeah we can certainly improve this. One of these days! |
Since I complained :-) - let me try to propose a fix ... |

Fixes #6303
Related: KirillOsenkov/MSBuildStructuredLog#653
Context
AppDomainIsolatedTaskOutOfProcessRaskHostNode- as neitherLoggingServicenorLoggingContextare available there - not sure if not supporting those is a problem.Microsoft.Buildassembly)Microsoft.Buildassembly orwithinMicrosoft.Build.Loggingnamespace) - Note - loggers initialiation reordered - Binlog is the first one - as assembly load events are emited at the time of each logger initialization, currentlyLoggerServicedispatches events immediately after construction, so as a result order of logger definitions matter (only the earlier initialized gets messages about latter ones initialization assembly loads). Alternative would be to change theLoggerServicecontract to start dispatching on explicit request (easy to implement and preserving current MSBuild behavior - however might be potentially breaking for API users?) or to cache those events and log them later (doable, but convoluted - probably not worth the efforts now)Samples
task run:
evaluation:
logger initialization:
custom sdk resolver (no sample - similar to logger initialization)
Perf impact
No observable impact.
MSBUILDLOGALLASSEMBLYLOADSis used to turn on loggin of all assembly loadsmsbuild.exe <proj> /nodeReuse:false /bl