Add support for embedding arbitrary files into binlog#6339
Add support for embedding arbitrary files into binlog#6339benvillalobos merged 3 commits intomainfrom
Conversation
It would be very useful to embed more than just the project files into the binlog. For example, project.assets.json is a common request: #3529. global.json or nuget.config would be other examples. Some users may want to embed all *.cs files as well. Add a special EmbedInBinlog item type recognized by BuildEventArgsWriter. If after evaluation the items contain EmbedInBinlog, all such items will be logged if the evaluated include points to a full file path that exists on disk. A given file can be mentioned more than once, but will only be embedded once. If a file doesn't exist or is empty, it will not be embedded. If the item is added from a target (after evaluation is done), it will not be embedded. Checking for EmbedInBinlog is done in BuildEventArgsWriter to reuse the complicated and performance-optimized logic of iterating over all items.
We're thinking of adding a way to embed arbitrary files in the binlog. See dotnet/msbuild#6339 project.assets.json is commonly requested to be embedded. This would fix dotnet/msbuild#3529 Binlog size does increase from 3.5 MB -> 5 MB, so we'll need to think perhaps about making it off by default.
|
I've tested this with embedding project.assets.json: Surprisingly, there's no perceptible slowdown of builds: I'm guessing all the extra json file compressing cost is hidden on a background thread: msbuild/src/Build/Logging/BinaryLogger/ProjectImportsCollector.cs Lines 81 to 83 in 9bcc06c Binlog size does increase however from 3.5 MB to 5 MB (the size of the files blob increases from 400 K to 1.9 MB). I suspect that for larger builds the size increase will grow slower than linearly:
We'll need to think about whether to have it off by default. |
|
I did more testing with project.assets.json on Roslyn.sln incremental and the build times are:
Binlog size increased from 15.8 MB -> 21.1 MB. The inner files archive increased from 1.3 MB -> 6.7 MB. Note that this PR by itself has no adverse effects on either build speed or binlog size. The effects only show with dotnet/sdk#16840 |
| private const string EmbedInBinlogItemType = "EmbedInBinlog"; | ||
|
|
||
| private void CheckForFilesToEmbed(string itemType, object itemList) | ||
| { |
There was a problem hiding this comment.
Would it make sense to check the EmbedFile event for null here and bail right away?
There was a problem hiding this comment.
In theory, but in our scenario we know it’s not null, so probably doesn’t matter either way. I was just using the ?.Invoke pattern out of habit.
There was a problem hiding this comment.
Apologies for scrutinizing and hope I'm not misreading the code. EmbedFile is subscribed to always, yes, but the handler EventArgsWriter_EmbedFile is a conditional no-op. And that condition looks real, i.e. projectImportsCollector does not always get created. Are you sure we can't hoist the check and avoid running the loop here in some scenarios?
There was a problem hiding this comment.
Oh, I see what you're saying. Sounds good.
| private const string EmbedInBinlogItemType = "EmbedInBinlog"; | ||
|
|
||
| private void CheckForFilesToEmbed(string itemType, object itemList) | ||
| { |

It would be very useful to embed more than just the project files into the binlog. For example, project.assets.json is a common request: #3529. global.json or nuget.config would be other examples. Some users may want to embed all *.cs files as well.
Add a special EmbedInBinlog item type recognized by BuildEventArgsWriter. If after evaluation the items contain EmbedInBinlog, all such items will be logged if the evaluated include points to a full file path that exists on disk. A given file can be mentioned more than once, but will only be embedded once. If a file doesn't exist or is empty, it will not be embedded. If the item is added from a target (after evaluation is done), it will not be embedded.
Checking for EmbedInBinlog is done in BuildEventArgsWriter to reuse the complicated and performance-optimized logic of iterating over all items.
This PR is a prerequisite for #3529