Skip to content

Conversation

@nxtn
Copy link
Contributor

@nxtn nxtn commented Dec 3, 2020

... with Encoding.Default since they are the same.

private static readonly UTF8Encoding.UTF8EncodingSealed s_defaultEncoding = new UTF8Encoding.UTF8EncodingSealed(encoderShouldEmitUTF8Identifier: false);

@ghost
Copy link

ghost commented Dec 3, 2020

Tagging subscribers to this area: @eiriktsarpalis
See info in area-owners.md if you want to be subscribed.

Issue Details

... with Encoding.Default since they are the same.

private static readonly UTF8Encoding.UTF8EncodingSealed s_defaultEncoding = new UTF8Encoding.UTF8EncodingSealed(encoderShouldEmitUTF8Identifier: false);

Author: NextTurn
Assignees: -
Labels:

area-System.Console

Milestone: -

@eiriktsarpalis
Copy link
Member

Thank you for your contribution.

If I may ask, what is the motivation for introducing this PR? While it is true that Encoding.Default happens to have the same configuration as the currently used encoding, how does the change clarify the code or improve performance? (apart from eliminating one allocation).

Note that Encoding.Default actually returns an instance of Utf8EncodingSealed. Seems like it's functionally equivalent to the base class (although I defer to @GrabYourPitchforks on that). So perhaps the devirtualization benefits alone might justify this change.

@nxtn nxtn force-pushed the encodingdefault branch 2 times, most recently from ae14d9f to d8a73b8 Compare December 3, 2020 15:00
@nxtn
Copy link
Contributor Author

nxtn commented Dec 3, 2020

It's just to eliminate the allocations, I think.

@huoyaoyuan
Copy link
Member

It's just to eliminate the allocations, I think.

The allocations are static, not instance. No need to remove.

To be meaningful, there may be new static instances for UTF8BOM and UTF8NoBOM.

@nxtn
Copy link
Contributor Author

nxtn commented Dec 4, 2020

The allocations are static, not instance. No need to remove.

In this PR, yes. What do you think of those non-cached allocations among the libraries?

@stephentoub
Copy link
Member

stephentoub commented Dec 4, 2020

What do you think of those non-cached allocations among the libraries?

It is not worth changing that CanonicalXml code to remove the UTF8Encoding allocation, as a) the code would need to be complicated to differentiate Windows from Unix (assuming you mean replacing it with Encoding.Default), b) that Encoding allocation is dwarfed by the rest of the allocation happening in that method and its consumers, and c) it's not on a hot path.

@eiriktsarpalis eiriktsarpalis merged commit 9c41693 into dotnet:master Dec 4, 2020
@nxtn nxtn deleted the encodingdefault branch December 4, 2020 16:57
@GrabYourPitchforks
Copy link
Member

Not an issue in this PR, but in general, Encoding.Default may be different across different runtimes. For example, on .NET Framework it's usually initialized to the system code page and isn't guaranteed to be UTF-8. Since all of the affected packages in this PR target only netcore, we're ok.

In general I'm a fan of readability / understandability, even at the expense of a single allocation. If the code requires UTF-8 for correctness, it should be hardcoded against an API that has "utf8" somewhere in the API name.

@ghost ghost locked as resolved and limited conversation to collaborators Jan 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants