Doc comments: Handle CDATA correctly and avoid normalizing code elements#53140
Doc comments: Handle CDATA correctly and avoid normalizing code elements#53140sharwell merged 11 commits intodotnet:release/dev16.11from
Conversation
8d277b4 to
f3fe0ed
Compare
There was a problem hiding this comment.
I'd certainly prefer a better solution here. The issue here is that XText.Value produces a string with Unix-style line endings (\n), causing the end result to have mixed line-endings 😕
There was a problem hiding this comment.
this warrants a comment explaning what the regex means and waht this is here for.
There was a problem hiding this comment.
I normally use input.Replace("\r\n", "\n").Replace("\n", Environment.NewLine), but XML is required to normalize line endings to \n so I'm guessing you can just use input.Replace("\n", "\r\n") and be done.
There was a problem hiding this comment.
@sharwell I'm a little bit worried if AppendString can be called with a string already containing \r\n, then doing Replace("\n", "\r\n") will duplicate \r.
Is it's guaranteed that XText.Value will always be normalized to \n?
https://www.w3.org/TR/xml/#sec-line-ends seems to claim that. I'll convert to Replace("\n", "\r\n"). Thanks @sharwell.
There was a problem hiding this comment.
This only avoids creating a new array since GetDocumentationParts already returns an immutable array.
There was a problem hiding this comment.
ToImmutableArray() won't clone the array if the input is already an immutable array, but this does avoid boxing the IEnumerable<T> for the call and is cleaner so 👍
There was a problem hiding this comment.
ToImmutableArray() won't clone the array if the input is already an immutable array,
@sharwell Yeah that seem the case:
So the docs of CA2009 are inaccurate/incorrect?
https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca2009
Performance issue: Unnecessary creation of a duplicate immutable collection. The original collection was already immutable and can be used directly.
b94e97b to
a67f650
Compare
|
Ohhhh snap 🎉 |
There was a problem hiding this comment.
❗ Can you add both of these test cases to a new test class IntellisenseQuickInfoBuilderTests_Code.cs, which is modeled after IntellisenseQuickInfoBuilderTests_Styles.
There was a problem hiding this comment.
📝 This is one of the cases that made me uncomfortable, but I'll reserve judgement until I see the new builder tests.
There was a problem hiding this comment.
@sharwell I'm expecting you'll still be uncomfortable after seeing the new tests 😄
Does it make sense to trim line-endings from code elements?
There was a problem hiding this comment.
I'm going to look at screenshots before requesting changes to the actual normalization approach. Maybe there's no problem.
There was a problem hiding this comment.
💭 I would expect this to be rendered like <br/>
Example input:
Example output:
There was a problem hiding this comment.
I'll have to run the code to see if it looks the same with the current form (unless you have a screenshot handy).
There was a problem hiding this comment.
I'll add screenshots shortly.
There was a problem hiding this comment.
While playing around this, I found that indentation of code isn't preserved. This is due to:
I'll open an issue to address this as a follow-up. Not sure if we should simply drop calling TrimEachLine or if it should understand the structure of XML and avoids trimming things inside <code> but trim other lines.
There was a problem hiding this comment.
💭 I thought <c> would be normalized. We should check with the publishing tools to see if this is the case.
There was a problem hiding this comment.
@sharwell This is still code, so I'm expecting it to preserve spaces. But no idea what tools like docfx do.
There was a problem hiding this comment.
The main case where we definitely don't want to keep whitespace is this:
/// <summary>
/// Some <c>code
/// example</c> is here.
/// </summary>The above should render exactly the same as this:
/// <summary>
/// Some <c>code example</c> is here.
/// </summary>It may be worth adding tests to show these are the same.
There was a problem hiding this comment.
I added a new TaggedTextStyle to differentiate <c> and <code>.
Currently Code value is 1000 and InlineCode is 10000. Should InlineCode be 1001 or whatever value to share the same most significant bit? (i.e, indicate that InlineCode is a subset of Code)?
This will allow the following check: (style & TaggedTextStyle.Code) == TaggedTextStyle.Code to pass whether style is Code or InlineCode.
There was a problem hiding this comment.
The current approach seems fine to me.
There was a problem hiding this comment.
On second thought, it seems like the new addition should be:
PreserveWhitespace = 1 << 4,
There was a problem hiding this comment.
Few lines above this, there is:
if (!isInCodeBlock)
text = text.Replace(" ", " ");This doesn't feel correct. See the following test:
@sharwell Do you agree here? Do you want me to fix LSP in this PR or make that a separate PR? (I prefer LSP work to be separate - but no objection doing it in this PR)
cc: @dibarbet
sharwell
left a comment
There was a problem hiding this comment.
Let's change <code> to use TaggedTextStyle.Code | TaggedTextStyle.PreserveWhitespace so the whole enumeration continues to work as individual flags.
|
@sharwell Anything else needed here? |
|
@Youssef1313 We'd like to rebase this onto release/dev16.11. Are you able to do this or would you like us to go ahead and do it? |
97eee47 to
636bc29
Compare
Done. |
|
@jinujoseph for approval to merge |
| // XText.Value returns a string with `\n` as the line endings, causing | ||
| // the end result to have mixed line-endings. So normalize everything to `\r\n`. | ||
| // https://www.w3.org/TR/xml/#sec-line-ends | ||
| static string NormalizeLineEndings(string input) => input.Replace("\n", "\r\n"); |
There was a problem hiding this comment.
why do we need this though? what is wrong with the doc comment having \n in it?
There was a problem hiding this comment.
➡️ This function ensures that non-normalized XML content uses the same end-of-line character as s_newlinePart (defined above). It could be cleaned up in the future (e.g. if we wanted to use Environment.Newline instead), but the approach in this PR keeps the result consistent with current precedent.
| if (node.NodeType == XmlNodeType.Text) | ||
| if (node.NodeType is XmlNodeType.Text or XmlNodeType.CDATA) | ||
| { | ||
| // cast is safe since XCData inherits XText |
There was a problem hiding this comment.
should the above just be if (node is XText xtext) ?
There was a problem hiding this comment.
➡️ Both approaches are fine. There isn't really a downside to the current approach so I prefer to not make a change here unless it's being done along with other changes.
| var rawText = element.Value; | ||
| if ((state.Style & TaggedTextStyle.PreserveWhitespace) == TaggedTextStyle.PreserveWhitespace) | ||
| { | ||
| // Don't normalize code from middle. Only trim leading/trailing new lines. |
There was a problem hiding this comment.
why trim the leading/trailing?
There was a problem hiding this comment.
➡️ This implements a fix which a test revealed:
#53140 (comment)
|
@sharwell Can you close the linked issues? Thanks! |
|
@Youssef1313 Already done thanks 🎉🎉 |
Fixes #53782
Fixes #53135
Fixes #37503
Fixes #22431
Before:
After: