Simplify 'interpolation' data, and move to an easier to consume System.Range approach for it#57966
Conversation
| internal partial class LanguageParser | ||
| { | ||
| /// <summary> | ||
| /// "Safe" substring using start and end positions rather than start and length. |
There was a problem hiding this comment.
First, 'safe' is crazy here. We should never be in a situation where we need to munge positions. Everything shoudl be known with exact locations and we should always be able to safely use the data collected. This approach to specifying inclusive end positions, but then rectifying them is just bad.
Second, using inclusive for start/end just violated all of our (and the BCL) patterns around ranges/spans. Nothing else works this way (esp. string slicing), so we really don't want to do this.
| return (last > s.Length || len <= 0) ? string.Empty : s.Substring(first, len); | ||
| } | ||
| private static string Substring(string str, TextSpan span) | ||
| => str.Substring(span.Start, span.Length); |
There was a problem hiding this comment.
No fuss, no muss. You ask for a slice, you get it.
| // with no inserts. We must still use String.Format to get its handling of escapes such as {{, | ||
| // so we still treat it as a composite format string. | ||
| var text = Substring(originalText, openQuoteIndex + 1, closeQuoteIndex - 1); | ||
| var text = Substring(originalText, TextSpan.FromBounds(openQuoteIndex + 1, closeQuoteIndex)); |
There was a problem hiding this comment.
no need for weird -1. THis is the standard inclusive/exclusive slice logic.
| var text = Substring(originalText, | ||
| TextSpan.FromBounds( | ||
| i == 0 ? openQuoteIndex + 1 : interpolations[i - 1].CloseBraceSpan.End, | ||
| interpolation.OpenBraceSpan.Start)); |
There was a problem hiding this comment.
instead of needing +1 and -1 on the curly locations, we can actually just use known concepts like .End and .Start to specify what we care about. (we should probably do this for the openQuote part as well, but i haven't had need to just yet).
| return CheckFeatureAvailability(result, MessageID.IDS_FeatureInterpolatedStrings); | ||
| } | ||
|
|
||
| private static InterpolationSyntax ParseInterpolation(CSharpParseOptions options, string text, Lexer.Interpolation interpolation, bool isVerbatim) |
There was a problem hiding this comment.
made static so it's clear that you can't call instance methods both on the original 'parser' instance and the 'tempParser' created within.
| using (var tempLexer = new Lexer(Text.SourceText.From(parsedText), this.Options, allowPreprocessorDirectives: false, interpolationFollowedByColon: interpolation.HasColon)) | ||
| var openBraceToken = this.EatToken(SyntaxKind.OpenBraceToken); | ||
| var (expression, alignment) = getExpressionAndAlignment(); | ||
| var (format, closeBraceToken) = getFormatAndCloseBrace(); |
There was a problem hiding this comment.
complex interpolation parsing was broken into it's three successive segments. The parts taht return pairs of items do so as there's a relation between those two pieces (esp. wrt to how trivia is associated with either part).
| #endif | ||
| return result; | ||
|
|
||
| (ExpressionSyntax expression, InterpolationAlignmentClauseSyntax alignment) getExpressionAndAlignment() |
There was a problem hiding this comment.
diff is awful. i'm not sure a good way to review before/after. i recommend just trying to understand the after part.
|
@RikkiGibson @jcouv @chsienki this is ready for review. |
| // with no inserts. We must still use String.Format to get its handling of escapes such as {{, | ||
| // so we still treat it as a composite format string. | ||
| var text = Substring(originalText, openQuoteIndex + 1, closeQuoteIndex - 1); | ||
| var text = originalText[new Range(openQuoteIndex + 1, closeQuoteIndex)]; |
There was a problem hiding this comment.
no need for -1, Range math is sane math.
Note: we might want to move the open-quote part to be a range as well, but i haven't needed that so far.
| var text = Substring(originalText, (i == 0) ? (openQuoteIndex + 1) : (interpolations[i - 1].CloseBracePosition + 1), interpolation.OpenBracePosition - 1); | ||
| var text = originalText[new Range( | ||
| i == 0 ? openQuoteIndex + 1 : interpolations[i - 1].CloseBraceRange.End, | ||
| interpolation.OpenBraceRange.Start)]; |
There was a problem hiding this comment.
range math is much saner. you can just use .Start and .End easily and have things just make sense.
There's a bit of an asymmetry which I didn't understand. In the In reply to: 982008924 Refers to: src/Compilers/CSharp/Portable/Parser/Lexer.cs:767 in bb55d90. [](commit_id = bb55d90, deletion_comment = False) |
| _isVerbatim = isVerbatim; | ||
|
|
||
| _isVerbatim = (lexer.TextWindow.PeekChar(0) == '$' && lexer.TextWindow.PeekChar(1) == '@') || | ||
| (lexer.TextWindow.PeekChar(0) == '@' && lexer.TextWindow.PeekChar(1) == '$'); |
There was a problem hiding this comment.
Too bad we can't do something like is ['$', '@', ..] or ['@', '$', ..], because we probably don't want to pull on the length here, but it would look really cool ;-) #Closed
Yup. it's asymmetric. I'm happy to try to unify those in followup as well. |
Is it a bug, or just some redundant code? |
| using var tempLexer = new Lexer(SourceText.From(originalText), this.Options, allowPreprocessorDirectives: false); | ||
| var info = default(Lexer.TokenInfo); | ||
| tempLexer.ScanInterpolatedStringLiteralTop(interpolations, isVerbatim, ref info, out error, out closeQuoteMissing); | ||
| tempLexer.ScanInterpolatedStringLiteralTop(ref info, out error, out openQuoteRange, interpolations, out closeQuoteRange); |
There was a problem hiding this comment.
Is this the only thing that this local function needs from the surrounding context? If so, consider making the function static and passing this in. It's far enough away from the usage that it's not immediately obvious that interpolations is used in here when reading the main method body, and since the builder is immediately freed there's some refactoring risk there.
There was a problem hiding this comment.
yeah, i wasn't sure the best structure here. i've changed it uip a bit now to hopefully make it a bit clearer. now the data that is in/out should be more obvious/relevant (i think).
| bool closeQuoteMissing; | ||
| using (var tempLexer = new Lexer(Text.SourceText.From(originalText), this.Options, allowPreprocessorDirectives: false)) | ||
|
|
||
| rescanInterpolation(out var openQuoteRange, out var error, out var closeQuoteRange); |
There was a problem hiding this comment.
we can. though we'll also horrifically crash if they're not since we slice using those ranges. so teh assertion won't be doing much more than just precrashing :)
i believe the latter. |
|
Done review pass (commit 25). Just a couple of minor comments. |
Ok, fixed that :) was needed in the past, but not anymore with lots of the refactoring that have gone on here :) |
|
Ok. I'm going to try this as a squash. :) |
…rovements * upstream/main: (310 commits) Read SourceLink info and call service to retrieve source from there (dotnet#57978) Add new parser/lexer to the StackTraceAnalyzer (dotnet#57598) (dotnet#58050) Snap 17.1 P2 (dotnet#58041) Make it possible to analyze the dataflow of `ConstructorInitializerSyntax` and `PrimaryConstructorBaseTypeSyntax` (dotnet#57576) Shorten paths in VS installation (dotnet#57726) Add comments Add new parser/lexer to the StackTraceAnalyzer (dotnet#57598) Fix await completion for expression body lambda Add tests Fix comment Honor option, and also improve formatting with comment Skip TestLargeStringConcatenation (dotnet#58035) Log runtime framework of remote host Mark EqualityContract property accessor as not auto-implemented (dotnet#57917) Fix typo in XML doc for GeneratorExtensions (dotnet#58020) Hold Receiver directly in bound node for implicit indexer access (dotnet#58009) Pass AnalysisKind instead of int Enable nullable reference types for TableDataSource Simplify 'interpolation' data, and move to an easier to consume System.Range approach for it (dotnet#57966) Add missing test for CallerArgumentExpression (dotnet#57805) ...
Followup to #57945. This changes the interpolation struct to be simpler and not contain information that can be computed when the interpolation is needed.
It also moves the internal data to be in terms of Ranges instead of positions for things like where the
{and}are in an interpolation. This is needed for raw-interpolated-strings as those braces may be more than one character long. Usage of spans also makes length/substring computation trivial. Prior to this we had helpers that took in two positions but which was inclusive on both. This lead to a lot of complex math and didn't follow any of the slicing math patterns we use everywhere else in roslyn. Switching to Range just makes all this complexity fall out.