Skip to content

All the regex parsing tests in their own branch.#25520

Merged
jcouv merged 63 commits intodotnet:features/embeddedRegexfrom
CyrusNajmabadi:regexTests
Aug 23, 2018
Merged

All the regex parsing tests in their own branch.#25520
jcouv merged 63 commits intodotnet:features/embeddedRegexfrom
CyrusNajmabadi:regexTests

Conversation

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor

Followup to #23984. I've pulled out the tests to make that PR easier to review.

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner March 16, 2018 06:24
@CyrusNajmabadi CyrusNajmabadi changed the title All the regex tests in their own PR. All the regex parsing tests in their own branch. Mar 16, 2018
@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

@dotnet-bot retest windows_debug_vs-integration_prtest please

@CyrusNajmabadi CyrusNajmabadi changed the base branch from master to features/embeddedRegex April 2, 2018 19:03
Assert.True(regex.GetGroupNames().Where(v => !int.TryParse(v, out _)).OrderBy(v => v).SequenceEqual(
tree.CaptureNamesToSpan.Keys.OrderBy(v => v)));
}
catch (IndexOutOfRangeException)
Copy link
Copy Markdown
Member

@jcouv jcouv Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try/catching seems too loose. Can tests that trigger those cases pass a parameter to signal their expectation? #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. i can do that. #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I did this. Tests now need to explicitly state if they expect hte regex constructor to throw. Also, we only wrap creating the .net regex in a try/catch. That way we're only checking for exceptions in it, and we're not accidentally catching exceptions in the roslyn regex system. #Closed

Assert.Null(tree);
}
}
}
Copy link
Copy Markdown
Member

@jcouv jcouv Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed it, but do we check round-tripping? #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. But that's a good idea. I will add that to the tests! #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nevermind. I do that test that. Those are the CheckInvariants and CheckCharacters calls that we make. #Closed

Copy link
Copy Markdown
Member

@jcouv jcouv Aug 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking just a ToString() call on RegexTree and comparing with the known string. #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Aug 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to not create a ToString (yet) that isn't used anywhere but tests. We already enforce hte invariants here through the supported API. #Closed

}

[Collection(nameof(MyCollection))]
public partial class CSharpRegexParserTests
Copy link
Copy Markdown
Member

@jcouv jcouv Aug 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CSharpRegexParserTests [](start = 25, length = 22)

From reading the doc on Collection, I expected to find some tests (annotated with [Fact]) in this class.
Is this meant to only be run manually? #Closed

Copy link
Copy Markdown
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Aug 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. This was all done manyally for generatin tests. (hence _TestGeneration). #Closed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It seems useful to keep this file, but please add some documentation.


In reply to: 212121143 [](ancestors = 212121143)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this code already provides a way of regenerating the baselines. What do you think of better representing spans?


In reply to: 212122968 [](ancestors = 212122968,212121143)

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

<Capture Name=""0"" Span=""[10..42)"" />

Such spans are not very useful (reviewer cannot verify). I would have recommended quoting the text of that span... :-S #Closed


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:166 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

<Capture Name=""0"" Span=""[10..42)"" />

To update so many test baselines, Fred has a tool which could be useful (with some adaptation).
https://github.com/333fred/IOperationTestFixer

The basic flow:

  • run Pilchie's test runner and copy the test errors in a file
  • run the fixer (it will parse the test errors and collect "Actual" strings with corresponding test names, then it will fix the tests)
    The fixer would need to be adapted, as it currently only understands the pattern of tests used for IOperations.

I feel pretty strongly about doing something to improve the readability of these spans, but I'd be ok if it's done separately.


In reply to: 415178848 [](ancestors = 415178848)


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:166 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

<Capture Name=""0"" Span=""[10..42)"" />

Another idea would be to fix the test framework so that any test that is added or updated from now on would include more details, but all those initial tests would be left as-is.


In reply to: 415182382 [](ancestors = 415182382,415178848)


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:166 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

        Test(@"@""((?<One>abc)\d+)?(?<Two>xyz)(.*)""", @"<Tree>

Is it possible to use digits for capture names? For instance, 0 or 1?
What happens in such case? #Closed


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:105 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

public partial class CSharpRegexParserTests

In general, I didn't understand the separation of tests: "Basic", "DotnetNegative" and "Reference". Could you clarify?
If the categorization isn't quite clear, I'd be tempted to merge the three files. #Closed


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:8 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

Copy link
Copy Markdown
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with review pass (iteration 54).

I think something should be done about validation of spans in captures. We can discuss some options.

I need some overview for "TestGeneration" file, so that I can effectively review it.

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

Another idea would be to fix the test framework so that any test that is added or updated from now on would include more details, but all those initial tests would be left as-is.

I would be ok with that. :)

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

Is it possible to use digits for capture names? For instance, 0 or 1?

Yes. You can see that in ReferenceTest123.

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

In general, I didn't understand the separation of tests: "Basic", "DotnetNegative" and "Reference". Could you clarify?

I can comment.

"basic" is me going through every codepath in the lexer/parser and trying to hit it.
"dotnetnegative" are the existing tests from corefx that all fail.
"reference" is taking all the strings from https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference and making tests out of all of them.

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

I need some overview for "TestGeneration" file, so that I can effectively review it.

This file was used by me for some times when i had to go update like a thousand tests at once. It was easier to basically run the existing tests through a helper funciton that then spat out the new test baselines. I don't really need it anymore and it can be removed. For anything that has spans right now (for example Captures, or Diagnostics), i'm happy to make it include text. In that regard, TestGeneration would likely be useful to respit out all the test baselines again.

Copy link
Copy Markdown
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (iteration 58).

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

Given that this code already provides a way of regenerating the baselines. What do you think of better representing spans?

SGTM. I'll do so tonight. I'll leave the spans in, but i'll also give the text those spans represent. Does that work for you?

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

but i'll also give the text those spans represent.

Yes, that sounds good. Thanks!

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

@jcouv I've updated all the baselines (yaay for having a tool to do that) :) Let me know what you think.

@jcouv
Copy link
Copy Markdown
Member

jcouv commented Aug 22, 2018

<Capture Name=""0"" Span=""[10..42)"" />

Awesome. That's much better!


In reply to: 415183243 [](ancestors = 415183243,415182382,415178848)


Refers to: src/Workspaces/CSharpTest/EmbeddedLanguages/RegularExpressions/CSharpRegexParserTests_ReferenceTests.cs:166 in 7767ce6. [](commit_id = 7767ce6, deletion_comment = False)

Copy link
Copy Markdown
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (iteration 63).
Thanks a lot for addressing the span issue!

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

Awesome. That's much better!

Also, a super great reason to keep around the baseline regenerator :D

@CyrusNajmabadi
Copy link
Copy Markdown
Contributor Author

@jcouv can you merge in?

@jcouv jcouv merged commit 8a731ba into dotnet:features/embeddedRegex Aug 23, 2018
@CyrusNajmabadi CyrusNajmabadi deleted the regexTests branch January 25, 2020 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area-IDE Community The pull request was submitted by a contributor who is not a Microsoft employee. Test Test failures in roslyn-CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants