Added an implementation independent test suite #1
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "heuer/scfg:master"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
An attempt to create a test suite for scfg to test scfg deserializers independently and to have a common base to interpret scfg.
The test suite was created for a not yet Open Source project and is not tested against other implementations.
@ -0,0 +25,4 @@* All whitespace is normalized to a single space character.* All names and parameters must be enclosed in double quotes.* All unnecessary escapes are removed from names and parameters.* All directives of a block must be indented by 4 spaces per level of nesting.While it could be inferred by looking at the expected output (or at some failing tests), it might be a good idea to also note how to append the final newlines. In the case of your expected output, you always append a newline at the end of a directive (or the closing brace of its block). At the end of the most outer block, you add another newline.
Thanks for the feedback.
I tried to address your points by 6787555b6a584ed3f2bc49c93917b2d9391e6c72
Let me know if the rules need more polishing.
@ -0,0 +1,3 @@directive { # commentAs per the grammar, comments can only appear at the start of a line. Or is there some semi-authoritative example that suggests something else?
For example, the Go parser also only accepts comments at the beginning of a line:
case quote == 0 && len(words) == 0 && sb.Len() == 0 && ch == '#':The same goes for my Java parser.
Thanks, you're right. I'll move the test to the "invalid" directory :)
@ -0,0 +1 @@test { me }According to the grammar, this is actually a valid directive with the name "test" and the parameters "{", "me" and "}". I can't tell if this is intended though.
The test case was meant to include two errors:
How can the braces be interpreted as param? A directive-param is a "word" which is either an "atom", a "dquote-word", or a "squote-word". "dquote-word" and "squote-word" cannot be considered because of the missing quote characters.
What remains is "atom", which is either an ACHAR or an esc-pair. ACHAR excludes "{" and "}"
test \{ me \}would be a directive named "test" with three params: "{", "me", "}"
Indeed, you're right. Not sure what I've read there... 😅 Sorry!
Thanks for the patch, this definitely sounds like a good idea!
I've been wondering whether we should use a different format for the "expected" files, e.g. JSON. This would avoid having to encode scfg in a specific way (some encoder implementations may not always use double quotes), while still being easily readable from any implementation test suite. Or was the idea to test encoders as well?
Thanks for the feedback. I'm open to suggestions regarding the output format. I also considered defining instructions to output JSON, but ultimately rejected the idea because I felt it would add another artificial hurdle to the implementation.
Even though the output is a valid scfg document, it's not intended as a best practice suggestion. The only requirement was a standard format that's easy to implement and compare (perhaps just for the test cases).
Furthermore, it's easier for implementers to write scfg than JSON.
When I think of a test case, I first write the scfg to be tested, then the expected output (with the transformation rules applied), and then run the implementation through the test suite to see if it delivers the expected result.
I've added support for this in libscfg: emersion/libscfg#1
This worked pretty well overall.
comment_after_bracedoesn't pass there (parses fine but should not), but it seems like a bug in the implementation.@ -0,0 +23,4 @@* All comments are removed.* Extraneous whitespace (spaces, tabs, newlines) is removed* All whitespace is normalized to a single space character.* All names and parameters must be enclosed in double quotes.In libscfg, because of this rule I've had to add a special "test mode" flag to the formatter. I generally dislike adding flags changing the behavior of an implementation under tests.
An alternative would be to define a canonical scfg format which uses double quotes only if a parameter contains special characters. But this feels a bit arbitrary, and the rules can grow quite complex if we want to reduce the clutter. For instance, one could decide to use single quotes if a parameter's special characters are only double quotes, or even count the number of special characters to reduce the number of escapes.
The current approach (unconditional double quotes) seems simplest.
My current implementation of scfg doesn't support serialization at all. The output is only implemented in my tests. As I said, I'm open to better suggestions (JSON, XML, …) for a canonical format; this was just the simplest one I could think of.
Yeah, that makes sense. If I had to use JSON from libscfg, I would probably copy/paste the few lines of code required to encode JSON strings and hand-roll the encoder, resulting in code pretty similar to a scfg encoder. For languages with built-in JSON support, would be easier though.
Let's go with this approach - we can always convert the files if we decide we should switch to something else in the future.
@ -0,0 +28,4 @@* Opening braces `{` remain on the same line as their associated directive.* Closing braces `}` are on their own line, indented to match the indentationlevel of their opening directive* All directives of a block must be indented by 4 spaces per level of nesting.I don't want to rehash the decades-old argument about spaces vs tabs, but I personally use tabs everywhere. This repo's README also uses tabs.
No problem, although I prefer spaces, I will change the indentation level to one tab.
@ -0,0 +29,4 @@* Closing braces `}` are on their own line, indented to match the indentationlevel of their opening directive* All directives of a block must be indented by 4 spaces per level of nesting.* The output must end with a single trailing newline character (`\n`)Test files in this directory end with two final newline characters: one for the final directive, and one extra. I'm not entirely following the intent here: why the extra newline? All directives and closing braces are followed by a newline, why add another one?
To be honest: The final line break arose purely out of laziness, without any technical reason. When I first had the idea to implement a general test suite for scfg, the initial test cases had a final but unnecessary line break.
I then stuck with it for all subsequent test cases.
I will correct this.
LGTM, thanks!
Two more minor details:
@ -0,0 +1 @@"escape\t\r\n\nme"Maybe I missed something, but shouldn't this be
escapetrnnme?If you follow the grammar exactly, your result is correct. However, I thought the grammar meant that control characters should be encoded this way. What does libscfg say about the test?
I agree the grammar is not enough to explain semantics. We should add prose in plain English to describe semantics.
The original intent of the spec is that
\escapes any following character, just like POSIX shell does. That is,\is always dropped from the output.libscfg test result:
Thanks for the feedback, I'll update the test case (and fix my implementation ;)).
To encode control characters like
\tand\nsomeone has to use\\t\\netc., right?One could use single quotes to not interpret backslash:
'\n'is equivalent to"\\n".(This is a bit weird because it's halfway between POSIX shell and… something else: POSIX shell has
\ninterpretation for double quotes, but scfg doesn't.)Thanks. It's fixed in
9082787b6eand I added two more test (sq_backslash and sq_backslash2)32da49032fto7096cde4007096cde400to0d5910ebc9@emersion wrote in #1 (comment):
Fixed.
If the file "escape_seq.scfg" needs to be corrected (see Hannes' comment), I can do that at short notice.
0d5910ebc9to9082787b6eThank you!