Add raw string group, move comment parsing to Parser, change URL group parser#1711
Add raw string group, move comment parsing to Parser, change URL group parser#1711k4b7 merged 6 commits intoKaTeX:masterfrom ylemkimon:raw-string-group
raw string group, move comment parsing to Parser, change URL group parser#1711Conversation
Codecov Report
@@ Coverage Diff @@
## master #1711 +/- ##
==========================================
- Coverage 93.89% 93.89% -0.01%
==========================================
Files 78 78
Lines 4569 4585 +16
Branches 805 811 +6
==========================================
+ Hits 4290 4305 +15
Misses 246 246
- Partials 33 34 +1
Continue to review full report at Codecov.
|
|
I applaud this PR. I also think it will be possible to recover use of "%" as a URL escape character. I plan to write a PR which does that after this PR lands. My general intent is to:
|
|
@ronkok Nice idea! If you don't mind, I'd like give it a try.
Instead of throwing an error, the LaTeX behavior seems to be continue parsing to the next line. |
Please, feel free.
I haven't written any of this idea into code yet. But I want to keep the behavior that @edemaine created for comments. So in an expression like |
raw string group, change URL group parserraw string group, move comment parsing to Parser, change URL group parser
|
@ronkok Thank you for the suggestion! I’ve implemented it and it have made code much simpler. |
k4b7
left a comment
There was a problem hiding this comment.
Requesting more tests for the new raw arg type.
| firstToken.range(lastToken, str)); | ||
| case "%": | ||
| if (!raw) { // allow % in raw string group | ||
| this.consumeComment(); |
There was a problem hiding this comment.
I guess that's kind of hard to do that lever level.
There was a problem hiding this comment.
@kevinbarabash Yes, it's context-dependent.
| this.expect(groupBegin); | ||
| let str = ""; | ||
| const firstToken = this.nextToken; | ||
| let nested = 0; // allow nested braces in raw string group |
There was a problem hiding this comment.
If it's truly "raw" we might want to allow mismatched braces. In practice that's probably not all that useful so maybe add a comment describing the limitations of raw arg types.
There was a problem hiding this comment.
@kevinbarabash If we allow unmatched braces, it's impossible to determine where the group ends, without forbidding a right brace.
| if (optional && this.nextToken.text !== "[") { | ||
| return null; | ||
| const groupBegin = optional ? "[" : "{"; | ||
| const groupEnd = optional ? "]" : "}"; |
| consumeComment() { | ||
| // the newline character is normalized in Lexer, check original source | ||
| while (this.nextToken.text !== "EOF" && this.nextToken.loc && | ||
| this.nextToken.loc.getSource().indexOf("\n") === -1) { |
There was a problem hiding this comment.
Why not this.nextToken.text.indexOf("\n") === -1?
There was a problem hiding this comment.
@kevinbarabash Lexer normalizes whitespaces to a single space( ). I think it's better to look at the source, instead of separating whitespace token and newline token.
There was a problem hiding this comment.
That makes sense. It might be worth adding a comment here about that.
| // argument is parsed normally) | ||
| // - Mode: Node group parsed in given mode. | ||
| export type ArgType = "color" | "size" | "url" | "original" | Mode; | ||
| export type ArgType = "color" | "size" | "url" | "raw" | "original" | Mode; |
There was a problem hiding this comment.
Now that we have raw, I wonder if we should move the parsing of color, size, and url into functions/*.js.
| expect("\\kern{1 %kern\nem}").toParse(); | ||
| expect("\\kern1 %kern\nem").toParse(); | ||
| expect("\\color{#f00%red\n}").toParse(); | ||
| }); |
There was a problem hiding this comment.
We should probably have some tests for parsing raw arg types. It would be cool if we had a test that defines a function that accepts one raw arg and one optional raw arg then we could test that including [] or {} in the raw string works as expected.
There was a problem hiding this comment.
@kevinbarabash Currently, it seems there is no way to define a function cleanly in the tests. I've updated existing test case below.
There was a problem hiding this comment.
Maybe that's something we can do in the future. I've opened an issue for it #1740.
|
|
||
| it("should allow balanced braces in url", function() { | ||
| const url = "http://example.org/{too}"; | ||
| const url = "http://example.org/{{}t{oo}}"; |
k4b7
left a comment
There was a problem hiding this comment.
Sorry for the delay. Looks great!
This adds a
rawstring group parser (parseStringGroupwith the last argumenttrue), which returns the raw string inside a group and allows:\command x,\command#\command{a={[b,{c}],d}}This also moves comment parsing to
Parser, which makes possible to allow percent sign at theParserlevel and contextless (settingless)Lexer.Furthermore, this changes URL parser to use raw string group, which makes
LexerandParsermuch simpler, as it doesn't require defining special cases and duplicating function parsing code