Conversation
| ", expected '" + (delims && isDelimited ? delims[match] : "}") + | ||
| "'", tok); | ||
| } | ||
| if (delims && isDelimited) { |
There was a problem hiding this comment.
delims && is required for Flow to recognize delims != null.
src/Parser.js
Outdated
| if (isOptional) { | ||
|
|
||
| // \sqrt is a macro only if an optional argument exists | ||
| if (funcData.type === "sqrt" && i === 1 && optArgs[0] == null) { |
There was a problem hiding this comment.
I can't seem to find a better way to handle \sqrt than hard-coding it.
for non-strict mode and \genfrac, respectively.
|
This breaks the example at the main page: % \f is defined as f(#1) using the macro
\f{x} = \int_{-\infty}^\infty
\hat \f\xi\,e^{2 \pi i \xi x}
\,d\xiAny suggestion to replace it is welcome! |
Parser mode affects neither fetch nor consume.
Codecov Report
@@ Coverage Diff @@
## master #2085 +/- ##
==========================================
- Coverage 93.83% 93.81% -0.02%
==========================================
Files 84 84
Lines 6112 6128 +16
Branches 1254 1260 +6
==========================================
+ Hits 5735 5749 +14
- Misses 346 348 +2
Partials 31 31
Continue to review full report at Codecov.
|
|
@ylemkimon in #2288 you commented:
but earlier in this PR you also commented:
Looking at the code it looks delimiters are supported. Is that correct? If so, could you add some tests that demonstrate this? |
|
@kevinbarabash I meant it's not possible to define a delimited macro using |
| it("\\gdef defines macros with delimited parameter", function() { | ||
| expect`\gdef\foo|#1||{#1}\text{\foo| x y ||}`.toParseLike`\text{ x y }`; | ||
| expect`\gdef\foo#1|#2{#1+#2}\foo 1 2 |34`.toParseLike`12+34`; | ||
| expect`\gdef\foo#1#{#1}\foo1^{23}`.toParseLike`1^{23}`; | ||
| expect`\gdef\foo|{}\foo`.not.toParse(); | ||
| expect`\gdef\foo#1|{#1}\foo1`.not.toParse(); | ||
| expect`\gdef\foo#1|{#1}\foo1}|`.not.toParse(); |
There was a problem hiding this comment.
Tests for delimited macro are already added here.
There was a problem hiding this comment.
I was wondering what the | were for. I'm so used to brackets and parens as delimiters, I forgot that | is a delimiter as well. 🙃
k4b7
left a comment
There was a problem hiding this comment.
I've done a partial review. I'll pick up at def.js when I have some more time.
| ## `\newline` and `\cr` | ||
| `\newline` and `\cr` no longer takes an optional size argument. To specify vertical | ||
| spacing, `\\` should be used. |
There was a problem hiding this comment.
Does this change in behaviour arise from the macro/primitive changes?
There was a problem hiding this comment.
@kevinbarabash Partly, yes. Those changed space handling and they are updated to match their definitions.
There was a problem hiding this comment.
This is a late comment, sorry. But there is a breaking change in that \raisebox previously accepted unbraced dimensions and now it does not. That might be worth a note in the migration guide.
There was a problem hiding this comment.
I don't think there is anything to "fix". LaTeX \raisebox does not accept unbraced dimensions, so the new behavior is not a bug. It's just needs a note in the migration guide.
| tokens.shift(); | ||
| } | ||
| tokens.reverse(); // to fit in with stack order | ||
| return {tokens, start, end: tok}; |
There was a problem hiding this comment.
start and end can be outside the range of tokens, correct?
There was a problem hiding this comment.
@kevinbarabash No, they are always the first and last or the second and second to last tokens.
There was a problem hiding this comment.
If the case where there's a { and a } at the start and end of tokens are the start and end the braces and tokens the stuff inside the braces (after the pop() and shift())?
There was a problem hiding this comment.
@kevinbarabash Sorry, my mistake. end can be outside the range of tokens, when it's delimited. In that case, end is the last delimiter.
| const expression = this.parseExpression(false, "EOF"); | ||
| this.expect("EOF"); // expect the end of the argument |
There was a problem hiding this comment.
I wonder if we could use a different token to mark the end of arguments. Overloading EOF seems like it could be a bit confusing. Maybe something for a future PR given that there's already so much going on in this one.
There was a problem hiding this comment.
@kevinbarabash I agree. This is my major concern, too. This will probably cause a lot of issues that are hard to debug, but I can't seem to find a simple better way. I'll add TODO for now.
| export const normalizeArgument = function(arg: AnyParseNode): AnyParseNode { | ||
| return arg.type === "ordgroup" && arg.body.length === 1 ? arg.body[0] : arg; | ||
| }; |
There was a problem hiding this comment.
Is it possible that arg could be a series of nested ordgroups? If so, would we want to unwrap each of them?
There was a problem hiding this comment.
@kevinbarabash I'm not sure whether it's possible, but that's not caused by changes in this PR.
src/environments/array.js
Outdated
| // \cr is equivalent to \\ without the optional size argument (see below) | ||
| // TODO: provide helpful error when \cr is used outside the array environment |
There was a problem hiding this comment.
Can you raise a ticket for this TODO?
| // \def\Let@{\let\\\math@cr} | ||
| // \def\math@cr{...\math@cr@} | ||
| // \def\math@cr@{\new@ifnextchar[\math@cr@@{\math@cr@@[\z@]}} | ||
| // \def\math@cr@@[#1]{...\math@cr@@@...} | ||
| // \def\math@cr@@@{\cr} |
There was a problem hiding this comment.
Does this collapse multiple \crs in a row or something like that?
There was a problem hiding this comment.
@kevinbarabash No, they are just how \\ is defined in LaTeX and amsmath.
There was a problem hiding this comment.
I don't think I'll ever get used to reading .dtx files. 😓
k4b7
left a comment
There was a problem hiding this comment.
I didn't know that LaTeX treated primitives differently from macros. The summary you posted at the top of this PR was quite enlightening. Thank you for noticing this discrepancy and figuring out how to correct it. I've finished my first pass. I don't see any blocking issues, but I do have a fair number of questions about various things so I'm holding off on approving until you have a chance to respond to them.
| // same signature, we implement them as one megafunction, with newRow | ||
| // indicating whether we're in the \cr case, and newLine indicating whether | ||
| // to break the line in the \newline case. | ||
|
|
||
| // \DeclareRobustCommand\\{...\@xnewline} |
There was a problem hiding this comment.
@kevinbarabash They are omitted commands for clarity.
| // If the very last character of the <parameter text> is #, so that | ||
| // this # is immediately followed by {, TeX will behave as if the { | ||
| // had been inserted at the right end of both the parameter text | ||
| // and the replacement text. |
There was a problem hiding this comment.
An example of this situation would be helpful.
| } else if (tok.text === "EOF") { | ||
| throw new ParseError("Expected a macro definition"); | ||
| } else { | ||
| delimiters[numArgs].push(tok.text); |
There was a problem hiding this comment.
Can tok.text be any string or should we be checking it against a list of allowed delimiters?
There was a problem hiding this comment.
@kevinbarabash AFAIK, every valid TeX token can be used as delimiters.
| } else if (tok.text === "EOF") { | ||
| throw new ParseError("Expected a macro definition"); |
There was a problem hiding this comment.
@kevinbarabash If the end is reached without the macro body, e.g., \def\abc.
| "Unexpected end of input in a macro argument," + | ||
| " expected '}' at end of input: …r{#ffffff{text}"); |
There was a problem hiding this comment.
Removing the position information may make it hard to track down errors in some cases. The underlining was also helpful because it made it really easy to see the range of the issue. Looking at some of the test cases below, we still have both of these features in those test cases. Is it hard to maintain those features in these test cases because how macro expansion has changed?
There was a problem hiding this comment.
@kevinbarabash In case of macros, arguments should be scanned (fetched) first, so scanning errors are detected before parsing errors, hence changes in error position.
There was a problem hiding this comment.
I see, this is just how we've been reporting parse for some time now.
| expect`\textcolor{1a2}{foo}`.toFailWithParseError( | ||
| "Invalid color: '1a2'" + | ||
| " at position 12: \\textcolor{1̲a̲2̲}{foo}"); | ||
| " at position 11: \\textcolor{̲1̲a̲2̲}̲{foo}"); |
There was a problem hiding this comment.
Why is there an extra space inside of \textcolor{} in the error message?
There was a problem hiding this comment.
@kevinbarabash It is probably combining underline not shown properly.
There was a problem hiding this comment.
I don't see an extra space anymore. 🤷♂️
| describe("A document fragment", function() { | ||
| it("should have paddings applied inside an extensible arrow", function() { | ||
| const markup = katex.renderToString("\\tiny\\xrightarrow\\textcolor{red}{x}"); | ||
| expect(markup).toContain("x-arrow-pad"); | ||
| }); | ||
|
|
||
| it("should have paddings applied inside an enclose", function() { | ||
| const markup = katex.renderToString(r`\fbox\textcolor{red}{x}`); | ||
| expect(markup).toContain("boxpad"); | ||
| }); | ||
|
|
||
| it("should have paddings applied inside a square root", function() { | ||
| const markup = katex.renderToString(r`\sqrt\textcolor{red}{x}`); | ||
| expect(markup).toContain("padding-left"); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
@kevinbarabash They're test cases added in #1707, which relied on edge-case and previous (wrong) behaviors.
k4b7
left a comment
There was a problem hiding this comment.
Thanks for answering all my questions and for all of your hard work on this PR.
| tokens.shift(); | ||
| } | ||
| tokens.reverse(); // to fit in with stack order | ||
| return {tokens, start, end: tok}; |
There was a problem hiding this comment.
If the case where there's a { and a } at the start and end of tokens are the start and end the braces and tokens the stuff inside the braces (after the pop() and shift())?
| "need to increase maxExpand setting"); | ||
| } | ||
| let tokens = expansion.tokens; | ||
| const args = this.consumeArgs(expansion.numArgs, expansion.delimiters); |
There was a problem hiding this comment.
I didn't realize there could be multiple character delimiters. Thanks for the explanation.
| argToken.text = str; | ||
| return argToken; |
There was a problem hiding this comment.
If that's the case, maybe we should have an assertion that argToken.text.length === str.length and throw if that's not the case. The error reporting locality would be better than waiting for an EOF mismatch.
| // \def\Let@{\let\\\math@cr} | ||
| // \def\math@cr{...\math@cr@} | ||
| // \def\math@cr@{\new@ifnextchar[\math@cr@@{\math@cr@@[\z@]}} | ||
| // \def\math@cr@@[#1]{...\math@cr@@@...} | ||
| // \def\math@cr@@@{\cr} |
There was a problem hiding this comment.
I don't think I'll ever get used to reading .dtx files. 😓
| } else if (tok.text === "EOF") { | ||
| throw new ParseError("Expected a macro definition"); |
| "Unexpected end of input in a macro argument," + | ||
| " expected '}' at end of input: …r{#ffffff{text}"); |
There was a problem hiding this comment.
I see, this is just how we've been reporting parse for some time now.
| expect`\textcolor{1a2}{foo}`.toFailWithParseError( | ||
| "Invalid color: '1a2'" + | ||
| " at position 12: \\textcolor{1̲a̲2̲}{foo}"); | ||
| " at position 11: \\textcolor{̲1̲a̲2̲}̲{foo}"); |
There was a problem hiding this comment.
I don't see an extra space anymore. 🤷♂️
|
@kevinbarabash Thank you for the review! |
introduced in #2085: https://github.com/KaTeX/KaTeX/pull/2085/files#diff-d351bc0c29817ef51ebb3022b15c0704L67 Co-authored-by: ylemkimon <y@ylem.kim>
1. Primitive v.s. Macro
LaTeX has two types of command (control sequence): TeX primitive and macro. TeX primitives are provided by TeX engine. Macros, in contrast, are defined using primitives in the LaTeX kernel (
latex.ltx) and packages such asamsmath. (Macro may expand into primitive commands without taking arguments. They are considered as primitives hereinafter.) Primitives scan argument as it goes, i.e., expanding macros, whereas macros take a properly nested group or non-blank token without expanding. For instance,is equivalent to
However, KaTeX uses primitive argument parsing for functions. This PR introduces a new argument type
primitive, which uses the existing method, and uses it for parsing arguments of primitive commands. It also changes other group parsers such asparseArgumentGroup(new) andparseStringGroupto use macro argument parsing method, which scans the argument and denotes the end of the argument at theMacroExpanderlevel before parsing. This fixes #924, fixes #1027, fixes #1720, and fixes #2319. Closes #611.2. Macro
Delimited parameters in macro and brace handling are implemented in the new
consumeArgfunction. Braced control sequence and unbraced replacement text no longer works.[1]Optional arguments are implemented using a delimited parameter at the
MacroExpanderlevel. This fixes #1419 and #2288.Whitespace handling is also improved to consume only in undelimited macro arguments and required/optional arguments. This broke
\\and this PR fixes it along with fix #2119.Some changes may result in BREAKING CHANGEs such as [1], and issues linked above, however they probably were originally wrong behavior.