Skip to content

unicodeTextInMathMode setting#1117

Merged
k4b7 merged 7 commits intoKaTeX:masterfrom
edemaine:textinmath
Feb 20, 2018
Merged

unicodeTextInMathMode setting#1117
k4b7 merged 7 commits intoKaTeX:masterfrom
edemaine:textinmath

Conversation

@edemaine
Copy link
Member

@edemaine edemaine commented Jan 30, 2018

Fixes #1046

  • When unicodeTextInMathMode is true, accented letters from
    unicodeSymbols.js, and CJK and other supported languages,
    get added support in math mode (as requested in "KaTeX parse error: Expected 'EOF', got '<bla>' at position 1" when input pure Chinese character(s) #895).
  • When unicodeTextInMathMode is false, all of these stop working in
    math mode, and are only supported in text mode (matching XeTeX behavior).
    Note that this is a backwards incompatibility with some 0.9.0 alpha/betas, but it brings us back into alignment with LaTeX/XeTeX, so seems worth doing (?).

I'm open to better/shorter names for the option name, or for an argument why the default should be true instead of false (more permissive?).

* When `unicodeTextInMathMode` is `true`, accented letters from
  `unicodeSymbols.js`, and CJK and other supported languages,
  get added support in math mode (as requested in KaTeX#895).
* When `unicodeTextInMathMode` is `false, all of these stop working in
  math mode, and are only supported in text mode (matching XeTeX behavior).
  Note that this is a backwards incompatibility with some 0.9.0 alpha/betas.
@k4b7
Copy link
Member

k4b7 commented Jan 30, 2018

@edemaine I thought that unicode chars currently don't work in math mode.

@ylemkimon
Copy link
Member

https://github.com/Khan/KaTeX/blob/4a87f38e1a9f74883723a0aebce2fb91f1464863/src/symbols.js#L722-L736
Should they be allowed in the math mode when unicodeTextInMathMode == false?

@edemaine
Copy link
Member Author

edemaine commented Jan 31, 2018

@kevinbarabash é and other accented Unicode characters currently work in math mode, and this PR breaks that (to match XeTeX behavior).

@ylemkimon Hmm, probably not... Really, they shouldn't be supported at all right now, because they're not in the fonts. But maybe we can try to predict the future where they are in the fonts. I don't think we'll ever support Åå in math mode, for example, because math mode just doesn't have that accent (in LaTeX). So maybe we should just fully drop support for them now (in math mode only)? Scratch that, now we have \mathring! (#1125)

Çç and the other characters ought to be supported (by the font) sometime soon, but is blocked on https://github.com/KaTeX/katex-fonts/issues/2 and https://github.com/KaTeX/katex-fonts/issues/25 . I tested in XeTeX, and $ÇÐÞçðþ$ renders ð for some reason but none of the others. So we could keep these in the text symbol list, but drop them from the math symbol list. I'd personally be a fan of just dropping them altogether from math support. Or we could manually somehow add math support when unicodeTextInMathMode === true, I guess. Votes?

@k4b7
Copy link
Member

k4b7 commented Jan 31, 2018

@edemaine got it. I forgot whether we had already made that change or not. I guess not.

@ylemkimon
Copy link
Member

I think extraLatinMath can be removed as it's in unicodeSymbols.js and #1125 was merged.

@k4b7
Copy link
Member

k4b7 commented Feb 11, 2018

I want to get #1143 into to 0.9.0 final first. Then we can start 0.10.0-alpha with this.

@edemaine
Copy link
Member Author

@kevinbarabash That's fine, though personally I'd lean toward reducing the number of backward-incompatible changes (with non-alpha/beta changes). I'll revise this PR today for consideration, but also fine to wait on it.

Personally I actually prefer the current KaTeX behavior of "just working" with Unicode. It's a shame that LaTeX can't do that without a lot of work. Maybe the default unicodeTextInMathMode should be true? Or maybe we can change it to that once we can provide a katex.sty file that does accent mapping for text characters...

* Fix double handling of ð (math maps to \eth, not special Unicode character)
* Remove Åå special math handling, thanks to KaTeX#1125
@edemaine
Copy link
Member Author

I removed the special handling of Å and å. (This should really have been done in #1125. Sorry!)

I also realized (thanks to some additional tests) that ð was being handled twice -- once in the \eth definition and once in extraLatin. I split this out (we still need an "extra" definition for ð for text mode, for now).

I think the commit for the above changes should be merged before 0.9.0 final, if possible, as they are bug fixes. I can split this into a separate PR if you prefer.

This PR is also ready to go. It now checks for extraLatin symbols and rejects them if they appear in math mode when unicodeTextInMathMode is false. When we eventually remove extraLatin, we can remove this check.

@codecov
Copy link

codecov bot commented Feb 12, 2018

Codecov Report

Merging #1117 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1117      +/-   ##
=========================================
- Coverage   79.61%   79.6%   -0.02%     
=========================================
  Files          59      59              
  Lines        3876    3879       +3     
  Branches      652     653       +1     
=========================================
+ Hits         3086    3088       +2     
- Misses        656     657       +1     
  Partials      134     134
Impacted Files Coverage Δ
src/Parser.js 92.89% <100%> (-0.23%) ⬇️
src/symbols.js 100% <100%> (ø) ⬆️
src/Settings.js 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7de91f7...e85d540. Read the comment docs.

@k4b7
Copy link
Member

k4b7 commented Feb 13, 2018

I think the commit for the above changes should be merged before 0.9.0 final, if possible, as they are bug fixes. I can split this into a separate PR if you prefer.

Splitting it out in a separate PR would be the easiest b/c of the other breaking changes in this diff.

@k4b7
Copy link
Member

k4b7 commented Feb 16, 2018

@edemaine love the "next version" tag. ^_^

@k4b7
Copy link
Member

k4b7 commented Feb 18, 2018

I've release v0.9.0 so we'll be able to merge this after review.

Copy link
Member

@k4b7 k4b7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I'm sure lots of people will be excited for this new setting.

@k4b7 k4b7 mentioned this pull request Feb 20, 2018
2 tasks
@k4b7 k4b7 merged commit aed1c1e into KaTeX:master Feb 20, 2018
@ylemkimon ylemkimon mentioned this pull request Aug 18, 2018
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants