Accept all existing Greek letters using unicode characters in math mode by k4b7 · Pull Request #410 · KaTeX/KaTeX

k4b7 · 2015-12-05T23:57:35Z

Test Plan:

make test
run screenshot tests on travis-ci

wchargin · 2015-12-09T05:46:25Z

src/Lexer.js

\u03D6 is \varpi ϖ, \u03D5 is \phi ϕ, \u03F5 is \epsilon ϵ. KaTeX outputs all these (try \varpi \phi \epsilon and look at the Unicode code points of the output); do you want to include them?

And \u03D1 is \vartheta ϑ, \u03F1 is \varrho ϱ, also supported.

k4b7 (in reply, Dec 9, 2015): May as well add them since we have those glyphs. Good catch.
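The code points named in this review thread can be gathered into a small lookup table. This is only an illustrative sketch: `variantGreek` and `commandFor` are hypothetical names and are not taken from KaTeX's Lexer.

```javascript
// Variant Greek letters named in the review comments, keyed by character.
// `variantGreek` and `commandFor` are hypothetical names for illustration.
const variantGreek = {
    "\u03D1": "\\vartheta", // ϑ
    "\u03D5": "\\phi",      // ϕ
    "\u03D6": "\\varpi",    // ϖ
    "\u03F1": "\\varrho",   // ϱ
    "\u03F5": "\\epsilon",  // ϵ
};

// Return the TeX command for a variant character, or undefined if unknown.
function commandFor(ch) {
    return Object.prototype.hasOwnProperty.call(variantGreek, ch)
        ? variantGreek[ch]
        : undefined;
}

console.log(commandFor("\u03D6")); // prints \varpi
```

None of these five code points falls inside the basic α..ω block, which is why each has to be listed on its own.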

ronkok · 2015-12-26T17:08:37Z

Similarly, U+03B5 ↦ \varepsilon ε, U+03F0 ↦ \varkappa ϰ, U+03C2 ↦ \varsigma ς, U+03C6 ↦ \varphi φ. KaTeX supports all of them.

I would include \digamma, but there is an issue to resolve first.

The AMS (and KaTeX) function \digamma looks like U+03DC, capital letter Ϝ. But the unicode-math and Teubner packages map \digamma to U+03DD, small letter ϝ. It is \Digamma that they map to U+03DC, capital letter Ϝ.

It’s a problem. The unicode-math function names are consistent with the naming convention for other Greek letters. But AMS is more popular. I don’t know the best way to resolve the collision.
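The collision can be made concrete with two small tables. This is a sketch under a simplifying assumption: the AMS command is treated as if it mapped to U+03DC, although the comment above only says its glyph looks like that character. `amsStyle`, `unicodeMathStyle`, and `conflicts` are hypothetical names.

```javascript
// Two naming conventions for digamma, as described in the comment above.
// Assumption: the AMS-style glyph is modelled as U+03DC for comparison.
const amsStyle = {
    "\\digamma": "\u03DC", // looks like capital Ϝ
};
const unicodeMathStyle = {
    "\\Digamma": "\u03DC", // capital Ϝ
    "\\digamma": "\u03DD", // small ϝ
};

// List the commands that both conventions define but map differently.
function conflicts(a, b) {
    return Object.keys(a).filter(
        (cmd) => Object.prototype.hasOwnProperty.call(b, cmd) && a[cmd] !== b[cmd]
    );
}

console.log(conflicts(amsStyle, unicodeMathStyle));
```

Only `\digamma` is reported, which is exactly the name whose meaning would change if the other convention were adopted later.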

k4b7 · 2015-12-26T18:16:16Z

@ronkok your comment made me realize that I hadn't updated the token regex. I have added the glyphs for \varepsilon, \varsigma, and \varphi. I will add \varkappa and the AMS version of \digamma. I don't believe any of our fonts includes a glyph for the small letter ϝ.

wchargin · 2015-12-26T20:09:37Z

@ronkok Good catch about U+03F0 \varkappa, but the other three (U+03B5, U+03C2, U+03C6) are handled by [\u03B1-\u03C9], right? (The ε, ς, and φ characters are the "normal" characters output by pressing e, w (for "word-final sigma"), or f (respectively) on a Greek keyboard.) Do they need any additional special casing?
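The range question is easy to check mechanically. The sketch below tests the four code points from the discussion against the character class quoted above; `lowerGreek`, `candidates`, and `coverage` are hypothetical names, and the regex is a standalone copy of the quoted class rather than the Lexer's full token regex.

```javascript
// Standalone copy of the lowercase character class quoted in the comment.
const lowerGreek = /^[\u03B1-\u03C9]$/;

// Code points under discussion, paired with the commands they correspond to.
const candidates = [
    ["\u03B5", "\\varepsilon"],
    ["\u03C2", "\\varsigma"],
    ["\u03C6", "\\varphi"],
    ["\u03F0", "\\varkappa"],
];

// For each candidate, report its code point and whether the class matches it.
const coverage = candidates.map(([ch, command]) => ({
    command,
    codePoint: "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    inRange: lowerGreek.test(ch),
}));

console.log(coverage);
```

The first three entries report `inRange: true`; only U+03F0 lies outside the class and needs separate handling.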

k4b7 · 2015-12-26T21:06:02Z

After thinking about the \digamma issue, I'm going to punt on it for now, because at some point in the future we will probably want to support more of the unicode-math package and I'd like to avoid changing the behavior.

ronkok · 2015-12-27T03:15:17Z

@wchargin, you are correct about U+03B5, U+03C2, and U+03C6. They do not require any special casing. My bad.

@kevinbarabash, good call. It's okay to go slow. I don't hear the world calling out for an expedited decision on \digamma.

k4b7 · 2015-12-31T00:26:06Z

@wchargin I misread my own regex. This looks like it's good to go.

wchargin · 2015-12-31T00:40:58Z

Haha, okay :) The regex seems good to me except for the bizarre case of U+03A2, which is not a valid Unicode character for some reason. This seemed weird to me, but I confirmed it in my official Unicode Consortium book and it is correct. For reference, here's the full Greek table:

(The next page, which has all the codepoints' official names and decompositions, lists it as "reserved.")

Other than that, the characters that match the regex are ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩαβγδεζηθικλμνξοπρςστυφχψω (plus the reserved U+03A2), which seems fine to me.

# Python 3; range() excludes its upper bound, hence the + 1
''.join(chr(x) for x in list(range(0x0391, 0x03A9 + 1)) + list(range(0x03B1, 0x03C9 + 1)))

(I'm definitely not qualified to review the actual KaTeX aspect of it though, sorry 😕 )
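The same enumeration can be written in JavaScript, the language of the regex itself. This sketch assumes the regex covers U+0391..U+03A9 and U+03B1..U+03C9 with both endpoints included, as a character class would; `charsInRange`, `upper`, `lower`, and `unassigned` are hypothetical names. It relies on the `\p{Cn}` Unicode property escape, which matches code points that have no assigned character.

```javascript
// Build the string of characters in an inclusive code point range.
function charsInRange(first, last) {
    let out = "";
    for (let cp = first; cp <= last; cp++) {
        out += String.fromCodePoint(cp);
    }
    return out;
}

// Assumed ranges: capital Α..Ω and small α..ω, endpoints included.
const upper = charsInRange(0x0391, 0x03A9);
const lower = charsInRange(0x03B1, 0x03C9);

// Collect every code point in the ranges that Unicode leaves unassigned.
const unassigned = [...(upper + lower)].filter((ch) => /\p{Cn}/u.test(ch));

console.log(upper.length, lower.length);
console.log(unassigned.map((ch) => "U+0" + ch.codePointAt(0).toString(16).toUpperCase()));
```

Each range holds 25 code points, and the only unassigned one is U+03A2, the gap where a capital form of final sigma would otherwise sit.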

k4b7 · 2015-12-31T00:52:20Z

If it's reserved, it's probably going to be hard for a person to type it in. There are some characters (capital letters) which we don't have glyphs for. I'll see what happens when they're typed in and report back.

k4b7 · 2015-12-31T01:06:08Z

It throws on all of the Greek glyphs that we don't have, e.g. Α, Β, Ε, etc. which I think is acceptable.

wchargin · 2015-12-31T01:16:11Z

Yeah, that's a tricky one. It always bothered me that TeX has no \Alpha command (whatever happened to "semantics matter"?). The two obvious choices that I see are to disallow entering Α (U+0391 GREEK CAPITAL LETTER ALPHA) or to alias it to A (U+0041 LATIN CAPITAL LETTER A). If we're ever planning to add support for user-defined fonts, for which the two glyphs could be distinct, that might make a difference.

In the spirit of forward-compatibility, your decision to disallow them for now sounds good to me; you can always add them later, but they'd be harder to remove. @xymostech ?
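The two choices can be sketched as a policy switch. Everything here is hypothetical: KaTeX has no such option, `resolveCapital` and `LATIN_LOOKALIKES` are invented names, and only the Α to A alias comes from the comment above (the Β and Ε entries are added by analogy as an assumption).

```javascript
// Greek capitals without their own glyph, mapped to Latin look-alikes.
// Only the first entry is from the thread; the others are assumed by analogy.
const LATIN_LOOKALIKES = new Map([
    ["\u0391", "A"], // Α GREEK CAPITAL LETTER ALPHA
    ["\u0392", "B"], // Β GREEK CAPITAL LETTER BETA
    ["\u0395", "E"], // Ε GREEK CAPITAL LETTER EPSILON
]);

// policy "alias": substitute the Latin letter; any other policy: reject.
function resolveCapital(ch, policy) {
    if (!LATIN_LOOKALIKES.has(ch)) {
        return ch; // not one of the glyph-less capitals in this sketch
    }
    if (policy === "alias") {
        return LATIN_LOOKALIKES.get(ch);
    }
    const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    throw new Error("Unsupported character: U+" + hex);
}

console.log(resolveCapital("\u0391", "alias")); // prints A
```

Starting with the rejecting policy keeps both doors open: switching to aliasing later only makes more input valid, whereas removing an alias would break documents that already depend on it.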

ronkok · 2016-01-02T16:40:54Z

Should \mathbf{Ω} render the same as \mathbf{\Omega}? Would a test of that be worthwhile?

k4b7 · 2016-01-02T21:56:15Z

@ronkok definitely worth a test. I'll see to it Monday.

xymostech · 2016-04-18T18:53:29Z

@kevinbarabash The code of this looks good! Do you want to write the test that @ronkok suggested? Also, what happens when you put these in \text{}?

k4b7 · 2016-04-18T18:55:29Z

@xymostech will do. I totally forgot about that test. Thanks for reminding me.

k4b7 · 2017-01-15T04:38:35Z

@xymostech I finally got around to testing \text{} and it blew up b/c I hadn't defined the symbols for text mode. I've updated the diff to include text mode versions of all of these symbols.

k4b7 · 2017-01-15T04:43:54Z

@ronkok sorry for the delay. I just tested \mathbf{Ω}\mathbf{\Omega} and they render the same. I'm going to update the screenshot test to include this.

edemaine · 2017-06-15T18:37:16Z

Just bumped into this older PR. Looks like everything was good to go, and just needs a rebase and review?

k4b7 · 2017-06-16T13:05:42Z

I'll rebase this this evening.

edemaine · 2017-06-16T13:58:57Z

I did more reading of the various unicode PRs. I'm a little confused about what the latest/best approach is, but it does make me wonder whether this one will be necessary... Intuitively, defineSymbol ought to define both the unicode and \command version of a symbol. But the other work shows that the font's notion of which unicode symbol it is is not always what we want.

Perhaps we should add another optional argument to defineSymbol that gives the unicode character that should be recognized as this one? And then go through symbols by hand? Greek could still be the initial testing ground.

k4b7 · 2017-08-12T23:52:07Z

src/symbols.js

+defineSymbol(math, main, mathord, "\u03bc", "\\mu", true);
+defineSymbol(math, main, mathord, "\u03bd", "\\nu", true);
+defineSymbol(math, main, mathord, "\u03be", "\\xi", true);
+defineSymbol(math, main, mathord, "\u03bf", "\\omicron", true);


We have the omicron glyph in our fonts, so we may as well use it.

edemaine · 2017-08-13T14:20:57Z

src/symbols.js

+
+    if (acceptUnicodeChar) {
+        module.exports[mode][replace] = module.exports[mode][name];
+    }


Love this new approach! Avoids repetition of the Unicode symbol, and makes the decision of "include this unicode character" clear. I think if we ever want a unicode symbol to point to one not matching the font, then we can manually define that symbol.

This should extend pretty easily to other unicode characters that are simple matches.
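A stripped-down model shows how the flag behaves. This is a sketch, not the code in src/symbols.js: the local `symbols` object stands in for `module.exports`, the entry shape `{font, group, replace}` is an assumption, and plain strings replace the `math`, `main`, and `mathord` constants.

```javascript
// Simplified stand-in for the per-mode symbol tables.
const symbols = { math: {}, text: {} };

// Register `name` (e.g. "\\mu"). When acceptUnicodeChar is true, the
// character in `replace` becomes a second key for the very same entry.
function defineSymbol(mode, font, group, replace, name, acceptUnicodeChar) {
    symbols[mode][name] = { font, group, replace };
    if (acceptUnicodeChar) {
        symbols[mode][replace] = symbols[mode][name];
    }
}

defineSymbol("math", "main", "mathord", "\u03bc", "\\mu", true);
defineSymbol("math", "main", "mathord", "\u03b1", "\\alpha", false);

console.log(symbols.math["\u03bc"] === symbols.math["\\mu"]); // true
console.log("\u03b1" in symbols.math);                        // false
```

Because the Unicode key points at the same object as the command key, the two spellings cannot drift apart, and a symbol whose font character is not the desired input character can simply omit the flag and be registered by hand.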

edemaine · 2017-08-13T14:22:59Z

It looks like there's an error with the screenshot. Can you regenerate?



k4b7 added the GH Review: review-needed label Dec 5, 2015

k4b7 mentioned this pull request Dec 5, 2015

Support unicode in text #15

Closed

wchargin reviewed Dec 9, 2015

k4b7 added GH Review: needs-revision and removed GH Review: review-needed labels Dec 9, 2015

k4b7 force-pushed the greek_unicode_support branch from bf8c8ad to d2ae5ed Compare December 23, 2015 01:57

k4b7 added GH Review: review-needed and removed GH Review: needs-revision labels Dec 23, 2015

k4b7 added the GH Review: needs-revision label Dec 26, 2015

k4b7 removed the GH Review: review-needed label Dec 26, 2015

k4b7 added GH Review: review-needed and removed GH Review: needs-revision labels Dec 31, 2015

xymostech added GH review: accepted and removed GH Review: review-needed labels Apr 18, 2016

k4b7 self-assigned this Nov 1, 2016

k4b7 force-pushed the greek_unicode_support branch from d2ae5ed to f25a3b6 Compare January 15, 2017 04:36

k4b7 force-pushed the greek_unicode_support branch from f25a3b6 to 13fc10b Compare January 15, 2017 05:03

k4b7 added GH Review: review-needed and removed GH review: accepted labels Jan 15, 2017

k4b7 requested a review from xymostech January 20, 2017 01:42

k4b7 requested review from edemaine and removed request for xymostech June 16, 2017 13:04

k4b7 added GH Review: needs-revision and removed GH Review: review-needed labels Aug 11, 2017

k4b7 force-pushed the greek_unicode_support branch from 13fc10b to b2a757b Compare August 12, 2017 23:50

k4b7 commented Aug 12, 2017

k4b7 added GH Review: review-needed and removed GH Review: needs-revision labels Aug 12, 2017

edemaine approved these changes Aug 13, 2017

Accept all existing Greek letters using unicode characters in math mode

8150258

Test Plan: - make test - run screenshot tests on travis-ci Reviewers: emily

k4b7 force-pushed the greek_unicode_support branch from b2a757b to 8150258 Compare August 14, 2017 04:02

k4b7 merged commit e00738d into master Aug 14, 2017

k4b7 deleted the greek_unicode_support branch August 23, 2017 01:47

ronkok mentioned this pull request Sep 30, 2017

Add unicode symbols to symbols table #261

Closed

ronkok mentioned this pull request Oct 14, 2017

Support Unicode relations #933

Merged

ronkok mentioned this pull request Jun 5, 2018

[plugin system] Add a utility function (setFontMetrics) to extend builtin fontMetrics #1269

Merged
