Conversation
k4b7
left a comment
Nice start... requesting more tests and a polyfill for String.prototype.normalize.
});

it("should parse Latin-1 outside \\text{}", function() {
    expect('ÀàÇçÉéÏïÖöÛû').toParse();
Can we keep these tests but only check for Çç? Eventually we'll want to test for œ, Œ, æ, Æ, and ß when they get added.
This PR actually removes support for Çç. I believe they were added prematurely when forcing support for Latin-1. I just verified that none of the Latin-1 glyphs, Çç included, are in the KaTeX fonts. (I believe this actually makes sense, and the purpose of this PR is to map them into the accent macros, like LaTeX would.) Here's an example of incorrect looking Çç vs Cc in the current KaTeX:
I opened https://github.com/KaTeX/katex-fonts/issues/2 to track the cedilla issue.
expect("ÀÁÂÃÄÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäèéêëìíîïñòóôõöùúûüýÿ")
    .toParseLike(
        "\\grave A\\acute A\\hat A\\tilde A\\ddot A" +
        "\\grave E\\acute E\\hat E\\ddot E" +
I guess expanding to \grave A makes sense. Can you add a MathML fixture test? I'm curious to see if we recombine the accent with the letter when we render to MathML.
They render exactly how accent commands do, which I learned is via <mover accent="true">. I think this makes sense... Test added.
const commentRegexString = "%[^\n]*[\n]";
const controlWordRegexString = "\\\\[a-zA-Z@]+";
const controlSymbolRegexString = "\\\\[^\uD800-\uDFFF]";
const combiningDiacriticalMarkString = "[\u0300-\u036f]";
This contains many more characters than we support. Where do we generate an error if someone uses an unsupported combining mark? Maybe add a test for this.
I added an error test (in error-spec.js because I wanted toFailWithParseError). Incidentally, most of my tests are in katex-spec.js because I wanted toParseLike, but they would make more sense in unicode-spec.js... Not sure how to best share code between these test modules.
I've created #1021 to extract helper functions into test-utils.js. I can do that this weekend and then we could rebase this with the tests in the proper suite.
src/Parser.js
Outdated
}
// Decompose symbol into base and combining characters
// if the combined symbol is not recognized.
text = text.normalize('NFD');
This is super cool. I wish I would've known about this sooner. There's no support for this in IE and Safari only got this in version 10. We might consider using https://github.com/walling/unorm instead.
Ah, I should have checked the browser compatibility matrix for String.normalize. The only problem with a polyfill is that it increases the katex.min.js build by 60%, from 226,836 bytes to 363,641 bytes (as any polyfill requires a huge table of Unicode symbols).
So I think this raises a serious question of whether we should:
- Include the polyfill.
- Not include the polyfill, and support Unicode in all supported browsers except IE.
- Rewrite the PR to avoid String.normalize, at least in the built form. We actually only want symbols that expand to accents within our supported set, so the symbol table we could generate at build time could be smaller than unorm's.
- Drop this PR altogether (though this is probably a desirable feature, and IE probably isn't evolving).

I think option 2 or 3 is best. Thoughts? I can look into option 3.
It's in Edge so my vote would be for option 2. If people need to support IE then they can add the polyfill themselves. We should document this requirement.
Sounds good. That's the easiest! I'll add a fallback when normalize isn't implemented, maybe with a warning, and add documentation.
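A minimal sketch of such a fallback, under the assumption that feature detection plus a pass-through is acceptable (the helper name is hypothetical, not the actual Parser.js code):

```javascript
// Hypothetical helper, not the actual KaTeX code: decompose via NFD when
// the runtime supports String.prototype.normalize, else warn and pass through.
function normalizeIfAvailable(text) {
    if (typeof String.prototype.normalize === "function") {
        return text.normalize("NFD");
    }
    // Without normalize (e.g. IE), precomposed characters such as "é"
    // simply won't be decomposed into base letter + combining accent.
    console.warn("String.prototype.normalize is not supported; " +
        "accented Unicode characters may not parse");
    return text;
}

// "é" (U+00E9) becomes "e" + U+0301 on runtimes with normalize support.
console.log(normalizeIfAvailable("\u00e9").length);  // 2
```

On an IE-like runtime the input string comes back untouched, so unsupported characters fail later with an ordinary parse error rather than a crash.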
src/Parser.js
Outdated
} else if (cjkRegex.test(text)) {
    this.consume();
    return newArgument(
        new ParseNode("textord", text, this.mode, nucleus));
If we bail here, do we need to include the combiningDiacriticalMarkString with surrogate pairs in the regex in Lexer.js?
combiningDiacriticalMarkString is already included in Lexer's tokenRegex both after a single codepoint and after a surrogate pair.
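As a rough, simplified sketch of the structure being described (the real tokenRegex has more alternatives, e.g. for comments and control words):

```javascript
// Simplified sketch of how Lexer.js assembles its token regex: combining
// marks are permitted both after a single BMP code point and after a
// surrogate pair. Control-word and comment alternatives are omitted here.
const combiningDiacriticalMarkString = "[\u0300-\u036f]";
const tokenRegexString =
    "([ \r\n\t]+)|" +                                  // whitespace
    "([!-\\[\\]-\u2027\u202A-\uD7FF\uF900-\uFFFF]" +   // single code point...
    combiningDiacriticalMarkString + "*" +             // ...plus combining marks
    "|[\uD800-\uDBFF][\uDC00-\uDFFF]" +                // or a surrogate pair...
    combiningDiacriticalMarkString + "*)";             // ...plus combining marks
const tokenRegex = new RegExp(tokenRegexString);

// A decomposed "é" (e + U+0301) lexes as one two-code-unit token:
console.log(tokenRegex.exec("e\u0301")[0].length);  // 2
```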
We have to lex the tokens before we can test the tokens against the cjkRegex here, d'oh.
const accent = match[0][i];
if (!accents[accent]) {
    throw new ParseError(`Unknown accent '${accent}'`, nucleus);
}

defineSymbol(math, main, mathord, ch, ch);
defineSymbol(text, main, textord, ch, ch);
}

// Transform combining characters into accents
if (match) {
    for (let i = 0; i < match[0].length; i++) {
        const accent = match[0][i];
Are there any unicode glyphs that include multiple accents? If so, we should test that this works correctly.
I believe the answer is "no" -- all the examples I've seen (e.g. here) show up to one accent in the first character, and the rest obtained with combining characters. But given the combiningDiacriticalMarkString in the Lexer, there could be arbitrarily many combining characters at this point in the code.
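A quick illustration of that second point, runnable in plain Node (nothing KaTeX-specific): NFD really can emit more than one combining mark after a single base letter.

```javascript
// U+1E17 (ḗ, e with macron and acute) decomposes under NFD into a base
// letter followed by two combining marks, so a decomposition loop can see
// multiple accents arising from a single input character.
const decomposed = "\u1e17".normalize("NFD");
console.log(decomposed.length);  // 3
console.log([...decomposed].map((c) => c.codePointAt(0).toString(16)));
// e (65), combining macron (304), combining acute (301)
```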
@edemaine this is looking pretty good. It looks like some of the screenshot tests use …
src/symbols.js
Outdated
const mathTextSymbols = "0123456789/@.\"";
for (let i = 0; i < mathTextSymbols.length; i++) {
    const ch = mathTextSymbols.charAt(i);
for (const ch of "0123456789/@.\"") {
Nice cleanup, but I think this may actually cause the bundle size to increase quite a bit b/c it requires a polyfill for the string iterator, which I believe uses .codePointAt(), which also ends up being polyfilled. Can you generate a build and see how much the bundle size has increased?
I don't think there are polyfills involved, but this is what it expands to:
var _iteratorNormalCompletion = true;
var _didIteratorError = false;
var _iteratorError = undefined;
try {
for (var _iterator = (0, _getIterator3.default)("0123456789/@.\""), _step; !(_iteratorNormalCompletion = (_step = _iterator.next()).done); _iteratorNormalCompletion = true) {
var _ch = _step.value;
defineSymbol(math, main, textord, _ch, _ch);
}
// All of these are textords in text mode
} catch (err) {
_didIteratorError = true;
_iteratorError = err;
} finally {
try {
if (!_iteratorNormalCompletion && _iterator.return) {
_iterator.return();
}
} finally {
if (_didIteratorError) {
throw _iteratorError;
}
}
}

Or minified:
var M=true;var _=false;var S=undefined;try{for(var z=(0,n.default)('0123456789/@."'),T;!(M=(T=z.next()).done);M=true){var A=T.value;u(o,f,k,A,A)}}catch(e){_=true;S=e}finally{try{if(!M&&z.return){z.return()}}finally{if(_){throw S}}}var C=true;var O=false;var N=undefined;try{for(var j=(0,n.default)('0123456789!@*()-=+[]<>|";:?/.,'),L;!(C=(L=j.next()).done);C=true){var E=L.value;u(s,f,k,E,E)}}catch(e){O=true;N=e}finally{try{if(!C&&j.return){j.return()}}finally{if(O){throw N}}}
So it does add a few hundred bytes. Shall I switch back?
(To clarify, _getIterator3 gets defined even without these lines.)
The code that it generates is pretty awful. We should experiment with loose mode for the for-of plugin for Babel sometime; see https://babeljs.io/docs/plugins/transform-es2015-for-of. Interestingly, when iterating over an array literal it will output a basic for loop. Maybe we could do for (const letter of "abc...".split("")) { ... } to force this optimization.
How about the following, which Babel seems to leave as-is?
"0123456789/@.\"".split("").forEach(function (ch) {
    defineSymbol(math, main, textord, ch, ch);
});

Wow, this is actually fastest! jsperf benchmark
right, b/c it still has to call charCodeAt... yeah, let's go back to regular for loops.
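For reference, the plain indexed loop being reverted to looks like this; the stubs for defineSymbol and its arguments are stand-ins for the real definitions in src/symbols.js:

```javascript
// Stubs standing in for the real definitions in src/symbols.js.
const symbols = {};
function defineSymbol(mode, font, group, replace, name) {
    symbols[mode + ":" + name] = {font, group, replace};
}
const math = "math";
const main = "main";
const textord = "textord";

// The indexed loop: no iterator protocol involved, so Babel emits it
// unchanged and no string-iterator polyfill gets pulled into the bundle.
const mathTextSymbols = "0123456789/@.\"";
for (let i = 0; i < mathTextSymbols.length; i++) {
    const ch = mathTextSymbols.charAt(i);
    defineSymbol(math, main, textord, ch, ch);
}
console.log(Object.keys(symbols).length);  // 14
```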
@kevinbarabash Looking at the history, I realize that the "weak" support for Latin-1 characters (using the wrong font) is actually really old, so we should keep support for those characters for backwards compatibility. I added a TODO comment about how they're currently supported "incorrectly". I also updated the Unicode screenshot, which shows proper support for many characters but not all (like …).
Not sure why Travis isn't running the tests. I had to restart the tests manually.
I added a test and warning for …
I think this is ready for final review. We can move the tests around later (?).
@edemaine sorry I've been holding off on approving this. I've been doing some more thinking about making people add their own polyfill for IE support, and I wonder how many people are still supporting IE. For the record, KA is still supporting IE10+. Although we haven't upgraded KaTeX recently, if we were to do so we probably wouldn't want to include the whole …
That makes sense. Maybe I should try to get it working with normalize only during build, and not requiring it at run time, then.
Hi, …
@Syndesi I'd suggest opening a separate issue (or PR) for this idea. Actually, there might be one; I'd need to search. Rendering repeatedly in the browser to measure box sizes is the general approach of MathJax, which is the main reason (I believe) it's so slow in comparison to KaTeX. It also prevents server-side rendering, which KaTeX currently supports. So it's unlikely that this will happen, but feel free to open/find a discussion about it.
@edemaine were you thinking of using …
@kevinbarabash Yes, exactly. Maybe there's a …

I'd set this up only to recognize accents that KaTeX can handle, so it'd hopefully be not too large a list. We can still handle combining characters in exactly the way of this PR, just skipping the normalization. Instead, I think I'd define (in …
OK, I took a crack at this: multi-accented characters were challenging to get right (glad we added a test for them!), and it turns out there are characters with two accents. I checked and there aren't any triple-accented characters. I'm not sure of the best location for …
@@ -0,0 +1,290 @@
// This file is GENERATED by unicodeMake.js. DO NOT MODIFY.

export const unicodeSymbols = {
    "\u00e1": "\u0061\u0301",  // á = \'{a}
k4b7
left a comment
I have a few minor concerns/questions, but other than that it looks really solid.
src/unicodeMake.js
Outdated
console.log("// This file is GENERATED by unicodeMake.js. DO NOT MODIFY.");
console.log("");
console.log("export const unicodeSymbols = {");
Do we need this in addition to the export default?
I'm still somewhat new to ES6, so I'm not sure what's normal here. I still find it weird that export default const is forbidden. I believe we could just do export default {...} if we don't need to name the object, so maybe that's best? The current system allows importing in two different ways (default or nondefault), but that's probably not necessary.
I find the lack of support for export default const weird as well. You can do:
export { foo as default };
I guess it's a moot point now since there no longer seems to be an export default in this file.
src/unicodeMake.js
Outdated
// Read supported accents from Parser.js
const Parser = fs.readFileSync('./Parser.js', {encoding: 'utf8'});
const match = /accents = [^;]*;/.exec(Parser);
eval(match[0]);
Maybe we could have Parser.js export accents instead.
I should add a comment about this. I originally tried it that way, but Node doesn't support import, and I'm running unicodeMake.js in Node, so Babel would have to get involved, and the accents are not available in built katex.js. I'm open to better suggestions! We could, for example, put accents here in unicodeMake.js and/or unicodeSymbols.js and import them in Parser.js -- perhaps that's best.
OK, I tried moving them. It's not especially pretty -- I still eval'd the code in unicodeMake, which led to extra \ escaping in the accents definition... Let me know if you see a cleaner way! But at least I'm no longer parsing Parser.js with a regex!
for (const accent of Object.getOwnPropertyNames(accents)) {
    const combined = letter + accent;
    const normalized = combined.normalize('NFC');
    if (normalized.length === 1) {
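The generator's core trick can be shown in isolation. This is a self-contained sketch with a toy three-accent list, not the real unicodeMake.js tables:

```javascript
// Toy subset of the accent table: grave, acute, and circumflex marks.
const accents = ["\u0300", "\u0301", "\u0302"];
const unicodeSymbols = {};
for (const letter of ["a", "e", "q"]) {
    for (const accent of accents) {
        const combined = letter + accent;
        const normalized = combined.normalize("NFC");
        // A length-1 result means Unicode defines a precomposed character,
        // so we can map it back to its base + combining-mark decomposition.
        if (normalized.length === 1) {
            unicodeSymbols[normalized] = combined;
        }
    }
}
console.log(Object.keys(unicodeSymbols).length);  // 6: àáâ and èéê; q composes with none
```

Letters with no precomposed form (like q here) simply produce no table entry, which is why the generated file stays small.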
});

it("should parse 'ΓΔΘΞΠΣΦΨΩ'", function() {
    expect("ΓΔΘΞΠΣΦΨΩ").toParse();
I see, these got moved down to unicode-spec.js.
// TODO(edemaine): Unsupported Latin-1 letters in text: ÆÇÐØÞßæçðøþ
expect("\\text{ÀÁÂÃÄÅÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåèéêëìíîïñòóôõöùúûüýÿ}")
    .toParseLike(
        "\\text{\\`A\\'A\\^A\\~A\\\"A\\r A" +
\\\" caught me off-guard until I realized that the " needs to be escaped as well.
test/unicode-spec.js
Outdated
expect('여보세요').toNotParse();
it("should parse CJK outside \\text{}", function() {
    expect('私はバナナです。').toParse();
    expect('여보세요').toParse();
This change isn't really related to accents and should probably be in a separate PR.
It's relevant in that this PR rewrites how we parse Unicode. The relevant line change is from if (this.mode === "text" && cjkRegex.test(nucleus.text)) to if (cjkRegex.test(text)). I can't make a separate PR now because they'd both change that line, but if you like, I can remove that change from here and add a second PR once this one is accepted.
FWIW, this one-line change would fix #895.
I can remove that change from here and add a second PR once this one is accepted.
I think that makes the most sense. Thanks. :)
// This is an internal Node tool, not part of the KaTeX distribution,
// whose purpose is to generate unicodeSymbols.js in this directory.
// In this way, only this tool, and not the distribution/browser,
// needs String's normalize function.
Can you include when this tool should be run?
unicode:
	cd src && $(NODE) unicodeMake.js >unicodeSymbols.js
src/unicodeSymbols.js: unicode
I wonder if we can run this as part of the build and then not even bother with committing the result. That way we don't have to worry about whether people are running it when they should.
My motivation for committing it is so that people wouldn't worry about when to run it. (You need to run it whenever we add support for a new accent, or when the Unicode spec changes, neither of which I imagine being very common.) Another advantage is that we can directly see what characters are supported, so it's easy to complain about a missing one, and for PRs to show what additional characters are being supported.
If we do it during build, would we also have to do it when running the test server...? This approach felt more complicated to me, but I'm happy to have a specific suggestion.
Let's leave it like this for now. I definitely like committing it from a documentation point of view. I don't think we have much work left to do on accents, so there shouldn't be too many PRs affected by this.
src/unicodeMake.js
Outdated
    '\\u030a': {text: '\\\\r'},
    '\\u030b': {text: '\\\\H'},
}`;
eval(accentsCode);
I'm confused why the eval is necessary, can't we just console.log(accentsCode); this as is?
A better solution might be to do:
const {accents} = require('./unicodeSymbols.js');
const encodedAccents = {};
for (const [key, value] of Object.entries(accents)) {
    const encodedEntry = {};
    encodedAccents[encode(key)] = encodedEntry;
    if (value.text) {
        encodedEntry.text = encode(value.text);
    }
    if (value.math) {
        encodedEntry.math = encode(value.math);
    }
}
This way we have one source of truth for the accents.
First, the eval is necessary to load this data into the current module. console.log is what outputs the same code to unicodeSymbols.js. In this way, there is already just one source, accentsCode in unicodeMake.js, which generates unicodeSymbols.js which has a copy of the data, but that file should never be modified...
Your bootstrapping approach above is neat, but feels awkward. You're asking unicodeMake.js to first import unicodeSymbols.js, but then modify unicodeSymbols.js... In particular, it won't work with the > piping as is, and it's fragile: if unicodeSymbols.js ever got corrupted (e.g. from a bug in unicodeMake.js), you'd lose the accents table.
I have a third approach: make a unicodeAccents.js that just gives the accents list, and is written with Node-style modules (module.exports = ... instead of export). Then it should be both requireable in unicodeMake.js and (I think) importable by the rest of KaTeX.
@edemaine do you know what package we need to load to render the …
@kevinbarabash I think mainly we need to use …

Looks like I'll have to include some more packages to get Cyrillic and CJK fonts. There's also the known "missing …
I tried putting the accents in their own Node-style module, …

(I also rebased, so it will look like the old commits repeating themselves. Only the last 3 are new.)
The good news: …

Areas for improvement: …

The items that are starred are things that should probably be resolved before merging this PR.
@edemaine thanks for refactoring …
@kevinbarabash I guess for backwards compatibility, we could just let …

I'm guessing that the squishing of the accents on capital letters is also some kind of font issue (another symbol that we need), but I don't actually see where it's implemented in LaTeX... I just see …

The horizontal spacing is definitely something that can be fixed. It's an existing issue with …
This is important for multi-accented characters.
New texcmp with #1033 merged (via rebase):

Ready to merge? Still the cedilla issue and accent-on-capital-letter issue, which we can add. (Neither is specific to this PR.)

I also added support for accented Greek letters, as I recalled/noticed there are a handful of those as well. (I didn't add e.g. Greek Capital Letter A(lpha) with acute accent because we don't support Greek Capital Letter A(lpha) yet.)

@ronkok Does …
@edemaine, I am not aware of a stand-alone Unicode character for Greek capital alpha with acute accent, so …
I think so. We can deal with the cedilla and …
@edemaine I removed some of the letters from the math-mode test since we aren't rendering them with the correct font anyway.
@edemaine thanks for all of the hard work on this. I'm glad it's finally merged.
@ronkok I was indeed referring to U+0386 Ά which is "Greek Capital Letter A with acute accent" according to this list. (I assume A means Alpha here, though I'm no expert.) I see that other lists call it "GREEK CAPITAL LETTER ALPHA WITH TONOS", but it does canonicalize into U+0391 (Greek Capital Alpha) followed by U+0301 (combining acute accent). There are similar "Latin-looking" (and thus no-LaTeX-macro) characters Έ Ή Ί Ό Ύ. But if LaTeX doesn't support them, fine for us not to. @kevinbarabash Thanks for pushing and helping to make this PR so much better! Together with @ronkok's Unicode symbol support, the next KaTeX release will be really Unicode friendly!
Very cool stuff. I'll start to write it into the KaTeX function support page. @edemaine, when I submit the PR for the updated function support page, I would appreciate it if you would look it over. I'm not sure that I have picked up on everything that you and @kevinbarabash are doing here.
@ronkok Sounds good! For this PR, the new single symbols should all be in src/unicodeSymbols.js. They currently work in both text and math mode (but I'm not sure they should work in math mode -- see #1046). In addition, the combining characters for all the supported text and math accents work, as listed in src/unicodeAccents.js. In addition, #1030 added …

Thanks for continuing to keep the documentation up-to-date!
For now, I'll write it up to suggest that they work only in text mode. I would like to keep math mode out of the docs until we're sure that it won't break future work. We can always change the docs later to include math mode. |





This PR fixes #801.
- parseSymbol now recognizes both combined and uncombined forms of Unicode accents, and builds accent objects just like the accent functions.
- Added CJK support to math mode (not just text mode), fixing "KaTeX parse error: Expected 'EOF', got '<bla>' at position 1" when the input is pure Chinese character(s) (#895).