Countries with a combined population of more than 700 million use the Persian, Arabic, and Urdu alphabets.
These countries also have a huge number of students; Iran alone has 18 million, for example.
Because of this, more and more educational startups are targeting these countries.
I tried to add support for the symbols of these languages to KaTeX with as small a footprint as possible.
I think the results are pretty good. I haven't added any screenshotter tests yet, but from testing with `make serve` and checking by eye, I think I got a fairly precise result.
We face 2 challenges when we try to add these symbols:
- Schools in these countries use Eastern Arabic numerals as opposed to (Western) Arabic numerals.
- In the English alphabet, for example, each character code has a single glyph with a specified size, but the characters of these alphabets are cursive, so they change shape and size.

Here is what I did:
- Adding the symbols was easy because, like CJK characters, all of these alphabets are in one uniform Unicode range, `[0x0600 - 0x06FF]`.
- I couldn't use the default KaTeX fonts because they don't have this character set and the glyphs have serious problems. So I added a new (open) font called Vazir-Code. The biggest advantage of this font is that it is monospaced, so cursive characters have roughly the same height and width everywhere.
- I calculated the font metrics using the awesome fontkit package.
- Then I added the Eastern Arabic numerals as `textord`s in `math` mode and the alphabet as `textord`s in `text` mode.
- Finally, I added a new font beside the `main` and `math` fonts.
- In `buildCommon` I add a new CSS class that only sets the font-family to Vazir-Code.
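The classification and metric steps above can be sketched roughly as follows. This is not the code from the commit, only a minimal illustration: the helper names (`isInArabicBlock`, `isEasternArabicDigit`, `textordModes`, `toEmMetrics`) and the `GlyphBox` shape are hypothetical, treating every non-digit character in the block as "alphabet" is a simplifying assumption, and so is expressing metrics in em units by dividing font units by the font's units-per-em value.

```typescript
// Hypothetical helpers for illustration only; none of these are KaTeX APIs.

type Mode = "math" | "text";

// The Unicode range mentioned above.
const ARABIC_BLOCK_START = 0x0600;
const ARABIC_BLOCK_END = 0x06ff;

// True if the first character of `ch` lies in [0x0600, 0x06FF].
// charCodeAt is enough here because the whole range is in the BMP.
function isInArabicBlock(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return code >= ARABIC_BLOCK_START && code <= ARABIC_BLOCK_END;
}

// Eastern Arabic digits: U+0660..U+0669 (Arabic-Indic) and
// U+06F0..U+06F9 (Extended Arabic-Indic, used for Persian and Urdu).
function isEasternArabicDigit(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return (code >= 0x0660 && code <= 0x0669) ||
         (code >= 0x06f0 && code <= 0x06f9);
}

// Modes in which a character would be accepted as a textord under the
// scheme described above: digits in math mode, everything else in the
// block in text mode (assumption: all non-digits count as "alphabet").
function textordModes(ch: string): Mode[] {
  if (!isInArabicBlock(ch)) {
    return [];
  }
  return isEasternArabicDigit(ch) ? ["math"] : ["text"];
}

// Hypothetical glyph measurements in font units.
interface GlyphBox {
  yMin: number;
  yMax: number;
  advanceWidth: number;
}

interface EmMetrics {
  depth: number;
  height: number;
  width: number;
}

// Convert font-unit measurements to em units. Clamping at zero is a
// choice made for this sketch, so glyphs that sit entirely above the
// baseline get depth 0 instead of a negative depth.
function toEmMetrics(box: GlyphBox, unitsPerEm: number): EmMetrics {
  return {
    depth: Math.max(0, -box.yMin) / unitsPerEm,
    height: Math.max(0, box.yMax) / unitsPerEm,
    width: box.advanceWidth / unitsPerEm,
  };
}

// "\u06F5" is the Persian digit five, "\u0633" is the letter seen.
console.log(textordModes("\u06F5")); // [ 'math' ]
console.log(textordModes("\u0633")); // [ 'text' ]
```

With a monospaced font, `width` would be the same for every glyph, which is what makes a single metrics entry per character workable even though the rendered shapes are cursive.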
IMHO we can take 2 different approaches from here:
- If the changes generally look good to you, I can add the font to the main `fonts` folder of KaTeX, calculate the metrics using the Python script, and finally add some tests and screenshotter tests.
- We can also add a small plugin system to KaTeX so everyone can add their own fonts and CSS, and the library would consume the plugins' symbols.
IMHO the first approach is the better solution here because Latin (I mean every Latin-based alphabet), CJK, and Persian-Arabic alphabets constitute the majority of the alphabets in use.
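To make the second approach more concrete, here is a minimal sketch of what such a plugin registry could look like. Every name in it (`FontPlugin`, `PluginRegistry`, the `vazir-font` class) is hypothetical. KaTeX has no such API today, and a real design would also need to carry font metrics and symbol definitions.

```typescript
// Hypothetical plugin shape for illustration; not an existing KaTeX API.
interface FontPlugin {
  name: string;
  fontFamily: string;
  cssClass: string;
  // Inclusive ranges of character codes that this plugin's font covers.
  ranges: Array<[number, number]>;
}

class PluginRegistry {
  private plugins: FontPlugin[] = [];

  register(plugin: FontPlugin): void {
    this.plugins.push(plugin);
  }

  // Returns the first registered plugin covering the character,
  // or undefined when no plugin claims it.
  findForChar(ch: string): FontPlugin | undefined {
    const code = ch.charCodeAt(0);
    for (const plugin of this.plugins) {
      for (const range of plugin.ranges) {
        if (code >= range[0] && code <= range[1]) {
          return plugin;
        }
      }
    }
    return undefined;
  }

  // One CSS rule per plugin that only sets the font-family,
  // mirroring the class described for buildCommon above.
  css(): string {
    return this.plugins
      .map((p) => "." + p.cssClass + " { font-family: \"" + p.fontFamily + "\"; }")
      .join("\n");
  }
}

const registry = new PluginRegistry();
registry.register({
  name: "persian-arabic",
  fontFamily: "Vazir-Code",
  cssClass: "vazir-font",
  ranges: [[0x0600, 0x06ff]],
});
```

The trade-off is visible even in this small sketch: the core library stays small, but every renderer path that picks a font has to consult the registry first.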
You can find my changes in this commit: HosseinAgha@8696318
This is just a proposal (a proof of concept), and I have not created a pull request yet.
I would really appreciate your opinion on this.