Adds ReaderTypography (replaces ReaderHyphenation) #6072
poire-z merged 4 commits into koreader:master
Replace Hyphenation menu with Typography menu.
This works mostly like before:
- typography/hyphenation is chosen according to the book metadata language,
- one can set a default or a fallback language,
- hyphenation is now just a subset of typography, and can be disabled while still setting a language and enabling the other features,
- the typography language enables newly added features to crengine: per-language line breaking rules and per-language Harfbuzz glyph selection.
I'm kind of on the fence about "Typography" on its own and/or when paired with "language". I don't hate it, but something still bothers me a bit with it, not sure what, exactly... What about "Typography rules"? (in both cases, I think when a language name is shown, it should be pretty obvious that it's a language ;p. The hyphenation dictionary entry follows a similar logic ;).).
-- B = language specific additional line breaking tweaks
-- Update them when language tweaks and features are added to crengine/src/textlang.cpp
local LANGUAGES = {
    -- lang-tag aliases features menu title
Random fun fact: 2 letters language codes are legacy ISO 639-1 (-ish, I think the language_DIALECT bastardization is a GNU thing) while the three letters ones are ISO 639-2 (where I learned that fre is technically for Belgian French ;)).
Or it's actually https://tools.ietf.org/html/rfc5646, as ReaderTypography:parseLanguageTag() mentions below ;).
Well, I did not dig into that much (or at all). Not much interested in learning about that - so, if anyone wants to specialize and add stuff so we do the right thing with tags we'll meet in the wild, please jump on it :) (here and in textlang.cpp).
One thing I get is that Harfbuzz has lots of code to handle that/translate, and I hope it will do the right thing with what we provide as is. (Except maybe with the 3-letter language codes like "fre" that we translate here...)
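As a rough illustration of the tag handling discussed in this thread, here is a minimal sketch of normalizing a language tag before looking it up. The `ALIASES` table and `normalizeLangTag()` are hypothetical names, not the PR's actual `LANGUAGES` table or `ReaderTypography:parseLanguageTag()`. Only the primary subtag is touched, and no full RFC 5646 parsing is attempted.

```lua
-- Hypothetical alias table: legacy three-letter ISO 639-2 codes mapped to
-- their two-letter ISO 639-1 equivalents. Not the PR's real LANGUAGES table.
local ALIASES = {
    fre = "fr", fra = "fr",
    ger = "de", deu = "de",
    eng = "en",
}

-- Returns the normalized tag and its primary language subtag,
-- or nil when the input is not a usable tag.
local function normalizeLangTag(tag)
    if type(tag) ~= "string" or tag == "" then
        return nil
    end
    -- Accept the GNU-style "en_GB" form by turning "_" into "-".
    tag = tag:gsub("_", "-")
    -- Split into the leading alphabetic subtag and whatever follows it.
    local primary, rest = tag:match("^(%a+)(.*)$")
    if not primary then
        return nil
    end
    primary = primary:lower()
    primary = ALIASES[primary] or primary
    return primary .. rest, primary
end
```

With this sketch, `normalizeLangTag("fre")` yields `"fr"` and `normalizeLangTag("en_GB")` yields `"en-GB"`; anything after the primary subtag is passed through untouched, leaving its interpretation to whatever consumes the tag.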
One small comment: it should be “Ukrainian” on line 49
I do have at least one Belgian French book that uses “ ”. I don't know if that's specifically Belgian and if it actually makes any difference to the algorithm.
> One small comment: it should be “Ukrainian” on line 49
That's just a filename (well, that's how that file has been named for ages), which might be saved in current users' settings.
I was going to say it won't be shown, but it will, as the selected hyph dict.
So, dunno if I should rename it (maybe later, when things are migrated), or if we should have another table for translating typos in filenames...
@poire-z
Sorry I didn’t know that :) Looks like it comes all the way from the first ones who PR’ed Ukrainian language files
@mergen3107 On the contrary, thanks for pointing it out!
@poire-z If I go to hyphenation it already says Ukrainian, which probably implies we already have that?
We have the other mapping lang_tag => language shown alongside typography, but we can't really use it, as zh-CN would show "Chinese (Simplified)" while it uses English_US.pattern.
I don't mind a small typo table or better: a function to remove the .pattern and fix some known typos, to be used instead of:
hyph_dict_name = hyph_dict_name:gsub(".pattern$", "")
Well, I guess I'll update the HYPH_DICT_NAME_TO_LANG_TAG table to include a translatable string of the language - instead of using the filename stripped from .pattern, which won't be translatable.
If we remove the .pattern for pretty menus, I guess there's no more reason to keep the relation to the hyph dict filename.
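For reference, the "strip `.pattern` and fix known typos" alternative floated earlier in this thread could look roughly like the sketch below. `prettyHyphDictName()` and `KNOWN_TYPOS` are hypothetical names, and the typo entry is a placeholder, since the actual filename is not quoted in this discussion. Note that the dot is escaped as `%.`: in a Lua pattern, a bare `.` matches any character.

```lua
-- Hypothetical table of filename stems with known typos and their fixes.
-- The entry below is a placeholder, not a confirmed filename.
local KNOWN_TYPOS = {
    ["Ukrain"] = "Ukrainian",
}

-- Turns a hyphenation dictionary filename into a display name.
local function prettyHyphDictName(filename)
    -- "%." matches a literal dot; gsub also returns a count, which is
    -- dropped by assigning to a single variable.
    local name = filename:gsub("%.pattern$", "")
    return KNOWN_TYPOS[name] or name
end
```

The result is still not translatable, which is the reason given above for preferring a translatable string in `HYPH_DICT_NAME_TO_LANG_TAG`.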
-- one is set, no fallback will ever be used - if a fallback one
-- is set, no default is wanted; so when we set one below, we
-- remove the other).
text = T( _("Would you like %1 to be used as the default (★) or fallback (�) typography language?\n\nDefault will always take precedence while fallback will only be used if the language of the book can't be automatically determined."), BD.wrap(lang_name)),
Hmm, that one's mildly trickier with my initial comment about typography vs. typography rules...
...) language for typography rules?
Maybe?
I think language for typography rules is clearer than typography language.
Thanks for the previous feedback.
One thing just to be sure: it's really typography rules/languages, and not typographic rules like I'd be inclined to write in French?
I'd say they're both correct. Google Books seems to agree.
Much like règles de typographie and règles typographiques in French :).
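The default/fallback precedence stated in the confirmation text under review can be sketched as follows. This is only an illustration under stated assumptions: `pickTypographyLang()`, `setDefault()`, `setFallback()` and the plain `prefs` table are hypothetical stand-ins, not the PR's actual code or settings keys.

```lua
-- default_lang:  user-chosen default language, or nil
-- book_lang:     language from the book metadata, or nil/"" when unknown
-- fallback_lang: user-chosen fallback language, or nil
local function pickTypographyLang(default_lang, book_lang, fallback_lang)
    -- A default always takes precedence over the book's own language.
    if default_lang then
        return default_lang
    end
    -- Otherwise use the book metadata language when it is known.
    if book_lang and book_lang ~= "" then
        return book_lang
    end
    -- The fallback is only reached when the book language is unknown.
    return fallback_lang
end

-- Default and fallback are mutually exclusive: setting one clears the other.
local function setDefault(prefs, lang)
    prefs.default_lang = lang
    prefs.fallback_lang = nil
end

local function setFallback(prefs, lang)
    prefs.fallback_lang = lang
    prefs.default_lang = nil
end
```

Clearing the other value on each set mirrors the code comment in the diff: with a default set no fallback is ever consulted, and a user who sets a fallback does not want a default.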
})

table.insert(self.menu_table, {
    text = _("Follow document embedded lang tags"),
document's
Or simply Honor embedded lang tags?
Something like use instead of honor is probably a bit easier to understand.
Use embedded language tags ?
Use document's embedded language tags ?
language tags (for grammatical English) or lang tags (for HTMLically English :) ?
I'm not fond of the word "honor", but it's clearer than just "use". "Use" is a bit vague.
It's not like we "use" them, we "use" something else that will do according to them.
We don't use road signs :)
You could just leave out the verb; the checkmark already says use/honor/etc.
[ ] Document's embedded lang tags looks very very strange and incomplete to me :)
hold_callback = function()
    local text_lang_embedded_langs = G_reader_settings:nilOrTrue("text_lang_embedded_langs")
    UIManager:show(MultiConfirmBox:new{
        text = text_lang_embedded_langs and _("Would you like to enable or disable support for document embedded lang tags by default?\n\nThe current default (★) is enabled.")
_("Honor embedded lang tags")
Would you like to honor or ignore embedded lang tags by default? is fine by me, because Ignore | Honor buttons rhyme like Disable | Enable and I like when things sing :)
Fine with you ?
Is this stuff about honoring somewhat Frenglish? Or are we just very courteous? :-P
Dunno, it's understandable in that sense in French, and @NiLuJe uses it quite often in his comments. What would be another word for that, less generic than "use"?
(But in French, it's also used with a meaning that goes a bit further than courteous, rather meaning intercour-teou-se :) which is the only reason I'm not too fond of using it for our text :)
Well, what we want is a positive antonym to "ignore".
And there aren't many of them in the various thesauruses and dictionaries on the web.
"Honorer" is the right one in French - didn't see "Honor" among the English antonyms to "ignore"...
The only one I found that could work in English in our context is: Obey (or Follow, that I had initially).
my vote would be for @Frenzie
Which one ?
> Which one ?
I meant as far as the wordy ones are concerned: Typography rules as per embedded lang tags.
But, to be clear, I much prefer Respect embedded lang tags ;).
I'm fine with all of these:
Typography rules as per embedded lang tags
Rules as per embedded lang tags
As per embedded lang tags
Respect embedded lang tags
So, please, both of you agree on one :)
(Really nothing could work with an initial verb like Adapt/adjust - to not have the feeling that if you check that box, and the document has no embedded lang tags, you will get no rule applied?)
The respect one is fine by me. (I thought I already posted this. :-) )
Agreed (on both counts ;)).
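For readers unfamiliar with the `nilOrTrue` call in the diff under review, the "enabled unless explicitly disabled" behaviour its name suggests can be sketched as below. This is a hypothetical stand-in operating on a plain table, assuming the method behaves as named; it is not the real `G_reader_settings` implementation.

```lua
-- Hypothetical stand-in for a "nil or true" settings read: a key counts as
-- enabled when it was never set (nil) or was explicitly set to true.
local function nilOrTrue(settings, key)
    local value = settings[key]
    return value == nil or value == true
end

-- Example with a plain table in place of the real settings object.
local settings = {}
local embedded_langs_enabled = nilOrTrue(settings, "text_lang_embedded_langs")
```

Reading the setting this way makes embedded lang tags the default for users who never touched the option, while an explicit `false` still turns them off.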
use relevant typographic rules while rendering their content
use language-specific/tailored typography rules while rendering their content
to render their content
Not sure about the typography vs typographic but it doesn't seem like it'd hurt while adding some variation?
You can avoid the double "while" by keeping the first "while", and switching the second to "whereas" (which is used to emphasize contrast in these kind of constructs). EDIT: Actually, no, that doesn't quite work here, it's not a "while blah blah" vs. "whereas bleh bleh" contrasting construct. TL;DR: +1 for
@roshavagarga , just answering your question on Gitter (even if I'm not our font guy/girl/entity :)
See top post here and koreader/crengine#337 . You can get Bulgarian glyphs provided:
@poire-z Thanks for the information. Would it be possible to add a check somewhere? If no font supports Bulgarian typography, but it's been enabled manually or automatically, then KOReader should show a message noting the lack of a compatible font? Option B is to just include one font that supports Bulgarian typography by default.
No, we can't know that. That's handled by low level HarfBuzz, that will give us the glyphs it decides without telling us how/why it did.
That's for our font guy/girl/entity :) (@NiLuJe) although if the Noto fonts do not, it would be sad to have to put another one before these that would be used instead for anything Latin...
Ah, I see, thanks for the explanation once again! Edit:
Might be a job for #6272 then ;).
@roshavagarga: I can confirm that Libertinus font correctly supports Serbian Cyrillic specific glyphs.







Replace Hyphenation menu with Typography menu.
This works mostly like before:
lang= attributes to adapt typography/hyphenation/line breaking rules to which language the book says this section is in:
People who'd like some behaviour fixes/tweaks for their own usual languages can add or suggest stuff (alas, to be hardcoded in the sources) in https://github.com/koreader/crengine/blob/master/crengine/src/textlang.cpp.
This should solve a few issues (even if we got no report for them :):
Word before a quotation mark will not be hyphenated #5645 (comment)
Screenshots, for the usual rewording suggestions :):
(I added some icons from nerdfont - because I needed a break and that's always some fun choosing - for the typography features per language, dunno if it's a good idea or if it's confusing, or if nobody will care :)
Typography language submenu:
First item shows the book metadata lang tag, for easy going back to it:

Hyphenation submenu:

For a typography language, it allows switching between these last 3 hyphenation methods (didn't know if I could get rid of Algorithmic, so, well, it's available as an option with all typography languages).