Skip to content

Fixes widechar table lookup#2943

Closed
SoulSpirit79 wants to merge 4 commits intoMudlet:developmentfrom
SoulSpirit79:fix-widechar-detection
Closed

Fixes widechar table lookup#2943
SoulSpirit79 wants to merge 4 commits intoMudlet:developmentfrom
SoulSpirit79:fix-widechar-detection

Conversation

@SoulSpirit79
Copy link
Copy Markdown

Brief overview of PR changes/additions

Related to issue #2379
Rendering width of characters in TConsole was wrong for some glyphs because the lookup algorithm of unicode ranges was faulty (it converted unsigned ints to signed ints). This patch fixes the algorithm.

Motivation for adding to Mudlet

Some wide characters (like emojis) were interpreted as single-width instead of double-width.

Output before the patch:
a

Output after the patch:
b

@SoulSpirit79 SoulSpirit79 requested a review from a team as a code owner August 6, 2019 10:11
@add-deployment-links
Copy link
Copy Markdown

add-deployment-links bot commented Aug 6, 2019

Hey there! Thanks for helping Mudlet improve. 🌟

Test versions

You can directly test the changes here:

No need to install anything - just unzip and run.
Let us know if it works well, and if it doesn't, please give details.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 6, 2019

Nice find! Thank you, we'll look at this. The original code is from https://github.com/ridiculousfish/widecharwidth so we should tell them about this too.

@wiploo
Copy link
Copy Markdown
Contributor

wiploo commented Aug 6, 2019

Tested, for me is good on windows 7 (black/white emoji). Version 4.0.3 was buggy

Thanks @SoulSpirit79

@SoulSpirit79 SoulSpirit79 requested a review from a team August 6, 2019 14:15
@SoulSpirit79
Copy link
Copy Markdown
Author

In an effort of getting better results for wide characters, I changed the way the engine creates the QRect for each glyph.
Before this commit, the width of a glyph was calculated as a multiplier of the width of the letter 'W'; such multipllier could be 1 or 2 based on a lookup table.
Now the width of each glyph is calculated making use of QFontMetrics::horizontalAdvance

For example, if the font size is set to 8, a standard lating character is 8 pixels wide.
A character like 🤮 (U+1F47F) is 18 pixels wide. That's why the right margin of the glyph was cut before this patch: it was calculated as 2 chars-wide (16px), hiding the last 2 pixels.

The drawback is that where non-standard-width characters are displayed, the text alignment isn't perfect anymore.
Here's a screenshot took in Windows 10. Note that the text alignment in the console is pixel-perfect to the text alignment in the command box:

image

If this solution is acceptable, widechar_wcwidth.h becomes obsolete and could be deleted along with all of its usages.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 6, 2019

Aha - nice find. Can you check if this new non-even characters affects line wrapping code?


auto textRect = QRect(mFontWidth * cursor.x(), mFontHeight * cursor.y(), mFontWidth * charWidth, mFontHeight);
//auto metrics = QFontMetrics(painter.font());
auto advance = mFontMetrics.horizontalAdvance(grapheme);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only available in Qt 5.11 while we support 5.7 at minimum - right now. However I'd like to up the mininum to 5.11 because Debian stable now supports it, so this shouldn't be a problem.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't know sorry, I'll check if I can achieve the same result using QFontMetrics::boundingRect

Copy link
Copy Markdown
Member

@vadi2 vadi2 Aug 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't spend time on it yet, #2944

@vadi2 vadi2 mentioned this pull request Aug 6, 2019
@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 6, 2019

Hm, try connecting to mud.pkuxkx.net:8080 - I think the spacing in the home screen is broken. It wasn't perfect before but it seems worse now, is that the same for you?

@SoulSpirit79
Copy link
Copy Markdown
Author

Aha - nice find. Can you check if this new non-even characters affects line wrapping code?

I fear wrapping will be affected.
Wrapping works with characters and should be changed to work with pixels.
In extreme use cases like this, the effect is noticeable:
image

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 6, 2019

@Eraene what do you think?

@SoulSpirit79
Copy link
Copy Markdown
Author

Hm, try connecting to mud.pkuxkx.net:8080 - I think the spacing in the home screen is broken. It wasn't perfect before but it seems worse now, is that the same for you?

I see bad characters both in 4.0.3 and this branch, but they render differently.

4.0.3:
image

Current branch:
image

I think there's something broken on the server anyway, I tried a couple of server encodings but the result doesn't change.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 6, 2019

You need to set the encoding to GBK.

@SoulSpirit79
Copy link
Copy Markdown
Author

You need to set the encoding to GBK.

Ok, got it.

4.0.3:
image

Current branch:
image

The explanation is in how whitespaces are handled: I think 4.0.3 knows that server has gbk encoding and renders whitespaces (0x20) as double-width. (but I can't find any reference to this in the code).

Current branch treats 0x20 always the same, and it's thinner than any other chinese symbol.

I think this is the correct way of handling it. Asian servers should not use 0x20 for formatting text, but a wider whitespace character:
https://en.wikipedia.org/wiki/Whitespace_character

@SoulSpirit79
Copy link
Copy Markdown
Author

I think this is the correct way of handling it. Asian servers should not use 0x20 for formatting text, but a wider whitespace character.

Found a better reference that "demonstrate" that the full-width whitespace for asian languages is 0x3000 and not 0x20:
https://en.wikipedia.org/wiki/CJK_Symbols_and_Punctuation

The first character of their unicode block is that full-width whitespace, and I doubt it is a coincidence.

src/TConsole.cpp Outdated
splitter->setParent(layer);

mUpperPane = new TTextEdit(this, splitter, &buffer, mpHost, false);
mUpperPane = new TTextEdit(this, splitter, &buffer, mpHost, false, QFontMetrics(mpHost->mDisplayFont));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Using mpHost->mDisplayFont is NOT appropriate for all instances of TTextEdit you will need to inspect the parent TConsole's (i.e. mpConsole) mType and work out which font to pass.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to understand and fix it

void TTextEdit::updateScreenView()
{
if (isHidden()) {
mFontMetrics = QFontMetrics(mpHost->mDisplayFont);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you using mpHost->mDisplayFont and not mDisplayFont here?

And when you do use the latter the following lines can access the mFontMetrics member instead of each of them each retrieving a fresh instance of the relevant QFontMetrics as well...! 😉

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my bad, cut&paste misprint. I'll fix it + I was trying to minimize the modified code. I'm fixing it.

// and mIsMiniConsole is true for user created Mini Consoles and User Windows
if (mpConsole->getType() == TConsole::MainConsole) {
mFontMetrics = QFontMetrics(mpHost->mDisplayFont);
mFontWidth = QFontMetrics(mpHost->mDisplayFont).width(QChar('W'));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone else (possibly me) also has a PR pending that will change this to:

    mFontWidth = QFontMetrics(mpHost->mDisplayFont).averageCharWidth();

so there may be git conflicts to resolve in the near future...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And of course this and the following lines can all reuse the newly found mFontMetrics class instance and refer to that instead of each of them fetching a new copy...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is to completely remove mFontWidth: it's an approssimation of the char width but we could use the exact width instead. I just can't completely understand some usages of that variable and thus kept the original code.
The solution I'm proposing in this branch isn't perfect yet, but I'll work on it if it could be accepted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, mFontWidth does not make too much sense anymore, however I'm not sure about converting everything to work with pixels either just yet - because that breaks a lot of assumptions we've made so far... 😅 this will require some thinking!

{
uint unicode = getGraphemeBaseCharacter(grapheme);
int charWidth;
if (unicode == '\t') {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How well does the new code work with tab characters given that you are doing away with the current faking of (Left) Tabs...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an open point I was working on.
The code you quoted was not symmetric to the code in convertMouseXToBufferX, which was:
if (unicode == '\t') { characterWidth = 8; } else { characterWidth = getGraphemeWidth(unicode); }
and it confused me. I'll fix it as soon as possible.


/* An inclusive range of characters. */
struct widechar_range {
int32_t lo;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👎 DO NOT MAKE ANY CHANGES TO THIS FILE - IT IS IMPORTED WHOLESALE FROM UPSTREAM AT: https://github.com/ridiculousfish/widecharwidth AND MAY WELL GET OVERWRITTEN THE NEXT TIME THAT THE UNICODE VERSION IS INCREMENTED.

Copy link
Copy Markdown
Member

@SlySven SlySven Aug 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry I was reviewing without looking at any past feedback from anyone else - yes if there are argument/variable type errors then they need to be passed upstream - though remember that this file is designed to work on multiple OSes and so may use some types to ensure compatibility which might not seem to be the optimum ones to us.

The reason that a few pixels may be being chopped off of the right for some Emojis and other wide glyphs is, I suspect, because it is not the same font being used for all the characters, if the specified font does not contain a wanted glyph then the Qt and Font subsystems (Font Config on Linux for instance) will (normally) work out which other fonts do have the wanted glyph for that grapheme and pull in the best (for some value of best) one they can find - however there is no guarantee that it will have the exact same font metrics +. The best way around that would be to measure the actual widths used when the glyphs are painted and to remember them for later when working out where the mouse is when it is clicked on or moved over them - mayhaps I am going to have to dust off that prototype code I cobbled together to do it like that...

+ - I would point out one exception to that: in the code for the 2D Map room symbols - where I have made it is possible to force only the specified font to be used!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was removed entirely and not used anymore @SlySven.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👎 DO NOT MAKE ANY CHANGES TO THIS FILE - IT IS IMPORTED WHOLESALE FROM UPSTREAM AT: https://github.com/ridiculousfish/widecharwidth AND MAY WELL GET OVERWRITTEN THE NEXT TIME THAT THE UNICODE VERSION IS INCREMENTED.

I'm not sure whether the problem lies in the library on in some compiling flag for Mudlet.
I'm not so expert with C, but reading here and there seems like wchar_t is the correct type to handle 2-bytes or 4-bytes character. But in Mudlet dev environment it is interpreted as a 2-bytes type and cannot handle many unicode ranges.
I'd open an issue to https://github.com/ridiculousfish/widecharwidth if I'd know for sure.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was removed entirely and not used anymore @SlySven.

No it isn't - in fact I recently updated it to support Unicode 12.1 ! 😕

It's purpose is to tell the caller whether the grapheme is supposed to be drawn in a normal or wide (double the width of a normal) width - taking into account whether the character is defined as a normal or wide one or is an ambiguous width or whether it changed from being narrow in Unicode 8 to be wide in Unicode 9. If it is an ambiguous width then our code considers the choice the user made from them - or if they set it to auto - the server encoding the user has selected...

The bottom line is that it is trying to approximate the duo-space font we need for MUD type text displays...

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Asian servers should not use 0x20 for formatting text, but a wider whitespace character

That's not an option, we can't break rendering and tell them to fix it even if we might be right 😕

I don't think there is special treatment of pkuxkx but I'll verify in the code.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Yeah, there is no special treatment. In current code for example has a character width of 2, and so does , but this PR has a horizontal advance of 13 whereas is 21 (using Ubuntu Mono pt16, a font we include).

So horizontal advance is perhaps not the correct number to use in 100% of the cases? It works for emoji's since we want them to be bigger than normal text, but not for these characters which we'd like to be actually monospace.

@SoulSpirit79
Copy link
Copy Markdown
Author

Asian servers should not use 0x20 for formatting text, but a wider whitespace character

That's not an option, we can't break rendering and tell them to fix it even if we might be right 😕

To solve this, only when GBK encoding is used, I'd suggest to convert every 0x20 character to 0x3000 when filling the lineBuffer. Could it work?

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Let's think on a better solution - exceptions are bound to fail. See what I mentioned above in regards to character width - it is not just related to the space, it is related to the different way we are measuring now which works for emoji's, but breaks asian text.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Thanks so much for working on this! It looks like it'll be harder than expected, but it was to happen either way when we'd get around to this.

@SoulSpirit79
Copy link
Copy Markdown
Author

Yeah, there is no special treatment. In current code for example has a character width of 2, and so does , but this PR has a horizontal advance of 13 whereas is 21 (using Ubuntu Mono pt16, a font we include).

So horizontal advance is perhaps not the correct number to use in 100% of the cases? It works for emoji's since we want them to be bigger than normal text, but not for these characters which we'd like to be actually monospace.

Like @SlySven said, that problem rises when the selected font (Ubuntu Mono in your case) hasn't the glyph you requested. In this case the OS renders the character using another font, which could have different metrics.
Ubuntu Mono doesn't define asian glyphs: https://fonts.google.com/specimen/Ubuntu+Mono.

According to the documentation (https://doc.qt.io/qt-5/qfontmetrics.html#horizontalAdvance-1) horizontalAdvance seems to be the best property to calculate where to draw the next glyph, and it replaced width (now deprecated).
The docs rise a warning about arabic characters and non-spacing marks though, as these glyph are drawn through the use of multiple characters: https://www.oreilly.com/library/view/unicode-demystified/0201700522/0201700522_ch04lev1sec1.html

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Sorry, can you explain what you mean?

Ubuntu Mono doesn't have emoji's either, they are 100% certainly drawn using the Noto font fallback I've added - and you're implying that horizontalAdvance() isn't correct for fallback characters: Asian and emojis?

@SoulSpirit79
Copy link
Copy Markdown
Author

Sorry, can you explain what you mean?

Ubuntu Mono doesn't have emoji's either, they are 100% certainly drawn using the Noto font fallback I've added - and you're implying that horizontalAdvance() isn't correct for fallback characters: Asian and emojis?

I think horizontalAdvance works with the real glyph being rendered, and that's why it works.
You can see it works by the screenshots I've posted: glyphs never overlap.
For example, with Bitstream Vera Sans Mono I tested, horizontalAdvance() returns 8 for any latin glyph, and 18 for the glyph 🤮, which is not included in the font. This implies that the function is measuring the glyph width on the fallback font.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

Oh, I see what you mean now. You basically turned off monospace rendering of characters - which is why emoji's work - because they aren't drawn monospace anymore and bigger than normal - and other things don't, like special characters - because they aren't drawn monospace anymore, either - they are drawn smaller than intended.

That won't work, we can't fix one thing and break another.

How do other applications draw emoji's when using monospace fonts? We still have to stick with monospace, not monospace as you've seen just breaks a lot of things.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 7, 2019

What if we treat emoji's as 2 width characters and draw them in the middle of the space we have for 2 characters? Would that look acceptable, without breaking all text display in Mudlet?

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 22, 2019

image

Seems good now, this other approach works well too.

@vadi2 vadi2 closed this Aug 22, 2019
@SoulSpirit79
Copy link
Copy Markdown
Author

image

Seems good now, this other approach works well too.

Sorry if I'm a bit late, but I was out for vacation.
I'm running the latest development commit and I don't see improvements for emoji support:
image

I checked the changes in the code made in the last weeks and I've found nothing that could have fixed the problem. I cannot understand how you can obtain the result highlighted by your screenshot.

widechar_width.h is still incomplete, as it is based on unicode 5 table.
Plus, it has that problem where unicode character beeing looked up is converted from uint to wchar_t which, on my system, is signed.

For example, taking the character 😂 0x1F602 the call to function widechar_wcwidth:

switch (widechar_wcwidth(unicode)) {

... converts the character to 0xFFFFF602 as it reaches that library:
int widechar_wcwidth(wchar_t c) {

A couple of screenshots from the debugger:
image

image

@vadi2 vadi2 reopened this Aug 26, 2019
@SoulSpirit79
Copy link
Copy Markdown
Author

Thanks for reopening the issue.
Following what you said in the thread, I'd suggest to:

As I said, I'm not confortable on reporting the issue on the ridiculousfish project because I don't fully understand what's behind the wchar_t type, but I'll do it if anyone agrees it actually is the issue.

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Aug 28, 2019

I am not sure that just because the file being referred is https://unicode.org/Public/emoji/5.0/emoji-data.txt that it IS specific to Unicode version 5.0 - if you look in that file it was last updated a couple of years ago and indicates some glyphs that were introduced in version 10.0. It also defines ranges of code-points and it is possible that additions happened within those ranges which need not alter the content of the file (although I am not entirely sure about that).

Certainly when I updated the code in 67449e5 it seems that later Unicode data was being used from the other two files used to generate widechar_width.h ...

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Aug 28, 2019

Of course I could be wrong 🤦‍♂ - there are differences between https://unicode.org/Public/emoji/5.0/emoji-data.txt and https://unicode.org/Public/emoji/12.0/emoji-data.txt - so apologies to you @SoulSpirit79 ...

Lets see if upstream pick up on this: ridiculousfish/widecharwidth#5

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Sep 14, 2019

After ridiculousfish/widecharwidth#6 (comment), I think we should draw emoji's as 2x width. It'll help not screw up the formatting of everything, so monospaced fonts still stay monospace, and our emoji's won't be getting cut off.

@vadi2 vadi2 self-assigned this Oct 12, 2019
@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Oct 12, 2019

@SoulSpirit79 sorry, this kind of fell off the radar - but I'll look into picking this work up and finishing it, if you or nobody else can't. I think we should draw emoji's as 2x width and still keep the display consistent: making it inconsistent is just doing to break a lot of things and people would also be unhappy about the formatting.

@SoulSpirit79
Copy link
Copy Markdown
Author

@SoulSpirit79 sorry, this kind of fell off the radar - but I'll look into picking this work up and finishing it, if you or nobody else can't. I think we should draw emoji's as 2x width and still keep the display consistent: making it inconsistent is just doing to break a lot of things and people would also be unhappy about the formatting.

I agree with your point of view, but we have to choose a strategy to make emojis work.
Fixing data types in widechar_width.h (from signed to unsigned ints as I did in my first commit) would help 'cause it'd make unicode 5 emojis work as expected. But widechar_width.h comes from an external library, so we should agree on modifying it or contact ridiculousfish to make him fix it.

For a definitive solution instead, there are various points to consider:

  • you can't say a character is or isn't an emoji without a lookup table. The only source of truth is https://unicode.org/Public/emoji/12.0/emoji-data.txt . If ridiculousfish can't handle that (because its project isn't made to handle only emojis but has many other factors to consider), we should find an alternative.
  • We actually don't support unicode variations. A typical example is the ok symbol 👌 which could be "modified" in many ways like 👌🏿 or 👌🏻. A more complex example is 🕵 which uses the ZERO WIDTH JOINER for its variations 🕵️‍♂️ (male) and 🕵️‍♀️ (female). What should we do with that?

If this was my project, I'd move this way, in order:

  • implement the quick hack about data types on widechar_width.h, and release it
  • implement a custom handler to deal with common characters (ASCII and perhaps something more), to be called first in TTextEdit::getGraphemeWidth function to improve speed
  • implement a custom handler for emojis listed on unicode.org (which could be improved over time, i.e. for emoji variations) to be called just after
  • fallback on widechar_width.h for every other character.

What do you think?

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Oct 30, 2019

Um, sorry for not responding to this before but I've finally run into part of this whilst doing some work on a Windows 64 bit build - it turns out that the primary problem with widechar_wcwidth(...) is that it was designed with an argument for the character to look up being a wchar_t - and the problem with that is that on Windows it is only 16 bits in size! This was completely screwing up the results for non-BMP characters as those need more than 16 bits to contain their value. For me this meant I was getting perfectly valid Emoji codepoints being reported as private use ones - which were only being given a width of one normal width character when they should have had two...!

Anyhow I have referred a fix (well it worked for me) for this upstream as ridiculousfish/widecharwidth#7 - I intend to incorporate those changes into our codebase and see what effect that has on the wrong widths particularly on Windows...

@SoulSpirit79 Thanks for reporting the issue, even if the solution we adopt is not this one.

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Oct 30, 2019

As for the variations - we consider the whole grapheme at a time so the width is pretty much dependent on the classification of the first (the base) codepoint {though the Fitzpatrick modifiers might mess with that if the font renderer does not understand them and adopts the fall-back of a colour or grey scale square}; another consideration ought to be the text/emoji variation selectors - but we are only just getting the full-color emoji stuff working - and the Qt libraries totally fail to handle those at all as far as I can tell...

@SoulSpirit79
Copy link
Copy Markdown
Author

Anyhow I have referred a fix (well it worked for me) for this upstream as ridiculousfish/widecharwidth#7 - I intend to incorporate those changes into our codebase and see what effect that has on the wrong widths particularly on Windows...

Thanks @SlySven , much appreciated :)

SlySven added a commit to SlySven/Mudlet that referenced this pull request Oct 30, 2019
This is a bugfix that I got into upstream and fixes some issues on Windows
where some Emojis (off of the BMP) were not being correctly typed - they
were being identified as private use characters on Windows because the
argument to a function was not wide enough (did not have enough bits) to
convay the values required.

I believe this may be a simpler fix for the issue that PR:
Mudlet#2943 was trying to fix.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
SlySven added a commit that referenced this pull request Oct 31, 2019
This is a bugfix that I got into upstream and fixes some issues on Windows
where some Emojis (off of the BMP) were not being correctly typed - they
were being identified as private use characters on Windows because the
argument to a function was not wide enough (did not have enough bits) to
convay the values required.

I believe this may be a simpler fix for the issue that PR:
#2943 was trying to fix.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
@SlySven
Copy link
Copy Markdown
Member

SlySven commented Oct 31, 2019

@SoulSpirit79 How does the test versions in #3207 look compared to the situation you were trying to fix in this PR - i.e. how do they compare to the before this PR case? I suspect that the widths will be better but you might still see some shaving off the right hand side of some of those emojis...

@SoulSpirit79
Copy link
Copy Markdown
Author

@SoulSpirit79 How does the test versions in #3207 look compared to the situation you were trying to fix in this PR - i.e. how do they compare to the before this PR case? I suspect that the widths will be better but you might still see some shaving off the right hand side of some of those emojis...

The result is really good, even if all emojis are shaved off, here's a screenshot of the #3207 build on Win10 64:
image

Thumbs up :)

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Dec 14, 2019

What should we do next with this? See how emoji's render as in 2 character width, to observe if the shaving is gone?

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Dec 14, 2019

Saying the emoji's render as 2 characters wide is sadly a bit misleading and mono-space-ist! Unfortunately when more than one font is used there is no guarantee that even QFont::averageCharWidth() (or whatever the method was) will cover every case. That will cause the shaving as I called it - as the space we would allow for the "character width" will not prove to be enough for those emojis and the following grapheme will not be painted far enough to the right to clear the right side of the emoji.

Ultimately we may have to check the width of each character when painted and then divide that by how many "normal" character spaces we think it should be and then reset our idea of what the "character width" to use is for every grapheme still in the TBuffer should that exceed the previously used value - and repaint everything with the new value.

Oh, and we also must ensure that kerning is switched off - it is not appropriate for a mone/duospaced text layout. OTOH once we do the above we can easily (IMHO) allow the use of proportional fonts - we will just render them in a monospaced manner...!

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Dec 14, 2019

I think we should explore sticking with mono-space-ist as we have been, because I suspect it's pretty vital to how MUDs work, especially for the text formatting. If we throw that out I think it'll make a mess. That's why I want to see how emoji's look look like if we try them as 2 space-characters, do they fit in then or no...

@SoulSpirit79
Copy link
Copy Markdown
Author

What should we do next with this? See how emoji's render as in 2 character width, to observe if the shaving is gone?

There's a possible workaround to mitigate the shaving: rendering every character with transparent background.

Let me explain. This is how the emoji 😡 is rendered now by Mudlet on Win10:
image
As you see, if the emoji is the last charater of the line everything goes well.
If the emoji (which is a bit larger than 2 characters) is followed by another character (even if it is a space) it suffers from shaving.

I don't know if transparent background would affect performance, but it could be a solution. The worse it could happen is that 2 characters partially overlap, but I think that's better than the shaving.

If this solution affects performance, we could limit the transparency to white spaces. Emojis aren't part of any word thus the common case is to separate it with a space from the rest of the text, like we do in our mud:
image

What do you think?

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Dec 18, 2019

Transparent backgrounds are a non-starter I think - remember that MUD text has foreground and background colours that are BOTH adjustable (except that foreground colours do not apply to colour emojis - on OS Versions that support them - i.e. not Windows 7 or FreeBSD).

And there is no certainty that emojis are transmitted with spaces padding them, either from other characters (or more likely in my opinion, from each other...:roll_eyes::disappointed:!)

dicene pushed a commit to dicene/Mudlet that referenced this pull request Feb 19, 2020
This is a bugfix that I got into upstream and fixes some issues on Windows
where some Emojis (off of the BMP) were not being correctly typed - they
were being identified as private use characters on Windows because the
argument to a function was not wide enough (did not have enough bits) to
convay the values required.

I believe this may be a simpler fix for the issue that PR:
Mudlet#2943 was trying to fix.

Signed-off-by: Stephen Lyons <slysven@virginmedia.com>
@tsathoqqua
Copy link
Copy Markdown

Just encountered this issue in April 2020, hoping for a resolution and some .cpp conflict resolution :)

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Apr 13, 2020

From what I remember this PR fixes one thing and breaks the other. I'd be happy to see it fixed!

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Aug 19, 2024

Closing for now as we haven't had any movement on this, but please do re-open if you'd like to work on it again.

@vadi2 vadi2 closed this Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants