Proposal: Make "screenchar()" handle Unicode composing characters #4059

Closed
ichizok wants to merge 7 commits into vim:master from ichizok:feature/mbyte-screenchar
Conversation

ichizok commented Mar 1, 2019

I propose adding the following features for handling Unicode composing characters.

char2nr(expr [, utf8 [, list]])
str2list(expr [, utf8])

  • e.g. str2list("\uXXXX\uYYYY\uZZZZ", 1) == [0xXXXX, 0xYYYY, 0xZZZZ] (\uYYYY is a composing character)

nr2char(expr [, utf8])
list2str(list [, utf8])

  • e.g. list2str([0xXXXX, 0xYYYY, 0xZZZZ], 1) == "\uXXXX\uYYYY\uZZZZ" (\uYYYY is a composing character)

screenchar(row, col [, list])

  • Add a new optional argument "list": when it is set to 1, the result is a List of Numbers
  • e.g. screenchar(1, 1, 1) == [0xXXXX, 0xYYYY] (\uYYYY is a composing character)
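
As a rough illustration of the proposed str2list() semantics (not Vim's implementation), the C sketch below decodes a UTF-8 string into one number per code point, so a base character and a following composing character come out as separate items. The names utf8_decode() and str_to_codepoints() are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s.
 * Stores the code point in *cp and returns the number of bytes consumed.
 * A bad lead byte or a truncated sequence is passed through as one byte. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    size_t len, i;
    uint32_t c;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else { *cp = s[0]; return 1; }

    for (i = 1; i < len; i++) {
        /* Continuation bytes must look like 10xxxxxx; NUL fails this test,
         * so the loop never reads past the string terminator. */
        if ((s[i] & 0xC0) != 0x80) { *cp = s[0]; return 1; }
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    *cp = c;
    return len;
}

/* Hypothetical analogue of str2list(str, 1): one number per code point,
 * composing characters included.  Returns the number of items written. */
static size_t str_to_codepoints(const char *str, uint32_t *out, size_t max)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t n = 0;

    assert(out != NULL);
    while (*p != '\0' && n < max) {
        p += utf8_decode(p, &out[n]);
        n++;
    }
    return n;
}
```

For instance, the bytes "e\xCC\x81" (the letter e followed by U+0301 COMBINING ACUTE ACCENT) yield the two values 0x65 and 0x301.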

src/evalfunc.c Outdated
        return;
    for (; *p != NUL; p += utf_ptr2len(p))
        list_append_number(rettv->vval.v_list,
                           (varnumber_T)utf_ptr2char(p));

With this, the returned value will include all characters of the given string, not only the first one plus its composing characters. If this is really what we want, I think it would be nice to warn about this more clearly in the documentation.
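
To make the distinction in this comment concrete, here is a small C sketch that works on an already decoded list of code points and counts only the first character plus the composing characters that follow it. It is an illustration, not the patch's code: leading_char_len() is a hypothetical name, and is_composing_simplified() is a deliberately simplified stand-in for utf_iscomposing() that recognizes only the Combining Diacritical Marks block (U+0300 to U+036F).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified assumption: treat only U+0300..U+036F as composing.
 * Real composing-character tables are much larger. */
static int is_composing_simplified(uint32_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Hypothetical helper: number of items that belong to the first
 * character, i.e. the base code point plus any composing characters
 * directly after it.  Returns 0 for an empty list. */
static size_t leading_char_len(const uint32_t *cps, size_t n)
{
    size_t i;

    assert(cps != NULL || n == 0);
    if (n == 0)
        return 0;
    for (i = 1; i < n && is_composing_simplified(cps[i]); i++)
        ;
    return i;
}
```

With the list [65, 66] ("AB") this returns 1, whereas a loop over the whole string, as in the reviewed snippet, would produce both values.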

src/evalfunc.c Outdated
        c = tv_get_number(&li->li_tv);
        if (i > 0 && !utf_iscomposing(c))
            break;
        len += utf_char2bytes(c, &buf[len]);

Here, nr2char() stops on encountering a character that is not composing, but char2nr(str, 1, 1) does not stop in such a situation. In other words, I am not sure it is good to have this asymmetry: char2nr("AB", 1, 1) gives [65, 66], but nr2char([65, 66], 1) gives only "A".
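
The encoding side of this asymmetry can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the patch itself: utf8_encode() and codepoints_to_char() are hypothetical names, and is_composing_simplified() only recognizes U+0300 to U+036F, unlike the real utf_iscomposing().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified assumption: only U+0300..U+036F count as composing. */
static int is_composing_simplified(uint32_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Hypothetical helper: write code point c as UTF-8 into buf (at least
 * 4 bytes of room) and return the number of bytes written. */
static size_t utf8_encode(uint32_t c, unsigned char *buf)
{
    if (c < 0x80) {
        buf[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        buf[0] = (unsigned char)(0xC0 | (c >> 6));
        buf[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        buf[0] = (unsigned char)(0xE0 | (c >> 12));
        buf[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    buf[0] = (unsigned char)(0xF0 | ((c >> 18) & 0x07));
    buf[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    buf[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    buf[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}

/* Mirrors the reviewed loop: encode the first code point and any
 * composing characters after it, then stop at the first code point
 * that is not composing.  Returns the byte length (excluding NUL). */
static size_t codepoints_to_char(const uint32_t *cps, size_t n,
                                 char *buf, size_t bufsize)
{
    size_t i, len = 0;

    assert(buf != NULL && bufsize > 0);
    for (i = 0; i < n; i++) {
        if (i > 0 && !is_composing_simplified(cps[i]))
            break;              /* the stop that causes the asymmetry */
        if (len + 4 >= bufsize)
            break;              /* keep room for 4 bytes plus the NUL */
        len += utf8_encode(cps[i], (unsigned char *)&buf[len]);
    }
    buf[len] = '\0';
    return len;
}
```

Given [65, 66] this writes only "A", while [0x65, 0x301] is written in full as three bytes, which matches the behavior the comment describes.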

When {expr} is a List of number values, and with 'encoding' is
"utf-8" or {utf8} set to 1, return Unicode character which is
the result of combining the values in {expr} as Unicode
codepoints up to 'maxcombline'.

s/maxcombline/maxcombine/

brammool commented Mar 2, 2019 via email

ichizok force-pushed the feature/mbyte-screenchar branch from 44930dd to a17f2c3 on March 2, 2019 23:31
ichizok commented Mar 2, 2019

Updated the patch:

  • Revert the changes to "char2nr()" and "nr2char()"
  • Add "list2str()" and "str2list()"

ichizok commented Mar 2, 2019

@brammool

> It would be good to also have the option to get the string, which
> represents the character, and any composing characters. This is
> especially useful when getting the character for several positions
> and concatenating them.

What interface do you expect? Could you give an example?

codecov-io commented Mar 3, 2019

Codecov Report

Merging #4059 into master will decrease coverage by <.01%.
The diff coverage is 88.33%.

@@            Coverage Diff             @@
##           master    #4059      +/-   ##
==========================================
- Coverage   79.21%    79.2%   -0.01%     
==========================================
  Files         105      105              
  Lines      141147   141199      +52     
==========================================
+ Hits       111813   111842      +29     
- Misses      29334    29357      +23

Impacted Files       Coverage Δ
src/evalfunc.c       88.63% <88.33%> (-0.04%) ⬇️
src/if_xcmdsrv.c     83.48% <0%> (-0.72%) ⬇️
src/window.c         83.44% <0%> (-0.2%) ⬇️
src/gui.c            58% <0%> (-0.16%) ⬇️
src/ex_cmds2.c       84.89% <0%> (-0.1%) ⬇️
src/terminal.c       79.04% <0%> (-0.09%) ⬇️
src/screen.c         80.3% <0%> (-0.03%) ⬇️
src/gui_gtk_x11.c    48.32% <0%> (+0.04%) ⬆️
src/channel.c        83.22% <0%> (+0.07%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e21c158...e4258eb.

mattn commented Mar 4, 2019

How about "chars"? chars2str, str2chars

mattn commented Mar 4, 2019

Or codes2str, str2codes

ichizok commented Mar 5, 2019

I think:
"chars" reads as a list of characters, e.g. ['a', 'b', 'c'] (not numbers).
"codes" is a more accurate term, but it may not be clear to users.

brammool closed this in 2912abb on Mar 29, 2019
ichizok deleted the feature/mbyte-screenchar branch on March 29, 2019 14:06
lewis6991 pushed a commit to lewis6991/neovim that referenced this pull request Dec 12, 2021
Problem:    Cannot get composing characters from the screen.
Solution:   Add screenchars() and screenstring(). (partly by Ozaki Kiichi,
            closes vim/vim#4059)
vim/vim@2912abb