Improve keybinding parsing for Unicode support by JustForFun88 · Pull Request #14020 · nushell/nushell

JustForFun88 · 2024-10-07T05:38:20Z

Description

This pull request enhances the add_parsed_keybinding function to provide greater flexibility in specifying keycodes for keybindings in Nushell. Previously, the function only supported specifying keycodes directly through character notation (e.g., char_e for the character e). This limited users to a small set of keybindings, especially in scenarios where specific non-English characters were needed.

With this new version, users can also specify characters using their Unicode codes, such as char_u003B for the semicolon (;), providing a more flexible approach to customization, for example like this:

{
    name: move_to_line_end_or_take_history_hint
    modifier: shift
    keycode: char_u003B # char_;
    mode: vi_normal
    event: {
        until: [
            { send: historyhintcomplete }
            { edit: movetolineend }
        ]
    }
}

Motivation:

This enhancement allows better customization of vi mode in Nushell. For example, in the Russian layout, the $ symbol (used for 'Move to end of line') coincides with the semicolon (;). Previously, it was not possible to assign char_; as the keycode, as the parser did not recognize it as a valid Unicode character.

This enhancement extends support for specifying keycodes using Unicode character codes (e.g., char_u003B for ;), allowing users to more easily configure Nushell according to their preferred keyboard layouts and enhancing the overall flexibility of keybinding customization.

The new implementation preserves the performance of the old one and avoids redundant checks (c.starts_with("char_") and c.chars().skip(5)).

User-Facing Changes

Added support for specifying keycodes using Unicode codes, e.g., char_u002C (comma - ,):

{
    name: <command_name>, # name of the command
    modifier: none,       # key modifier
    keycode: char_u002C,  # Unicode code for the comma (',')
    mode: vi_normal,      # mode in which this binding should work
    event: {
        send: <action>    # action to be performed
    }
}

Enhance keybinding parsing to support Unicode character codes: - Added support for specifying keycodes using Unicode codes (e.g., char_u003B for `;`). - Improved flexibility for customizing vi mode keybindings, especially for non-English keyboard layouts.

fdncred · 2024-10-07T11:54:12Z

Thanks for trying to help out. I haven't studied your code yet, but I like the idea of expanding this to support uXXXX. I always thought char_X was a bit limiting. Just one question below.

crates/nu-cli/src/reedline_config.rs

sholderbach · 2024-10-07T12:33:36Z

I don't quite follow the example in your motivation.

You can define a binding to ; with the existing config logic as well, the limitation is only imposed by the unquoted string, a ; semicolon will act as a statement separator so it only sees the char_.

(we may want to check our general parser/grammar if the behavior of ; in the record context is proper)

       {
            name: history_menu
            modifier: none
            keycode: "char_;"
            mode: [emacs, vi_insert, vi_normal]
            event: { send: menu name: history_menu }
        }

One general limitation with some of those special characters is that the ANSI default keyboard protocol will not always carry the modifier info with some of those. What gets transmitted from the terminal is generally really the text. While on a QWERTY keyboard it would correctly show without modifier as found on the layout, with the Russian keyboard it probably won't show the SHIFT-4 as it doesn't show a SHIFT for German QWERTZ (here ; is in the shifted position of ,)

JustForFun88 · 2024-10-07T13:09:31Z

I don't quite follow the example in your motivation.

You can define a binding to ; with the existing config logic as well, the limitation is only imposed by the unquoted string, a ; semicolon will act as a statement separator so it only sees the char_.
One general limitation with some of those special characters is that the ANSI default keyboard protocol will not always carry the modifier info with some of those. What gets transmitted from the terminal is generally really the text. While on a QWERTY keyboard it would correctly show without modifier as found on the layout, with the Russian keyboard it probably won't show the SHIFT-4 as it doesn't show a SHIFT for German QWERTZ (here ; is in the shifted position of ,)

Oh, I didn't know you could assign keys like "char_;". That's definitely cool 😊. On the other hand, I think adding uXXXX won't be much of an inconvenience, and it allows, for example, separate categorization of uppercase and lowercase Latin letters. Although the latter is probably not very useful since switching is done via SHIFT.

And you are right that modifier: none + keycode: "char_;" or modifier: none + keycode: char_u003B doesn't work. I found experimentally that you need to set modifier: shift for this to work 😊

fdncred · 2024-10-07T13:12:02Z

IIRC, I think the issue was less about char_; and more about some special unicode character that aren't on english keyboards, although it is also about char_;. Seems like the discussion for this was on discord somewhere but it was like a double u character now W but char_uu but one character. Maybe it would be better to have that as an example.

Found it: https://discord.com/channels/601130461678272522/614593951969574961/1292426188765265951

{
    name: enter_vi_insert_mode
    modifier: none
    keycode: char_ш
    mode: vi_normal
    event: { *Some magic to change to vi_insert ??* }
}

JustForFun88 · 2024-10-07T13:29:04Z

IIRC, I think the issue was less about char_; and more about some special unicode character that aren't on english keyboards, although it is also about char_;. Seems like the discussion for this was on discord somewhere but it was like a double u character now W but char_uu but one character. Maybe it would be better to have that as an example.

Found it: https://discord.com/channels/601130461678272522/614593951969574961/1292426188765265951
{
    name: enter_vi_insert_mode
    modifier: none
    keycode: char_ш
    mode: vi_normal
    event: { *Some magic to change to vi_insert ??* }
}

Oh, that's a different issue. I have some thoughts about adding a separate variant in EventType, like EventType::Command, and putting something like Command::EnterViInsert there. But I'm not sure yet. Another option would be to have KeybindingsMode store all three: emacs_keybindings, insert_keybindings, normal_keybindings, so that the Nushell mode can be changed on the fly.
If, of course, the Nushell team considers any of these changes acceptable.

fdncred · 2024-10-07T13:35:54Z

I know getting insert mode to work is a different issue but i thought char_ш wouldn't work either, just as a key binding?

sholderbach · 2024-10-07T13:39:15Z

Found it: discord.com/channels/601130461678272522/614593951969574961/1292426188765265951

{
    name: enter_vi_insert_mode
    modifier: none
    keycode: char_ш
    mode: vi_normal
    event: { *Some magic to change to vi_insert ??* }
}

Here the problem seems more towards what can be mapped and not the char_ dispatch. We permit a single char (so anything that is one Unicode codepoint, not joined) as this the unit that we can decode via crossterm.
From a readability and usability perspective I personally prefer that this binding is WYSIWYG. If you can generate character in the config with your keyboard you most likely can trigger the key via crossterm. One argument I can see is one might disambiguate via the Unicode id if you meant the U+003B ; semicolon or the confusable Greek question mark U+037E ; (but not sure what the keyboards/terminals do there)
The cyrillic alphabet as found on the keyboard also produces single codepoint chars e.g. й is its own char (2-bytes in UTF-8) and not some joined grapheme by combining и and some grave diacritic mark.

Our vi mode parser itself is not very advanced at the moment so doesn't permit remapping of anything in the state machine for the normal mode key sequences. So if you :noremap your way out of the incompatibilities of vim with non-QWERTY keyboards, we can't provide an easy way out at the moment.

JustForFun88 · 2024-10-07T13:42:52Z

I know getting insert mode to work is a different issue but i thought char_ш wouldn't work either, just as a key binding?

No, just like the key binding char_ш works, there are no problems with that.

fdncred · 2024-10-07T13:45:06Z

No, just like the key binding char_ш works, there are no problems with that.

ok, my mistake. so, this PR is really about char_; alone then. still seems reasonable to me.

JustForFun88 · 2024-10-07T13:59:00Z

Here the problem seems more towards what can be mapped and not the char_ dispatch. We permit a single char (so anything that is one Unicode codepoint, not joined) as this the unit that we can decode via crossterm. From a readability and usability perspective I personally prefer that this binding is WYSIWYG. If you can generate character in the config with your keyboard you most likely can trigger the key via crossterm.

I have to disagree on one point: the current implementation doesn't distinguish between uppercase and lowercase Latin characters, meaning you can't assign different actions to A and a, though this is possible for non-Latin characters. Therefore, WYSIWYG isn't fully achieved for Latin characters 🙂.

sholderbach · 2024-10-08T12:36:14Z

Therefore, WYSIWYG isn't fully achieved for Latin characters 🙂.

Yeah that lazy to_ascii_lowercase makes it a bit lax. In the history of crossterm there was a time where the capitalization of the shifted characters was less clearly defined so we probably went with that. But for all the ASCII alphabetical characters at least shift is properly detected (in contrast to the Ctrl versions)

sholderbach

I think I made clear that I am not burning for the feature, but the implementation is sound and it can possibly disambiguate characters on international keyboards. I would not recommend that we use any of those char_u... expressions in the default config and it creeps up the number of things a user may get wrong in their config.

But let's give it a try and make sure that we have some documentation on it.

fdncred · 2024-10-08T13:51:16Z

@JustForFun88 it would be good to have some documentation on this change in the upcoming release blog if you have time? here's the current PR that people contribute things like this to nushell/nushell.github.io#1566

JustForFun88 · 2024-10-08T14:05:29Z

@JustForFun88 it would be good to have some documentation on this change in the upcoming release blog if you have time? here's the current PR that people contribute things like this to nushell/nushell.github.io#1566

I think that I can prepare PR tomorrow

fdncred reviewed Oct 7, 2024

View reviewed changes

crates/nu-cli/src/reedline_config.rs Show resolved Hide resolved

sholderbach approved these changes Oct 8, 2024

View reviewed changes

sholderbach merged commit 55c3fc9 into nushell:main Oct 8, 2024

hustcer added this to the v0.99.0 milestone Oct 8, 2024

JustForFun88 deleted the nushell-keybinding-unicode branch October 8, 2024 14:14

JustForFun88 mentioned this pull request Oct 9, 2024

Added changes to release notes for version 0.99.0 due to nushell/#14020 nushell/nushell.github.io#1583

Merged

fdncred added the notes:mention Include the release notes summary in the "Hall of Fame" section label Oct 9, 2024

This was referenced Oct 10, 2024

Refactor config updates #13802

Merged

Introduce footer_inheritance option #14070

Merged

Conversation

JustForFun88 commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation:

User-Facing Changes

Uh oh!

fdncred commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

sholderbach commented Oct 7, 2024

Uh oh!

JustForFun88 commented Oct 7, 2024

Uh oh!

fdncred commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JustForFun88 commented Oct 7, 2024

Uh oh!

fdncred commented Oct 7, 2024

Uh oh!

sholderbach commented Oct 7, 2024

Uh oh!

JustForFun88 commented Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fdncred commented Oct 7, 2024

Uh oh!

JustForFun88 commented Oct 7, 2024

Uh oh!

sholderbach commented Oct 8, 2024

Uh oh!

sholderbach left a comment

Choose a reason for hiding this comment

Uh oh!

fdncred commented Oct 8, 2024

Uh oh!

JustForFun88 commented Oct 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

JustForFun88 commented Oct 7, 2024 •

edited

Loading

fdncred commented Oct 7, 2024 •

edited

Loading

fdncred commented Oct 7, 2024 •

edited

Loading

JustForFun88 commented Oct 7, 2024 •

edited

Loading