[Linux] Fix CJK input #31813

cbracken · 2022-03-03T23:36:12Z

Fixes an issue with CJK IMEs wherein a text input state update may be
sent to the framework that misleads the framework into assuming that IME
composing has ended.

As an example, when inputting Korean text, characters are built up keystroke by
keystroke until the point that either:

* the user presses space/enter to terminate composing and commit the
  character, or;
* the user presses a key such that the character currently being
  composed cannot be modified further, and the IME determines that the
  user has begun composing the next character.

The following is an example sequence of events for the latter case:

1. User presses ㅂ. Begin compose event followed by change event
   received with ㅂ. Embedder sends state update to framework.
2. User presses ㅏ. im_preedit_changed_cb with 바. Embedder sends state
   update to framework.
3. User presses ㄴ. im_preedit_changed_cb with 반. Embedder sends state
   update to framework.
4. User presses ㅏ. At this point, the current character being composed
   (반) cannot be modified in a meaningful way, and the IME determines
   that the user is typing 바 followed by 나. im_commit_cb received with
   바, immediately followed by im_preedit_changed event with 나.
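The split in the fourth keystroke can be reproduced with the standard Unicode Hangul syllable formula. The following is a toy sketch only: `compose` and `type_keys` are hypothetical helpers covering a small lead/vowel/tail subset, and real IMEs use their own composition engines.

```python
# Jamo tables in the order used by the Unicode Hangul syllable block.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ",
         "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ",
         "ㅋ", "ㅌ", "ㅍ", "ㅎ"]


def compose(lead, vowel, tail=""):
    """Combine jamo into one precomposed syllable (U+AC00 based formula)."""
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28 + TAILS.index(tail)
    return chr(0xAC00 + index)


def type_keys(keys):
    """Return the (committed, preedit) pair after each keystroke."""
    committed, lead, vowel, tail = "", "", "", ""
    states = []
    for key in keys:
        if key in VOWELS and lead and not vowel:
            vowel = key
        elif key in VOWELS and tail:
            # The syllable is full: commit lead+vowel and move the tail
            # consonant over to become the lead of the next syllable.
            committed += compose(lead, vowel)
            lead, vowel, tail = tail, key, ""
        elif key in LEADS and not lead:
            lead = key
        elif key in TAILS and vowel and not tail:
            tail = key
        else:
            raise ValueError(f"unsupported key sequence at {key!r}")
        states.append((committed, compose(lead, vowel, tail) if vowel else lead))
    return states


for committed, preedit in type_keys("ㅂㅏㄴㅏ"):
    print(repr(committed), repr(preedit))
```

The last keystroke is the only one that changes the committed text, and it changes the preedit text in the same step.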

In step 4, we previously sent two events to the framework, one
immediately after the other:

* im_commit_cb triggers the text input model to commit the current
  composing region to the string under edit. This causes the composing
  region to collapse to an empty range.
* im_preedit_changed_cb triggers the text input model to insert the new
  composing character and set the composing region to that character.

Conceptually, this is an atomic operation. The fourth keystroke causes
the 반 character to be broken into two (바 and ㄴ) and the latter to be
modified to 나. From the user's point of view, as well as from the IME's
point of view, the user has NOT stopped composing, and the composing
region has simply moved on to the next character.
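To make the intermediate state concrete, here is a minimal sketch of a text input model driven by the same callbacks. `ToyTextInputModel` and its methods are hypothetical stand-ins, not the engine's TextInputModel API; the sketch assumes the commit callback carries the committed text, as in the sequence above.

```python
class ToyTextInputModel:
    """Toy model: text under edit plus a (start, end) composing region."""

    def __init__(self):
        self.text = ""
        self.composing = (-1, -1)  # assumption: (-1, -1) means "not composing"

    def begin_composing(self):
        self.composing = (len(self.text), len(self.text))

    def update_preedit(self, preedit):
        # Replace the composing region with the new preedit text.
        start, end = self.composing
        self.text = self.text[:start] + preedit + self.text[end:]
        self.composing = (start, start + len(preedit))

    def commit(self, committed):
        # Replace the composing region with the committed text and
        # collapse the region to an empty range just after it.
        start, end = self.composing
        self.text = self.text[:start] + committed + self.text[end:]
        position = start + len(committed)
        self.composing = (position, position)

    def state(self):
        return (self.text, self.composing)


model = ToyTextInputModel()
model.begin_composing()
for preedit in ("ㅂ", "바", "반"):
    model.update_preedit(preedit)

model.commit("바")
after_commit = model.state()    # collapsed region: looks like "composing ended"
model.update_preedit("나")
after_preedit = model.state()   # composing region is back on the new character
print(after_commit, after_preedit)
```

If a state update is sent after each callback, `after_commit` is the misleading intermediate value the framework sees.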

Flutter has no concept of whether the user is composing or not other
than whether a non-empty composing region exists. As such, sending a
state update after the commit event misleads the framework into
believing that composing has ended. This triggers a serious bug:

Text fields with input formatters applied do not perform input
formatting updates while composing is active; instead they wait until
composing has ended to apply any formatting. The previous behaviour
would thus trigger input formatters each time the user's input began a
new character. This has the additional negative effect that once
formatting has been applied, the framework sends an update back to
the embedder so that the native OS text input state can be updated.
However, since the commit event is immediately followed by a
preedit change, the state has changed in the meantime, and the embedder
is left processing an update (the intermediate state sent after the
commit) which is now out of date (i.e. missing the new state from the
change event).

The source of this bug is as follows:

* Commit event for a character/compose region is sent from the engine.
  The engine TextInputModel still models its composing field as true.
  An update is sent to the framework with the committed text and an
  empty composing range such as (1, 1). Note that the engine previously
  only sent a range of (-1, -1) when composing had ended, NOT just when
  it had an empty composing region.
* Framework receives commit event and updates the text to match. The
  framework does not model the system composing state; instead its
  understanding of whether the user is composing or not is entirely
  predicated on whether the composing region is empty or not. If it is
  empty, the framework triggers input formatters, which in this case
  have no effect on the text/selection. However, the framework
  consistently models empty compose regions as (-1, -1) and resets the
  text editing value as such. Because the framework triggered a change
  to the TextEditingValue, it dutifully sends the update back to the
  engine.
* In the meantime, in parallel with the above step, the engine starts
  processing the change event immediately following the commit, and
  updates the text and composing region with the next character. This
  change is promptly stomped on by the incoming framework update.
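The three steps above can be sketched as a small simulation. This is an illustrative model only: `framework_receive` and `simulate_race` are hypothetical names, and the real exchange happens asynchronously over the text input channel rather than through direct function calls.

```python
def framework_receive(update):
    """Toy framework: return the echo it would send back, or None.

    An empty composing region is treated as "not composing" and rewritten
    to (-1, -1). If that rewrite changes the value, the change is echoed.
    """
    text, (start, end) = update
    if start == end:
        value = (text, (-1, -1))
    else:
        value = (text, (start, end))
    return value if value != update else None


def simulate_race():
    """Replay the commit + preedit-change sequence with the old behaviour."""
    after_commit = ("바", (1, 1))       # update sent after the commit event
    echo = framework_receive(after_commit)

    # Meanwhile the engine has already processed the preedit change.
    engine_state = ("바나", (1, 2))

    # The stale echo arrives and overwrites the newer engine state.
    if echo is not None:
        engine_state = echo
    return engine_state


print(simulate_race())
```

In this model the engine ends up with the text 바 and no composing region, so the 나 being composed is lost.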

To avoid this, we have the engine consistently send empty compose
regions as (-1, -1) to the framework. After the input formatter is
applied on commit, the compose region is still (-1, -1) and there are
therefore no diffs, and the framework will not send an update back to
the engine and stomp on any new state on the engine side.
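A minimal sketch of that normalization, under the assumption that the framework echoes an update only when its own normalized value differs from what it received. `composing_range_for_framework` and `framework_would_echo` are hypothetical names, not functions from the engine or framework.

```python
NO_COMPOSING = (-1, -1)


def composing_range_for_framework(start, end):
    """Report every empty composing region as (-1, -1)."""
    if start < 0 or start == end:
        return NO_COMPOSING
    return (start, end)


def framework_would_echo(update):
    """Toy framework check: does normalizing the value produce a diff?"""
    text, (start, end) = update
    normalized = (text, NO_COMPOSING if start == end else (start, end))
    return normalized != update


old_update = ("바", (1, 1))
new_update = ("바", composing_range_for_framework(1, 1))
print(framework_would_echo(old_update), framework_would_echo(new_update))
```

With the collapsed range already reported as (-1, -1), the toy framework finds nothing to change and sends nothing back.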

Longer-term, we really should add some form of versioning information to
the text edit protocol so as to detect and resolve conflicts rather than
relying entirely on not creating races in the first place.
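One possible shape for such versioning, purely as a hypothetical sketch and not part of the current text edit protocol: each engine-side edit bumps a revision counter, framework updates state which revision they were based on, and the engine drops updates based on an older revision. All names below are invented for illustration.

```python
class VersionedEngineState:
    """Toy engine state that rejects framework updates based on stale data."""

    def __init__(self):
        self.revision = 0
        self.value = ("", (-1, -1))

    def local_edit(self, value):
        """Apply an IME-driven edit; return the message for the framework."""
        self.revision += 1
        self.value = value
        return (self.revision, value)

    def apply_framework_update(self, based_on, value):
        """Apply a framework update only if it saw the latest revision."""
        if based_on != self.revision:
            return False  # stale: the engine has moved on since then
        self.revision += 1
        self.value = value
        return True


engine = VersionedEngineState()
commit_revision, _ = engine.local_edit(("바", (1, 1)))    # commit event
engine.local_edit(("바나", (1, 2)))                        # preedit change
accepted = engine.apply_framework_update(commit_revision, ("바", (-1, -1)))
print(accepted, engine.value)
```

Here the framework's echo of the commit is recognised as stale and dropped, so the newer composing state survives.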

This bug was revealed by flutter/flutter#90211
which applies an input formatter to single-line text fields in order to
suppress newlines.

Issue: flutter/flutter#97174

Pre-launch Checklist

I read the Contributor Guide and followed the process outlined there for submitting PRs.
I read the Tree Hygiene wiki page, which explains my responsibilities.
I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
I listed at least one issue that this PR fixes in the description above.
I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on
writing and running engine tests.
I updated/added relevant documentation (doc comments with ///).
I signed the CLA.
All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

flutter-dashboard · 2022-03-03T23:36:16Z

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

justinmc

LGTM 👍

Thanks for the detailed explanation in the description. Reading that, are you saying that the framework responds to collapsed composing regions like (1,1) by sending an update back to the engine with the composing region as (-1,-1)? Is that something we could fix in the framework instead of here?

cbracken · 2022-03-04T19:39:12Z

In theory we could land a fix for this in the framework, though it would touch a whole lot more code and potentially be more fragile long term. I do think we want some kind of solution other than relying on (-1, -1) being a signal that the composing range is invalid. Checking collapsed everywhere on the framework side instead might be a safer bet.

For now, this is probably the smallest change that gets the Linux embedder in line with the framework.

cbracken added the Work in progress (WIP) Not ready (yet) for review! label Mar 3, 2022

flutter-dashboard bot added the platform-linux label Mar 3, 2022

flutter-dashboard bot added the needs tests label Mar 3, 2022

cbracken requested review from justinmc and robert-ancell March 3, 2022 23:36

justinmc approved these changes Mar 4, 2022

cbracken merged commit 0aa1d3a into flutter:main Mar 4, 2022

cbracken deleted the text-input-model-dirty branch March 4, 2022 19:43

cbracken mentioned this pull request Mar 4, 2022

Linux: CJK text input broken on single-line textfields flutter/flutter#97174

Closed

engine-flutter-autoroll mentioned this pull request Mar 4, 2022

Roll Engine from 2f819b16e6bb to 5e9a906b3c44 (5 revisions) flutter/flutter#99569

Closed

engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Mar 4, 2022

0aa1d3a [Linux] Fix CJK input (flutter/engine#31813)

9f6d9dd

engine-flutter-autoroll mentioned this pull request Mar 4, 2022

Roll Engine from 2f819b16e6bb to 42212138dfd0 (8 revisions) flutter/flutter#99571

Merged

engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Mar 4, 2022

0aa1d3a [Linux] Fix CJK input (flutter/engine#31813)

297898a

cbracken mentioned this pull request May 11, 2022

[auto_submit] Include commit description when auto-merging PRs flutter/flutter#103540

Closed
