fix(common/core/web): optimizes transform construction #5248
Conversation
let text = `Did you ever hear the Tragedy of Darth Plagueis the wise? I thought not.
It's not a story the Jedi would tell you. It's a Sith legend. Darth Plagueis was a
Dark Lord of the Sith, so powerful and so wise he could use the Force to influence
the midichlorians to create life... He had such a knowledge of the dark side that
he could even keep the ones he cared about from dying. The dark side of the Force
is a pathway to many abilities some consider to be unnatural. He became so powerful...
the only thing he was afraid of was losing his power, which eventually, of course,
he did. Unfortunately, he taught his apprentice everything he knew, then his
apprentice killed him in his sleep. It's ironic he could save others from death,
but not himself.`; // Sheev Palpatine, in the Star Wars prequels.
Had to pick something to use for a "long text" test case.
When I copied this 512 times over, the unit test took nearly a second to run, somehow, for both the original and the new versions. (I definitely expected the original to take longer.) Not sure if there was some odd caching effect happening or what.
Using a single copy of this as the initial value of the "robust testing" page's text area works quite well for interactively testing the change. Simply insert it within the element here, between the tags:
mcdurdin
left a comment
Well, the algorithm change does make a big difference, for sure. The same test on my Android phone went from 400+ msec script time to 30 msec script time. That's a big enough improvement that we can probably justify merging this and backporting to stable-14.0.
Still, it does just move the goalposts. As expected, when I tested with longer documents, I observed linear growth in script time. So this still fails with long documents -- and not all that long, really: with just over one page of text in a note, I saw 160 msec per key event, which is barely tolerable. This is on a relatively fast device (for Android), so slower devices would probably be completely unusable with that much content.
The big problem in this whole algorithm is that it is always scanning the whole document. That's not limited to this one function; it applies to the whole approach, I think. That seems completely unnecessary: we're never going to have a keyboard that affects more than the last few characters.
We should be working with a sliding window, say 128 code units either side of the caret. Is this possible to achieve? We may still need to iterate through the document once to get our relative code point base for the sliding window but that should be done just once for a keystroke and would then be relatively inconsequential.
I know that all our indexing is by document position, so it may be a significant challenge to rewrite it. But improved algorithms are only going to take us so far and I think we may need to consider this for a full fix.
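Purely as an illustration of the sliding-window idea (the names and types below are hypothetical, not actual KMW APIs), the extraction step might look something like this, keeping the window's document offset so window-relative indices can be translated back to document positions:

```typescript
// Hypothetical sketch: take up to WINDOW_SIZE code units on either side of
// the caret and remember where the window starts within the full document.
const WINDOW_SIZE = 128;

interface ContextWindow {
  text: string;       // the slice of the document visible to the engine
  startIndex: number; // document offset of text[0]
}

function getContextWindow(doc: string, caret: number): ContextWindow {
  let start = Math.max(0, caret - WINDOW_SIZE);
  let end = Math.min(doc.length, caret + WINDOW_SIZE);
  // Don't split a surrogate pair at either window edge; widen the window
  // by one code unit if a boundary lands on a low surrogate.
  const isLowSurrogate = (c: number) => c >= 0xdc00 && c <= 0xdfff;
  if (start > 0 && isLowSurrogate(doc.charCodeAt(start))) start--;
  if (end < doc.length && isLowSurrogate(doc.charCodeAt(end))) end++;
  return { text: doc.substring(start, end), startIndex: start };
}
```

The one-time document scan mentioned above would establish the code-point base for `startIndex`; after that, every keystroke only ever touches the window.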
As noted, this is important... but is also definitely not trivial. The major pain point would be retooling how suggestions and reversions are applied. The process for both involves comparing the "before" snapshot with the current context state, which gets tricky when context is represented by sliding windows. It does seem possible, though. I do think we should have a "design" round before making such a change. There are a number of ramifications to think through, especially given how critical context synchronization is when embedded within the mobile apps.
Agreed. Let's go ahead with this fix; as a stop-gap it will definitely help, but let's continue to consider a full fix in a future update.
mcdurdin
left a comment
LGTM. Let's cherry-pick this to stable-14.0
Future work documented as #5258.
Changes in this pull request will be available for download in Keyman version 15.0.65-alpha |
Addresses #5246.
In an isolated, headless unit test environment, this didn't seem to show much improvement despite changing a particularly inefficient part of the algorithm from O(N^2) to O(N). (Maybe there were caching effects somehow?)
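For illustration only (this is not the actual patch, and `Transform` here is a stand-in type), the kind of O(N) transform construction at play can be sketched as a single-pass common-prefix/common-suffix trim between the old and new context:

```typescript
// Illustrative sketch: derive a keystroke transform (code units to delete
// left of the caret, plus text to insert) by trimming the shared prefix and
// suffix of the old and new context in one linear pass, instead of
// repeatedly re-scanning the text.
interface Transform {
  deleteLeft: number;
  insert: string;
}

function buildTransform(oldText: string, newText: string): Transform {
  const maxPrefix = Math.min(oldText.length, newText.length);
  // Length of the shared prefix.
  let prefix = 0;
  while (prefix < maxPrefix && oldText[prefix] === newText[prefix]) {
    prefix++;
  }
  // Length of the shared suffix, not overlapping the prefix.
  let suffix = 0;
  while (
    suffix < maxPrefix - prefix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }
  return {
    deleteLeft: oldText.length - prefix - suffix,
    insert: newText.substring(prefix, newText.length - suffix),
  };
}
```

Each character is examined at most twice, so the cost stays linear in the context length -- which is exactly why the sliding-window discussion above still matters for very long documents.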
That said... it did make a remarkable difference in responsiveness when I used the KMW "predictive text: robust testing" page; keystroke processing as a whole took less time than focusing operations as a result.
I want to test it out a bit more before taking this out of draft, but it does seem to yield some significant improvement. Things still struggle with very long text, of course.