Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Conversation

@mdebbar
Copy link
Contributor

@mdebbar mdebbar commented Feb 13, 2023

This PR provides all client-ICU data to CanvasKit from browser APIs:

  • Line breaks: from v8BreakIterator (only available in Chromium).
  • Word boundaries: from Intl.Segmenter (available in all browsers except Firefox).
  • Grapheme boundaries: from Intl.Segmenter (available in all browsers except Firefox).

Fixes flutter/flutter#118801

@flutter-dashboard flutter-dashboard bot added the platform-web Code specifically for the web engine label Feb 13, 2023
Copy link
Contributor

@yjbanov yjbanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also recommend running an A/B test using some of our text benchmarks.

@override
Uint32List fragment() {
final DomIteratorWrapper<DomSegment> iterator =
createIntlSegmenter(granularity: _granularityString)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here. Is IntlSegmenter stateful? Can we memoize one for each granularity and avoid creating them on every string? This will become more relevant with wasm, where we probably want to avoid too many wasm-js hops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Regarding the wasm thing: I don't think this memoization makes a difference. The majority of chatting with js happens after creating the Intl.Segmenter (calling segment() and iterating over the result). If it becomes an issue, we can implement a JS function that wraps this entire block of code and does all the chatting on the JS side and returns the final result (i.e. Uint32List) to wasm. Not sure how this can be done in practice with dart2wasm.

cc @eyebrowsoffire to keep in mind that this might be needed for the skwasm project. It would be nice to have some kind of annotation on methods to tell the compiler that it's better to keep the entire method in JS and not compile it to wasm. I may write some thoughts around this in a new issue.

final List<LineBreakFragment> fragments = _fragmenter.fragment();

final int size = (fragments.length + 1) * 2;
final Uint32List typedArray = Uint32List(size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using CanvasKit.Malloc to avoid creating a temporary array and extra copying into wasm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up until today, I didn't really understand the benefit of using CK's Malloc. It seemed to me like the CK bindings always end up copying the arrays anyway.

But I took a closer look now, and I think I understand where this optimization is happening on the CK side:
https://github.com/google/skia/blob/deb1cf8d127986fee464946a12b248d6f17fb629/modules/canvaskit/memory.js#L126-L128

What threw me off was the name copy1dArray. But it turns out this function doesn't always copy. It first checks if the array was malloc'd and avoids the copying in that case.


/// Similar to [js_util.callMethod] but allows for providing a non-string method
/// name.
T jsCallMethod<T>(Object o, Object method, [List<Object?> args = const <Object?>[]]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to safe_browser_api.dart and wrapped as to not be publicly available to all of the engine code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's too difficult to move, can it stay private to this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the benefit of making it private, so I'll make it private.

Would you mind clarifying what's the benefit of moving it to safe_browser_api.dart?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can execute arbitrary code. It's good to avoid ad hoc calls out to JS from all over the codebase. So libraries like dom.dart and safe_browser_api.dart should provide a safe subset of the API to the rest of the engine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with making it private so it doesn't get called everywhere.

Does it make a difference if we put it in dom.dart vs safe_browser_api.dart? Like is there a special handling for that file or something?

@mdebbar
Copy link
Contributor Author

mdebbar commented Feb 23, 2023

I also recommend running an A/B test using some of our text benchmarks.

Agreed. @eyebrowsoffire is working on a PR to run tests against skwasm as well as the chromium variant of CanvasKit.

That said, all of this code is behind a flag that's not enabled yet, but we definitely need to land those tests before opening this up to users.

@mdebbar mdebbar requested a review from yjbanov February 23, 2023 22:57
/// Builds the CkParagraph with the builder and deletes the builder.
SkParagraph _buildSkParagraph() {
if (browserSupportsCanvaskitChromium) {
final String text = _paragraphBuilder.getText();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to read the text string back? Don't we already have it?

Now that I'm looking at the Skia API, it feels a little too chatty. I feel like we should be able to provide all this information - text, word breaks, graphemes, line breaks - in a single method call.

But maybe we can revisit this later. If this works, let's keep it. It's not public API anyway. Or maybe I just need some likbez on why this should be this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hterkelsen for the link 😄 I was wondering about that word myself, and thought it could've been a typo.

The reason we need to ask Skia to give us the string is because SkParagraphBuilder inserts extra text into the string. E.g. when we call addPlaceholder(...), SkParagraphBuilder inserts a magical character into the string (and this is an implementation detail that could change or even go away in the future).

A little bit of history to explain how the API became chatty:

Initially, we had a single method called buildWithClientInfo() (to be called instead of build()) which takes parameters for all the information it needs (i.e. line breaks, word breaks, graphemes, (and bidi at the time)).

Then we started playing with ICU4X for the bidi information. ICU4X operates on UTF8, but browser APIs operate on UTF16. So we needed a way to indicate to Skia whether the array we are passing is UTF8 or UTF16 indices. That's how the set***Utf8() and set***Utf16() APIs were born.

Copy link
Contributor

@yjbanov yjbanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@harryterkelsen harryterkelsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@joshualitt joshualitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS Interop LGTM

@mdebbar mdebbar added the autosubmit Merge PR when tree becomes green via auto submit App label Feb 24, 2023
@auto-submit auto-submit bot merged commit b9e1359 into flutter:main Feb 24, 2023
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Feb 24, 2023
@mdebbar mdebbar deleted the ck_iculess_glue branch June 22, 2023 21:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

autosubmit Merge PR when tree becomes green via auto submit App platform-web Code specifically for the web engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[web] Implement the glue code for ICU-less CanvasKit

4 participants