perf(core): optimize V8-to-Rust string conversion with ValueView #32688
Merged
bartlomieju merged 10 commits into main on Mar 16, 2026
Leverage new APIs from rusty_v8 v146.5.0 (denoland/rusty_v8#1927) to optimize how deno_core converts V8 strings to Rust strings:

- **ValueView zero-copy for RefStr/CowStr ops**: Generated slow-path code now creates a `v8::ValueView` directly instead of allocating an 8KB stack buffer and calling `to_rust_cow_lossy`. For ASCII strings (the common case), this is true zero-copy: no allocation, no memcpy.
- **Thread-local reusable buffer for String ops**: `to_string()` now uses `v8::String::write_utf8_into` with a thread-local `String` whose allocation is reused across calls, replacing `to_rust_string_lossy`, which did two FFI calls (utf8_length + write_utf8) per conversion.
- **Use v8::latin1_to_utf8**: Replace the local byte-at-a-time Latin-1→UTF-8 transcoder with the SIMD-friendly version from rusty_v8 that processes 8 bytes at a time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
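To illustrate the thread-local reusable buffer idea from the commit above, here is a minimal std-only sketch. It does not use rusty_v8: `write_utf16_into`, `to_string_reusing_buffer`, and `scratch_capacity` are hypothetical names, and the UTF-16 slice merely stands in for a V8 string. The assumption is that the real code follows a similar "clear, refill, copy out" pattern, not that it looks like this.

```rust
use std::cell::RefCell;

thread_local! {
    // Scratch buffer whose heap allocation survives between calls on this thread.
    static SCRATCH: RefCell<String> = RefCell::new(String::new());
}

/// Hypothetical stand-in for a "write UTF-8 into this String" API:
/// lossily decodes UTF-16 code units and appends them to `out`.
fn write_utf16_into(units: &[u16], out: &mut String) {
    out.extend(
        char::decode_utf16(units.iter().copied())
            .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER)),
    );
}

/// Converts `units` to an owned String, decoding into the thread-local
/// scratch buffer first so the decode step does not grow a fresh buffer.
fn to_string_reusing_buffer(units: &[u16]) -> String {
    SCRATCH.with(|cell| {
        let mut buf = cell.borrow_mut();
        buf.clear(); // drops the contents but keeps the capacity
        write_utf16_into(units, &mut *buf);
        buf.as_str().to_owned() // single exact-size allocation for the result
    })
}

/// Reports the current capacity of this thread's scratch buffer.
fn scratch_capacity() -> usize {
    SCRATCH.with(|cell| cell.borrow().capacity())
}
```

After the first call has grown the scratch buffer, later conversions of equal or smaller strings on the same thread decode without reallocating it; only the returned `String` is allocated.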
Replace `to_rust_string_lossy` (two-pass: utf8_length pre-scan + write) with `write_utf8_into` (single-pass via ValueView) in three areas:

- **serde_v8/de.rs**: Simplify `to_utf8` from a two-tier fast/slow approach (20% over-allocation + utf8_length fallback) to a single `write_utf8_into` call.
- **convert.rs**: `FromV8 for String` now uses `write_utf8_into` instead of `Value::to_rust_string_lossy`.
- **error.rs**: Replace all 13 `to_rust_string_lossy` calls in error formatting with `write_utf8_into` via a local `v8_to_rust_string` helper.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `to_rust_string_lossy` (two-pass: utf8_length pre-scan + write) with `write_utf8_into` (single-pass via ValueView) in error.rs. This eliminates the utf8_length FFI pre-scan for error/stack trace formatting.

Note: benchmarks showed that `write_utf8_into` is slower than the existing `write_utf8_uninit_v2` fast path for single-use string conversions (serde, FromV8) due to ValueView per-call flattening overhead. The ValueView approach wins when the view stays alive (as in the op codegen change from the previous commit) but not for one-shot conversions. Serde and convert paths are left unchanged.

Also adds `#[serde] String` and `#[string] String` benchmarks to ops_sync to better cover these code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hecks

Instead of calling `value.type_of(scope).to_rust_string_lossy()`, which allocates a String on every WebIDL type check, use V8's direct boolean type check methods (`is_undefined()`, `is_boolean()`, etc.), which are simple tag comparisons with no allocation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
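The difference between the two styles of type check can be sketched without V8. The `Value` enum below is a hypothetical stand-in for a V8 value, not the rusty_v8 type; only the shape of the comparison (allocate a type-name string and compare it, versus check a tag) is meant to carry over.

```rust
/// Hypothetical dynamically typed value standing in for a V8 value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Undefined,
    Boolean(bool),
    Number(f64),
    Str(String),
}

impl Value {
    /// Allocating path: builds a fresh String naming the type on every call.
    fn type_of(&self) -> String {
        match self {
            Value::Undefined => "undefined",
            Value::Boolean(_) => "boolean",
            Value::Number(_) => "number",
            Value::Str(_) => "string",
        }
        .to_string()
    }

    /// Tag checks: a discriminant comparison, no allocation.
    fn is_undefined(&self) -> bool {
        matches!(self, Value::Undefined)
    }

    fn is_boolean(&self) -> bool {
        matches!(self, Value::Boolean(_))
    }
}

/// Before: allocate the type name, then compare strings.
fn is_boolean_slow(v: &Value) -> bool {
    v.type_of() == "boolean"
}

/// After: ask the value directly.
fn is_boolean_fast(v: &Value) -> bool {
    v.is_boolean()
}
```

Both functions return the same answer for every value; the second simply skips the intermediate `String`.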
…lsewhere

ValueView's DisallowGarbageCollection scope persisted across the op call body, causing V8 SIGTRAP crashes in reentrant ops like op_import_sync that re-enter V8 for module evaluation.

Revert the slow-path codegen for #[string] &str / Cow<str> back to the stack buffer + to_str approach (which creates and drops ValueView inside to_rust_cow_lossy, so the DisallowGC scope doesn't leak).

Additionally, replace to_rust_string_lossy with to_rust_cow_lossy (stack buffer, zero-copy for ASCII) in several hot paths:

- WebIDL enum matching
- Module resolution specifiers
- Source mapping URL extraction
- import.meta.resolve specifier/referrer
- Node stack frame script name inspection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
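The "zero-copy for ASCII" behaviour mentioned for `to_rust_cow_lossy` can be approximated in plain Rust with `Cow`. This is a simplified sketch under stated assumptions: `latin1_to_cow` is a hypothetical function, the input is a raw Latin-1 byte slice rather than a V8 string, and the non-ASCII branch heap-allocates instead of using a caller-provided stack buffer as the real API is described as doing.

```rust
use std::borrow::Cow;

/// Views Latin-1 bytes as a Rust string.
/// Borrows the input when every byte is ASCII (no allocation, no copy);
/// otherwise transcodes into a newly allocated String.
fn latin1_to_cow(bytes: &[u8]) -> Cow<'_, str> {
    if bytes.is_ascii() {
        // ASCII is a subset of UTF-8, so the bytes can be reused as-is.
        Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"))
    } else {
        // Each Latin-1 byte maps to the Unicode code point of the same value.
        Cow::Owned(bytes.iter().map(|&b| b as char).collect())
    }
}
```

Callers that only need a `&str` (for example to match an enum name or inspect a specifier) can deref the `Cow` and never pay for an allocation in the ASCII case.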
Remove test case that passed a v8 Array to the WebIDL enum converter, which is incompatible with the new `to_str` optimization that doesn't call toString() on non-string values.

Also add `test-core` subcommand to ./x that runs the deno_core test suite using nextest, matching CI's `deno_core test` job.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…test

Pre-compute the context prefix string at proc-macro time instead of using format! with 4 arguments, which prettyplease formats differently on macOS vs Linux. This produces deterministic output across platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
devsnek
approved these changes
Mar 16, 2026
libs/core/runtime/ops.rs
Outdated
///
/// The caller must guarantee that `value` is a `v8::String`.
#[inline(always)]
pub unsafe fn value_view_from_value<'a>(
libs/core/runtime/ops.rs
Outdated
  scope: &mut v8::Isolate,
  value: v8::Local<'a, v8::Value>,
) -> v8::ValueView<'a> {
  let string: v8::Local<'a, v8::String> = unsafe { std::mem::transmute(value) };
Member
Value::cast_unchecked is slightly safer. debug_assert!(value.is_string()) might be nice too.
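The review suggestion (a named unchecked cast plus a `debug_assert!` instead of a bare `transmute`) can be sketched in plain Rust. Everything below is hypothetical: `Value` and `StrRef` are stand-ins for the V8 handle types, and this `cast_unchecked` only mirrors the name the reviewer mentions, not its real signature.

```rust
/// Hypothetical dynamically typed value, standing in for a V8 value handle.
#[derive(Debug)]
enum Value {
    Number(f64),
    Str(String),
}

impl Value {
    fn is_string(&self) -> bool {
        matches!(self, Value::Str(_))
    }
}

/// Typed view over a `Value` that is known to hold a string.
#[derive(Debug, Clone, Copy)]
struct StrRef<'a>(&'a Value);

impl<'a> StrRef<'a> {
    /// # Safety
    /// The caller must guarantee that `value` is a string.
    unsafe fn cast_unchecked(value: &'a Value) -> Self {
        // Checked in debug builds only; compiled out in release builds.
        debug_assert!(value.is_string(), "cast_unchecked on a non-string value");
        StrRef(value)
    }

    fn as_str(&self) -> &'a str {
        match self.0 {
            Value::Str(s) => s.as_str(),
            // Only reachable if the safety contract above was violated.
            _ => unreachable!("StrRef wraps a non-string value"),
        }
    }
}
```

The cast keeps the same cost as before in release builds, while debug and test builds fail loudly at the call site if a non-string ever reaches it.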
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Leverages new APIs from rusty_v8 v146.5.0 (denoland/rusty_v8#1927) to optimize how deno_core converts V8 strings to Rust strings. Implements follow-up optimizations D and G from that PR's roadmap.
Changes
D. Thread-local reusable buffer for `#[string] String` ops: `to_string()` now uses `v8::String::write_utf8_into` with a thread-local `String` whose allocation is reused across calls, replacing `to_rust_string_lossy`, which did two FFI calls (`utf8_length` + `write_utf8`) per conversion.

G. Use `v8::latin1_to_utf8`: Replaced the local byte-at-a-time Latin-1→UTF-8 transcoder with the SIMD-friendly version from rusty_v8 that processes 8 bytes at a time.

WebIDL `type_of()` optimization: Replaced the `value.type_of(scope).to_rust_string_lossy()` string allocation + match with direct `is_*()` boolean type checks. Eliminates a string allocation on every WebIDL type check.

Before:

After:

`to_rust_cow_lossy` migration: Replaced `to_rust_string_lossy` (heap-allocating) with `to_rust_cow_lossy` (stack-buffer, zero-copy for ASCII) in several hot paths:

- WebIDL enum matching
- Module resolution specifiers (`module_resolve_callback`, `module_source_callback`)
- Source mapping URL extraction
- `import.meta.resolve` specifier/referrer
- Node stack frame script name inspection

Error path optimization: Replaced 13 `to_rust_string_lossy` calls in `error.rs` with `write_utf8_into` using a local buffer, reducing double-pass FFI overhead on error formatting paths.

Note on optimization F (ValueView for `#[string] &str`/`Cow<str>` ops)

Reverted. The generated slow-path code had been changed to create a `v8::ValueView` directly instead of the 8KB stack buffer, but using `ValueView` directly in the op codegen kept its `DisallowGarbageCollection` scope alive across the entire op call body. For reentrant ops like `op_import_sync` that re-enter V8 for module evaluation, this caused fatal `AllowHeapAllocationInRelease::IsAllowed()` crashes (SIGTRAP) across all platforms.

The slow-path codegen now uses the original stack buffer + `to_str` approach. Note that `to_rust_cow_lossy` in rusty_v8 v146.5.0 already uses `ValueView` internally with proper scoping (creates the view, copies data out, drops the view before returning), so the `&str`/`Cow<str>` ops still benefit from the ValueView optimization, just indirectly.

Benchmark results (`cargo bench -p deno_core --bench ops_sync`)

- `bench_op_string` (short ASCII)
- `bench_op_string_large_1000`
- `bench_op_string_large_1000000`
- `bench_op_string_large_utf8_1000`
- `bench_op_string_onebyte_large_1000000`
- `bench_op_string_option_u32`

Key wins: large ASCII strings ~30% faster, large UTF-8 strings ~33% faster (both via `to_rust_cow_lossy`, which uses ValueView internally), option string ~10% faster (thread-local buffer reuse).

Test plan

- `cargo check --bin deno` passes
- `cargo test -p deno_core_testing import_sync`: previously crashing tests now pass
- `cargo test -p deno_ops`: string_ref, string_cow, webidl test cases pass
- `tools/format.js` and `tools/lint.js` pass
- `cargo bench -p deno_core --bench ops_sync --features unsafe_runtime_options`

🤖 Generated with Claude Code
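For readers unfamiliar with the Latin-1→UTF-8 transcoding behind item G, the sketch below shows one way an "8 bytes at a time" transcoder can work. It is an illustration only, not the rusty_v8 implementation: `latin1_to_utf8_chunked` and `push_latin1` are hypothetical names, and the word-sized ASCII test is an assumption about what "SIMD-friendly" might mean here.

```rust
/// Appends the UTF-8 encoding of one Latin-1 byte to `out`.
fn push_latin1(b: u8, out: &mut Vec<u8>) {
    if b < 0x80 {
        out.push(b); // ASCII: one byte, unchanged
    } else {
        // U+0080..=U+00FF encode as two UTF-8 bytes.
        out.push(0xC0 | (b >> 6));
        out.push(0x80 | (b & 0x3F));
    }
}

/// Transcodes Latin-1 bytes to UTF-8, appending to `out`.
/// Works on 8-byte chunks so that all-ASCII chunks are copied in bulk.
fn latin1_to_utf8_chunked(input: &[u8], out: &mut Vec<u8>) {
    out.reserve(input.len());
    let mut chunks = input.chunks_exact(8);
    for chunk in chunks.by_ref() {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        // A set high bit in any byte means that byte is not ASCII.
        if u64::from_ne_bytes(word) & 0x8080_8080_8080_8080 == 0 {
            out.extend_from_slice(chunk);
        } else {
            for &b in chunk {
                push_latin1(b, out);
            }
        }
    }
    // Fewer than 8 bytes left: fall back to the byte-at-a-time path.
    for &b in chunks.remainder() {
        push_latin1(b, out);
    }
}
```

The byte-at-a-time branch is still needed for mixed chunks and the tail, so the gain comes from inputs that are mostly ASCII, which matches the workloads highlighted in the benchmarks.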