Skip to content

perf(core): optimize V8-to-Rust string conversion with ValueView#32688

Merged
bartlomieju merged 10 commits intomainfrom
perf/valueview-string-optimizations
Mar 16, 2026
Merged

perf(core): optimize V8-to-Rust string conversion with ValueView#32688
bartlomieju merged 10 commits intomainfrom
perf/valueview-string-optimizations

Conversation

@bartlomieju
Copy link
Copy Markdown
Member

@bartlomieju bartlomieju commented Mar 13, 2026

Summary

Leverages new APIs from rusty_v8 v146.5.0 (denoland/rusty_v8#1927) to optimize how deno_core converts V8 strings to Rust strings. Implements follow-up optimizations D, and G from that PR's roadmap.

Changes

D. Thread-local reusable buffer for #[string] String opsto_string() now uses v8::String::write_utf8_into with a thread-local String whose allocation is reused across calls, replacing to_rust_string_lossy which did two FFI calls (utf8_length + write_utf8) per conversion.

G. Use v8::latin1_to_utf8 — Replaced the local byte-at-a-time Latin-1→UTF-8 transcoder with the SIMD-friendly version from rusty_v8 that processes 8 bytes at a time.

WebIDL type_of() optimization — Replaced value.type_of(scope).to_rust_string_lossy() string allocation + match with direct is_*() boolean type checks. Eliminates a string allocation on every WebIDL type check.

Before:

match value.type_of(scope).to_rust_string_lossy(scope).as_str() {
    "undefined" => Type::Undefined,
    "boolean" => Type::Boolean,
    // ...
}

After:

if value.is_undefined() {
    Type::Undefined
} else if value.is_boolean() {
    Type::Boolean
// ...
}

to_rust_cow_lossy migration — Replaced to_rust_string_lossy (heap-allocating) with to_rust_cow_lossy (stack-buffer, zero-copy for ASCII) in several hot paths:

  • WebIDL enum variant matching
  • Module resolution specifiers (module_resolve_callback, module_source_callback)
  • Source mapping URL extraction
  • import.meta.resolve specifier/referrer
  • Node stack frame script name inspection

Error path optimization — Replaced 13 to_rust_string_lossy calls in error.rs with write_utf8_into using a local buffer, reducing double-pass FFI overhead on error formatting paths.

Note on optimization F (ValueView for #[string] &str / Cow<str> ops)

The generated slow-path code now creates a v8::ValueView directly instead of the 8KB stack buffer.

Reverted. Using ValueView directly in the op codegen kept its DisallowGarbageCollection scope alive across the entire op call body. For reentrant ops like op_import_sync that re-enter V8 for module evaluation, this caused fatal AllowHeapAllocationInRelease::IsAllowed() crashes (SIGTRAP) across all platforms.

The slow-path codegen now uses the original stack buffer + to_str approach. Note that to_rust_cow_lossy in rusty_v8 v146.5.0 already uses ValueView internally with proper scoping (creates view, copies data out, drops view before returning), so the &str/Cow<str> ops still benefit from the ValueView optimization — just indirectly.

Benchmark results (cargo bench -p deno_core --bench ops_sync)

Benchmark Before (ns/iter) After (ns/iter) Change
bench_op_string (short ASCII) 5,459 5,443 ~same
bench_op_string_large_1000 76,030 73,201 -3.7%
bench_op_string_large_1000000 281,505 195,977 -30.4%
bench_op_string_large_utf8_1000 4,107,966 2,766,908 -32.6%
bench_op_string_onebyte_large_1000000 118,194 107,971 -8.6%
bench_op_string_option_u32 23,349 21,112 -9.6%

Key wins: large ASCII strings ~30% faster, large UTF-8 strings ~33% faster (both via to_rust_cow_lossy which uses ValueView internally), option string ~10% faster (thread-local buffer reuse).

Test plan

  • cargo check --bin deno passes
  • cargo test -p deno_core_testing import_sync — previously crashing tests now pass
  • cargo test -p deno_ops — string_ref, string_cow, webidl test cases pass
  • tools/format.js and tools/lint.js pass
  • Benchmarked with cargo bench -p deno_core --bench ops_sync --features unsafe_runtime_options

🤖 Generated with Claude Code

bartlomieju and others added 9 commits March 13, 2026 12:54
Leverage new APIs from rusty_v8 v146.5.0 (denoland/rusty_v8#1927) to
optimize how deno_core converts V8 strings to Rust strings:

- **ValueView zero-copy for RefStr/CowStr ops**: Generated slow-path code
  now creates a `v8::ValueView` directly instead of allocating an 8KB stack
  buffer and calling `to_rust_cow_lossy`. For ASCII strings (the common
  case), this is true zero-copy — no allocation, no memcpy.

- **Thread-local reusable buffer for String ops**: `to_string()` now uses
  `v8::String::write_utf8_into` with a thread-local `String` whose
  allocation is reused across calls, replacing `to_rust_string_lossy` which
  did two FFI calls (utf8_length + write_utf8) per conversion.

- **Use v8::latin1_to_utf8**: Replace the local byte-at-a-time Latin-1→UTF-8
  transcoder with the SIMD-friendly version from rusty_v8 that processes 8
  bytes at a time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `to_rust_string_lossy` (two-pass: utf8_length pre-scan + write)
with `write_utf8_into` (single-pass via ValueView) in three areas:

- **serde_v8/de.rs**: Simplify `to_utf8` from a two-tier fast/slow
  approach (20% over-allocation + utf8_length fallback) to a single
  `write_utf8_into` call.

- **convert.rs**: `FromV8 for String` now uses `write_utf8_into`
  instead of `Value::to_rust_string_lossy`.

- **error.rs**: Replace all 13 `to_rust_string_lossy` calls in error
  formatting with `write_utf8_into` via a local `v8_to_rust_string`
  helper.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `to_rust_string_lossy` (two-pass: utf8_length pre-scan + write)
with `write_utf8_into` (single-pass via ValueView) in error.rs. This
eliminates the utf8_length FFI pre-scan for error/stack trace formatting.

Note: benchmarks showed that `write_utf8_into` is slower than the existing
`write_utf8_uninit_v2` fast path for single-use string conversions (serde,
FromV8) due to ValueView per-call flattening overhead. The ValueView
approach wins when the view stays alive (as in the op codegen change from
the previous commit) but not for one-shot conversions. Serde and convert
paths are left unchanged.

Also adds `#[serde] String`, `#[string] String` benchmarks to ops_sync
to better cover these code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hecks

Instead of calling value.type_of(scope).to_rust_string_lossy() which
allocates a String on every WebIDL type check, use V8's direct boolean
type check methods (is_undefined(), is_boolean(), etc.) which are
simple tag comparisons with no allocation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lsewhere

ValueView's DisallowGarbageCollection scope persisted across the op call
body, causing V8 SIGTRAP crashes in reentrant ops like op_import_sync
that re-enter V8 for module evaluation.

Revert the slow-path codegen for #[string] &str / Cow<str> back to the
stack buffer + to_str approach (which creates and drops ValueView inside
to_rust_cow_lossy, so the DisallowGC scope doesn't leak).

Additionally, replace to_rust_string_lossy with to_rust_cow_lossy (stack
buffer, zero-copy for ASCII) in several hot paths:
- WebIDL enum matching
- Module resolution specifiers
- Source mapping URL extraction
- import.meta.resolve specifier/referrer
- Node stack frame script name inspection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove test case that passed a v8 Array to the WebIDL enum converter,
which is incompatible with the new `to_str` optimization that doesn't
call toString() on non-string values.

Also add `test-core` subcommand to ./x that runs the deno_core test
suite using nextest, matching CI's `deno_core test` job.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…test

Pre-compute the context prefix string at proc-macro time instead of
using format! with 4 arguments, which prettyplease formats differently
on macOS vs Linux. This produces deterministic output across platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
///
/// The caller must guarantee that `value` is a `v8::String`.
#[inline(always)]
pub unsafe fn value_view_from_value<'a>(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused code?

scope: &mut v8::Isolate,
value: v8::Local<'a, v8::Value>,
) -> v8::ValueView<'a> {
let string: v8::Local<'a, v8::String> = unsafe { std::mem::transmute(value) };
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value::cast_unchecked is slightly safer. debug_assert!(value.is_string()) might be nice too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bartlomieju bartlomieju enabled auto-merge (squash) March 16, 2026 11:36
@bartlomieju bartlomieju merged commit 4a19007 into main Mar 16, 2026
219 of 222 checks passed
@bartlomieju bartlomieju deleted the perf/valueview-string-optimizations branch March 16, 2026 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants