Validate Wasm language memory reads #5569

Merged
maxbrunsfeld merged 1 commit into tree-sitter:master from ZX41R:fix-wasm-language-memory-bounds on May 6, 2026

ZX41R (Contributor) commented on May 4, 2026

Validate offsets from LanguageInWasmMemory before copying language data out of Wasm memory.

The language descriptor is provided by the module being loaded. Before this change, offsets such as parse_table, symbol_names, lex_modes, and the alias/supertype tables were used directly as indexes into the store's memory buffer. A malformed module could point one of those fields outside the current linear memory and make the host process read through an invalid pointer while loading the language.
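To illustrate why a module-supplied offset needs care, here is a minimal sketch of a bounds check. It is not the code from this patch; the function names and the 32-bit sizes are assumptions made for the example. It contrasts a naive check, whose 32-bit sum can wrap around, with a check that compares the length against the space remaining after the offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical naive check: the 32-bit sum can wrap around, so a huge
// offset may appear to be in bounds.
bool naive_in_bounds(uint32_t memory_size, uint32_t offset, uint32_t length) {
  return (uint32_t)(offset + length) <= memory_size;
}

// Hypothetical safer check: never computes offset + length, so it cannot wrap.
bool range_in_bounds(uint32_t memory_size, uint32_t offset, uint32_t length) {
  return offset <= memory_size && length <= memory_size - offset;
}

void range_in_bounds_examples(void) {
  uint32_t one_page = 65536u; // size of one Wasm page in bytes

  // 0xfffffff0 + 0x20 wraps to 0x10, which fools the naive check.
  assert(naive_in_bounds(one_page, 0xfffffff0u, 0x20u));
  assert(!range_in_bounds(one_page, 0xfffffff0u, 0x20u));

  // Ordinary cases: a full-page read, an empty read at the end, an overrun.
  assert(range_in_bounds(one_page, 0u, 65536u));
  assert(range_in_bounds(one_page, 65536u, 0u));
  assert(!range_in_bounds(one_page, 65535u, 2u));
}
```

Comparing against the remaining space keeps the check correct for any combination of offset and length that a module might supply.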

This adds a small checked-memory wrapper for Wasm language loading and routes descriptor reads, table copies, string reads, and the alias-map scan through it. Invalid descriptor addresses now fail loading with TSWasmErrorKindInstantiate.
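As a rough idea of what a checked-memory wrapper can look like, the sketch below pairs a memory pointer with its size and routes table copies and string reads through bounds-checked helpers. `CheckedMemory`, `checked_memory_copy`, and `checked_memory_strlen` are hypothetical names chosen for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical view of the store's linear memory (names are illustrative).
typedef struct {
  const uint8_t *data;
  size_t size;
} CheckedMemory;

// Copy `count` elements of `element_size` bytes starting at `offset`.
// Returns false, leaving `dest` untouched, if the range is out of bounds.
bool checked_memory_copy(const CheckedMemory *memory, void *dest,
                         uint32_t offset, uint32_t count, uint32_t element_size) {
  uint64_t length = (uint64_t)count * element_size; // fits in 64 bits
  if (offset > memory->size) return false;
  if (length > memory->size - offset) return false;
  if (length > 0) memcpy(dest, memory->data + offset, (size_t)length);
  return true;
}

// Measure a NUL-terminated string at `offset` without reading past the end
// of memory. Returns false if no terminator is found in bounds.
bool checked_memory_strlen(const CheckedMemory *memory, uint32_t offset,
                           size_t *length_out) {
  if (offset >= memory->size) return false;
  const uint8_t *start = memory->data + offset;
  const uint8_t *end = memchr(start, 0, memory->size - offset);
  if (!end) return false;
  *length_out = (size_t)(end - start);
  return true;
}

void checked_memory_examples(void) {
  // "ab\0", three data bytes, then an unterminated "xy" at the very end.
  uint8_t bytes[8] = {'a', 'b', 0, 1, 2, 3, 'x', 'y'};
  CheckedMemory memory = {bytes, sizeof bytes};
  uint8_t out[4] = {0};
  size_t length = 0;

  assert(checked_memory_copy(&memory, out, 3, 3, 1) && out[2] == 3);
  assert(!checked_memory_copy(&memory, out, 6, 3, 1));          // runs off the end
  assert(!checked_memory_copy(&memory, out, 0, UINT32_MAX, 4)); // huge count
  assert(checked_memory_strlen(&memory, 0, &length) && length == 2);
  assert(!checked_memory_strlen(&memory, 6, &length));          // no terminator
}
```

In a loader built this way, any helper returning false would be turned into a load failure, which in this PR is reported as TSWasmErrorKindInstantiate.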

Local checks:

  • Wasm-enabled CMake build with ASan/UBSan
  • normal CMake build with ASan/UBSan
  • malformed local Wasm module with parse_table = 0x7ffffff0 is rejected instead of crashing
  • in-bounds control module still loads
  • git diff --check
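The malformed-module and control-module checks can be approximated without a Wasm runtime. The sketch below uses a heavily reduced, hypothetical descriptor (`ExampleDescriptor`, with assumed byte-length fields) rather than the real LanguageInWasmMemory layout, and reports the first field that points outside a memory of a given size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical, reduced descriptor: three of the offset fields named in the
// description, each paired with an assumed byte length.
typedef struct {
  uint32_t parse_table, parse_table_bytes;
  uint32_t symbol_names, symbol_names_bytes;
  uint32_t lex_modes, lex_modes_bytes;
} ExampleDescriptor;

// Returns NULL if every table fits inside `memory_size` bytes, otherwise the
// name of the first field that points out of bounds.
const char *example_descriptor_first_bad_field(const ExampleDescriptor *d,
                                               size_t memory_size) {
  const struct { const char *name; uint32_t offset, bytes; } tables[] = {
    {"parse_table", d->parse_table, d->parse_table_bytes},
    {"symbol_names", d->symbol_names, d->symbol_names_bytes},
    {"lex_modes", d->lex_modes, d->lex_modes_bytes},
  };
  for (size_t i = 0; i < sizeof tables / sizeof tables[0]; i++) {
    if (tables[i].offset > memory_size ||
        tables[i].bytes > memory_size - tables[i].offset) {
      return tables[i].name;
    }
  }
  return NULL;
}

void example_descriptor_checks(void) {
  size_t memory_size = 2 * 65536; // two Wasm pages

  // In-bounds control descriptor: every table fits.
  ExampleDescriptor control = {1024, 256, 4096, 64, 8192, 128};
  assert(example_descriptor_first_bad_field(&control, memory_size) == NULL);

  // Same descriptor with parse_table pointing far outside memory.
  ExampleDescriptor malformed = control;
  malformed.parse_table = 0x7ffffff0u;
  assert(strcmp(example_descriptor_first_bad_field(&malformed, memory_size),
                "parse_table") == 0);
}
```

Reporting the offending field name is a choice made for this example; the PR description only states that invalid descriptor addresses cause loading to fail.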

AI disclosure: I used assistant support during local investigation and repro drafting; I reviewed the code and verified the patch.

maxbrunsfeld (Contributor) commented

Thank you! Great improvement.

maxbrunsfeld merged commit 21cfae7 into tree-sitter:master on May 6, 2026
25 checks passed
clason added the ci:backport release-0.26 label on May 19, 2026
tree-sitter-ci-bot commented

Successfully created backport PR for release-0.26:

clason pushed a commit that referenced this pull request May 19, 2026
Problem: Offsets such as parse_table, symbol_names, lex_modes, and the alias/supertype tables are used directly as indexes into the store's memory buffer. A malformed module can point one of those fields outside the current linear memory and make the host process read through an invalid pointer while loading the language.

Solution: Add a small checked-memory wrapper for Wasm language loading and route descriptor reads, table copies, string reads, and the alias-map scan through it. Invalid descriptor addresses now fail loading with TSWasmErrorKindInstantiate.

(cherry picked from commit 21cfae7)

Co-authored-by: 𝙽!𝙻 <z_hakmi@estin.dz>

4 participants