Problem
While investigating adding wasm support to py-tree-sitter, I noticed that it isn't possible to reuse TSWasmStore objects. The goal is to create a WASM store, load a bunch of languages into it, and then parse a bunch of files in those languages.
- Attempt 1: Create a store, load all languages into it, create a parser for each language, share the store between all parsers.
Problem 1: The parser assumes ownership over the passed store and frees it in ts_parser_set_wasm_store (called from ts_parser_delete).
- Attempt 2: Create a store, load all languages into it, create a parser for each language, using a new store for each parser (inspired by this comment).
Problem 2: The store assumes ownership over its engine and frees it in ts_wasm_store_delete.
- Attempt 3: Same as before, but using a separate engine for each parser.
Problem 3: It is not possible to use a type from another engine in a store.
- Attempt 4: For each parser, create a separate engine, dedicated store, load all relevant languages into the store.
Problem 4: Non-ergonomic and inefficient.
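To make Problems 1 and 2 concrete, here is a toy model of the ownership rules as I described them above. It is not tree-sitter code: MockEngine, MockStore, MockParser, and every function name are hypothetical stand-ins, and each "delete" only increments a counter instead of calling free, so the double delete can be observed without crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the engine, TSWasmStore, and TSParser. */
typedef struct { int delete_calls; } MockEngine;
typedef struct { MockEngine *engine; int delete_calls; } MockStore;
typedef struct { MockStore *store; } MockParser;

/* Models Problem 2: deleting a store also deletes the engine it was given. */
void mock_store_delete(MockStore *s) {
  assert(s != NULL);
  s->delete_calls++;
  s->engine->delete_calls++;
}

/* Models Problem 1: the parser deletes whatever store it currently holds. */
void mock_parser_set_store(MockParser *p, MockStore *s) {
  if (p->store) mock_store_delete(p->store);
  p->store = s;
}

/* Deleting a parser goes through the setter, as described in Problem 1. */
void mock_parser_delete(MockParser *p) {
  mock_parser_set_store(p, NULL);
}

/* Attempt 1: two parsers share one store. Returns how often it is deleted. */
int attempt1_store_deletes(void) {
  MockEngine engine = {0};
  MockStore store = {&engine, 0};
  MockParser a = {NULL}, b = {NULL};
  mock_parser_set_store(&a, &store);
  mock_parser_set_store(&b, &store);
  mock_parser_delete(&a);
  mock_parser_delete(&b);
  return store.delete_calls;
}

/* Attempt 2: two stores share one engine. Returns how often it is deleted. */
int attempt2_engine_deletes(void) {
  MockEngine engine = {0};
  MockStore first = {&engine, 0}, second = {&engine, 0};
  mock_store_delete(&first);
  mock_store_delete(&second);
  return engine.delete_calls;
}
```

In this model both functions report two deletes of a single object, which in the real library would correspond to a double free.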
Now, I don't know the point of sharing a store between languages. I assume it's because embedded languages need to be in the same store, but I'm just speculating. I can certainly see value in having a different store for each parser, to control memory usage. Based on this, I suggest that the ideal architecture lets parsers, languages, and stores share the underlying resources through reference counting (this is already how things are implemented internally in wasmtime):
- In wasmtime, create a C API to expose cloning engine references.
- Modify ts_wasm_store_new to clone its passed engine reference.
- Modify TSWasmStore to be a reference-counted object.
- Deprecate ts_parser_take_wasm_store; if necessary, replace it with a ts_parser_get_wasm_store method that returns a new reference.
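A minimal sketch of what the proposed ownership model could look like, assuming simple non-atomic reference counts. Engine, Store, Parser, and all functions below are hypothetical illustrations and not the existing tree-sitter or wasmtime API; parser_get_store plays the role of the suggested ts_parser_get_wasm_store.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins: NOT the real wasm_engine_t, TSWasmStore, or TSParser. */
typedef struct { int refcount; } Engine;
typedef struct { int refcount; Engine *engine; } Store;
typedef struct { Store *store; } Parser;

static int live_objects = 0; /* live engines + stores, tracked for the demo only */

static void *xmalloc(size_t size) {
  void *p = malloc(size);
  if (!p) abort();
  return p;
}

Engine *engine_new(void) {
  Engine *e = xmalloc(sizeof *e);
  e->refcount = 1;
  live_objects++;
  return e;
}

Engine *engine_retain(Engine *e) { e->refcount++; return e; }

void engine_release(Engine *e) {
  assert(e->refcount > 0);
  if (--e->refcount == 0) { live_objects--; free(e); }
}

/* The store takes its own engine reference instead of stealing the caller's. */
Store *store_new(Engine *engine) {
  Store *s = xmalloc(sizeof *s);
  s->refcount = 1;
  s->engine = engine_retain(engine);
  live_objects++;
  return s;
}

Store *store_retain(Store *s) { s->refcount++; return s; }

void store_release(Store *s) {
  assert(s->refcount > 0);
  if (--s->refcount == 0) {
    engine_release(s->engine);
    live_objects--;
    free(s);
  }
}

Parser *parser_new(void) {
  Parser *p = xmalloc(sizeof *p);
  p->store = NULL;
  return p;
}

/* Retain before release so that re-setting the same store is safe. */
void parser_set_store(Parser *p, Store *s) {
  if (s) store_retain(s);
  if (p->store) store_release(p->store);
  p->store = s;
}

/* Returns a new reference that the caller must release. */
Store *parser_get_store(Parser *p) {
  return p->store ? store_retain(p->store) : NULL;
}

void parser_delete(Parser *p) {
  parser_set_store(p, NULL);
  free(p);
}

/* One engine, one store, two parsers; returns the number of leaked objects. */
int demo_shared_store(void) {
  Engine *engine = engine_new();
  Store *store = store_new(engine);
  Parser *a = parser_new();
  Parser *b = parser_new();
  parser_set_store(a, store);
  parser_set_store(b, store);
  engine_release(engine); /* caller drops its own references early */
  store_release(store);
  parser_delete(a);       /* store stays alive: b still holds a reference */
  parser_delete(b);       /* last reference gone: store, then engine, freed */
  return live_objects;
}
```

With this scheme the order in which the caller releases the engine, the store, and the parsers no longer matters, which is the property a binding such as py-tree-sitter needs. A real implementation would likely need atomic counters if stores can be shared across threads.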
Would such a set of modifications be approved in a pull request? Also, I'm not sure what the use case is for ts_parser_take_wasm_store, so I'd appreciate feedback on that aspect.
Steps to reproduce
Create a WASM store, load a bunch of languages into it, create a parser for each language, free everything.
Expected behavior
No crash from double-free.
Tree-sitter version (tree-sitter --version)
Tree-sitter 0.22.6
Operating system/version
macOS 12.7.1