Rewrite the web bindings in TypeScript (& more) #4121

Merged
amaanq merged 16 commits into tree-sitter:master from amaanq:web-ts-ts
Jan 21, 2025

@amaanq
Member

@amaanq amaanq commented Jan 16, 2025

Problem

The web bindings feel hard to maintain and hard to debug. Additionally, there are a few missing components that would aid downstream users immensely: bundled sourcemaps, and an option to install a "debug" build of web-tree-sitter.

Solution

We've implemented the missing components outlined above and rewritten the library in TypeScript, which feels much easier to navigate now that the types are laid out explicitly (including the FFI functions!). This rewrite also brings some (hopefully) positive changes to the build process.

Because we're using TypeScript, we've added a new compilation step that relies on esbuild to bundle our TypeScript files into one JavaScript file with a sourcemap.

However, sourcemaps were a particular pain to implement for the final JS file, because we have two "phases" of compilation - one from TS -> JS (esbuild), and one from JS -> Emscripten-JS. Emscripten doesn't provide a way to generate sourcemaps for the generated JS code, as it just copy-pastes our post-js as is. In theory we should be able to use our TS -> JS source map since our source code wasn't modified, but that's not the case because the line numbers have shifted. As such, we have to run a post-processing step to "relocate" the sourcemap line number information. This ends up working pretty well, but was not fun to figure out.
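
To make the relocation step concrete, here is a minimal sketch of one way to shift a source map's generated line numbers. It is not the script used in this PR; the `SourceMapV3` interface and the `shiftGeneratedLines` name are made up for illustration. It relies on the Source Map v3 encoding, where `;` separates generated lines in `mappings` and only the generated column resets on each line, so prepending empty lines moves every mapping down without touching the original source positions.

```typescript
// Minimal shape of a Source Map v3 object; only the commonly used fields.
interface SourceMapV3 {
  version: 3;
  file?: string;
  sources: string[];
  sourcesContent?: (string | null)[];
  names: string[];
  mappings: string;
}

// Return a copy of `map` whose mappings start `lineOffset` lines later in
// the generated file. Each ";" in `mappings` stands for one generated line,
// so prepending N of them shifts every existing mapping down by N lines.
function shiftGeneratedLines(map: SourceMapV3, lineOffset: number): SourceMapV3 {
  if (!Number.isInteger(lineOffset) || lineOffset < 0) {
    throw new RangeError("lineOffset must be a non-negative integer");
  }
  return { ...map, mappings: ";".repeat(lineOffset) + map.mappings };
}
```

In this sketch, `lineOffset` would be the number of lines that precede the pasted code in the final file; how that number is obtained is left open here.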

The above is no longer true, since I've rewritten the way the higher-level API interacts with the WASM module. Instead of gluing everything together inside the module with pre-js and post-js hacks, we now consume the WASM module like a normal module and bundle it with the final JS code. This is way easier, and doesn't require hacks to the sourcemaps.

Debug Info

For debug information, we build the wasm files with features such as assertions and debug symbols. We store these in the debug folder, and optionally allow a user to use them with a sub-import, web-tree-sitter/debug. Effectively, tacking on /debug to your import will let you automatically use the debug build to see what's going on.

In GH releases, these debug files would show up as tree-sitter-debug.{js|wasm}(.map)?.

TypeScript, CommonJS, and ESM

Rewriting the library in TypeScript also helped me improve the public type definitions, as we can autogenerate them now, and writing the tests in TypeScript was just a much smoother process overall. Hopefully others feel the same way as they peruse and navigate the codebase (and potentially contribute back, as I've added a nice little contributing.md file).

Some notes for readers: we now actually support both CommonJS and ESM, by invoking emcc with EXPORT_ES6=1, and esbuild with the format set to cjs for one bundle and esm for the other.
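
As a rough sketch of what the two esbuild configurations could look like: the option names below (`entryPoints`, `bundle`, `format`, `sourcemap`, `outfile`) are real esbuild build options, but the `BundleOptions` interface is declared locally so the example stands alone, and the entry point and output file names are assumptions rather than the paths this PR uses.

```typescript
// Subset of esbuild's build options, declared locally for this sketch.
interface BundleOptions {
  entryPoints: string[];
  bundle: boolean;
  format: "cjs" | "esm";
  sourcemap: boolean;
  outfile: string;
}

// Build the options for one output format.
// Hypothetical paths: the real entry point and output names may differ.
function bundleOptionsFor(format: "cjs" | "esm"): BundleOptions {
  return {
    entryPoints: ["src/index.ts"],
    bundle: true, // inline all imported TypeScript files into one output file
    format,
    sourcemap: true, // emit a .map file next to the bundle
    outfile: format === "esm" ? "tree-sitter.js" : "tree-sitter.cjs",
  };
}

// One bundle per module format.
const allBundles: BundleOptions[] = [bundleOptionsFor("esm"), bundleOptionsFor("cjs")];
```

Each object in `allBundles` could then be passed to esbuild's `build()` function from a build script.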

The default JS file is now an ES6 module, and will be the file that's uploaded to GH releases. For users using the NPM registry, this is automatically taken care of for you with conditional exports (including for the debug sub-export).
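
A simplified model of how conditional exports pick a file for each import style may help here. This only illustrates the lookup; it is not Node's actual resolution algorithm, and the file paths in `exportsMap` are assumptions, not the contents of the published package.json.

```typescript
type Condition = "import" | "require";
type ExportsMap = Record<string, Record<Condition, string>>;

// Hypothetical layout: one entry for the default build, one for the
// "debug" sub-export, each with an ESM and a CommonJS target.
const exportsMap: ExportsMap = {
  ".": { import: "./tree-sitter.js", require: "./tree-sitter.cjs" },
  "./debug": { import: "./debug/tree-sitter.js", require: "./debug/tree-sitter.cjs" },
};

// Map a specifier such as "web-tree-sitter/debug" plus the loader's
// condition ("import" for ESM, "require" for CommonJS) to a file path.
function resolveEntry(
  specifier: string,
  condition: Condition,
  packageName: string = "web-tree-sitter",
): string {
  if (specifier !== packageName && !specifier.startsWith(packageName + "/")) {
    throw new Error("specifier does not belong to " + packageName);
  }
  const subpath =
    specifier === packageName ? "." : "./" + specifier.slice(packageName.length + 1);
  const targets = exportsMap[subpath];
  if (targets === undefined) {
    throw new Error("subpath is not exported: " + subpath);
  }
  return targets[condition];
}
```

Under these assumptions, an ESM `import` of `web-tree-sitter/debug` and a CommonJS `require` of the same specifier resolve to different files, without the user choosing between them.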

I'll let this sit for a bit so people can try it out locally.

cc @verhovsky @wenkokke I think you'd both be interested in these changes (sorry for the ping if that's no longer true)

@amaanq amaanq force-pushed the web-ts-ts branch 5 times, most recently from ebafdcf to a954950 Compare January 16, 2025 07:36
@verhovsky
Contributor

verhovsky commented Jan 16, 2025

we have to run a post-processing step to "relocate" the sourcemap line number information. This ends up working pretty well, but was not fun to figure out.

my condolences lol. If there's an easy way to do it, it would be nice to sanity check the source map in a post processing lint step. Something like after it's generated do a jump to definition (if there's a way to do a jump to definition without spinning up an entire IDE) on some function and check the line you jump to actually contains that function name.
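
One way such a check could look without an editor in the loop is sketched below. It swaps jump-to-definition for decoding the source map directly: it decodes the base64 VLQ `mappings`, finds the first generated line that contains a symbol, and checks that some mapping on that line points at a source line containing the same symbol. All function names here are made up for illustration, and the "first line containing the symbol" heuristic is an assumption that would need tightening for real code.

```typescript
const BASE64_CHARS =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decode one mapping segment (a run of base64 VLQ values).
function decodeVlq(segment: string): number[] {
  const values: number[] = [];
  let value = 0;
  let shift = 0;
  for (let i = 0; i < segment.length; i++) {
    const digit = BASE64_CHARS.indexOf(segment.charAt(i));
    if (digit < 0) throw new Error("invalid base64 VLQ character: " + segment.charAt(i));
    value |= (digit & 31) << shift;
    if ((digit & 32) !== 0) {
      shift += 5; // continuation bit set: more digits follow
    } else {
      // The lowest bit is the sign, the remaining bits are the magnitude.
      values.push((value & 1) === 1 ? -(value >>> 1) : value >>> 1);
      value = 0;
      shift = 0;
    }
  }
  return values;
}

interface DecodedMapping {
  generatedLine: number; // 0-based line in the generated file
  sourceIndex: number; // index into `sources` / `sourcesContent`
  sourceLine: number; // 0-based line in the original source
}

// Turn the relative `mappings` string into absolute line positions.
function decodeMappings(mappings: string): DecodedMapping[] {
  const decoded: DecodedMapping[] = [];
  let sourceIndex = 0;
  let sourceLine = 0;
  const lines = mappings.split(";");
  for (let generatedLine = 0; generatedLine < lines.length; generatedLine++) {
    for (const segment of lines[generatedLine].split(",")) {
      if (segment === "") continue;
      const fields = decodeVlq(segment);
      if (fields.length < 4) continue; // segment carries no source position
      sourceIndex += fields[1];
      sourceLine += fields[2];
      decoded.push({ generatedLine, sourceIndex, sourceLine });
    }
  }
  return decoded;
}

// True if the first generated line containing `symbol` maps to a source
// line that also contains `symbol`.
function symbolMapsToSource(
  generatedCode: string,
  mappings: string,
  sourcesContent: string[],
  symbol: string,
): boolean {
  const generatedLine = generatedCode.split("\n").findIndex((line) => line.includes(symbol));
  if (generatedLine < 0) return false;
  return decodeMappings(mappings).some((m) => {
    if (m.generatedLine !== generatedLine) return false;
    const source = sourcesContent[m.sourceIndex];
    if (source === undefined) return false;
    const sourceLineText = source.split("\n")[m.sourceLine];
    return sourceLineText !== undefined && sourceLineText.includes(symbol);
  });
}
```

A lint step built on this idea would read the bundled JS and its `.map` file, then call `symbolMapsToSource` for a handful of known function names.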

@WillLillis
Member

If there's an easy way to do it, it would be nice to sanity check the source map in a post processing lint step. Something like after it's generated do a jump to definition (if there's a way to do a jump to definition without spinning up an entire IDE) on some function and check the line you jump to actually contains that function name.

This could be done with Neovim running in headless mode with a few autocommands in its init.lua. It's a little hacky, but I'm using a similar approach to write an LSP integration test library currently. :)

Contributor

@maxbrunsfeld maxbrunsfeld left a comment

This seems like a good change.

I worry a little bit about the +- count here: the codebase has grown by more than 5000 lines, and I can't tell exactly why.

I left a couple of questions.

@amaanq
Member Author

amaanq commented Jan 17, 2025

I worry a little bit about the +- count here: the codebase has grown by more than 5000 lines, and I can't tell exactly why.

A lot of that is from package-lock.json (about 4600), and from the tests being moved over to ts files using vitest (hence the large minus diff as well, but overall not a large gain)

@maxbrunsfeld
Contributor

Ah ok, makes sense about the package lock making up a large portion of the added line count.

@amaanq amaanq force-pushed the web-ts-ts branch 8 times, most recently from 4d4eec7 to a3f10c0 Compare January 19, 2025 21:08
@amaanq amaanq force-pushed the web-ts-ts branch 9 times, most recently from 3533451 to 1a936ca Compare January 20, 2025 19:23
@savetheclocktower
Contributor

OK, the CJS build works great for me without any further tweaking.

I'm curious to see how the imports will work once the module is published, but all that is done via CI right now, so I'll have to wait and see.

@amaanq amaanq force-pushed the web-ts-ts branch 4 times, most recently from 8862d34 to 81d2d32 Compare January 21, 2025 06:28
@amaanq amaanq merged commit 79244b5 into tree-sitter:master Jan 21, 2025
20 checks passed
Contributor

@maxbrunsfeld maxbrunsfeld left a comment

Is web-tree-sitter.d.ts a generated file now?

@amaanq
Member Author

amaanq commented Jan 21, 2025

Yes, we use dts-buddy to autogenerate it from our TypeScript files - surprisingly, the TypeScript compiler has no built-in way to easily generate public types.

This can be updated with npm run build:dts

Development

Successfully merging this pull request may close these issues.

Create flag for installing web-tree-sitter built with --debug
