Parallelize SOG writing with a cross-platform worker pool #262
Merged
Pull request overview
This PR introduces a cross-platform worker-thread pool to parallelize the two CPU-heavy steps in SOG writing (1D quantization and lossless WebP encoding), improving SOG output performance while keeping results consistent across inline/worker execution.
Changes:
- Add `WorkerQueue` + task protocol (`quantize1d`, `encodeWebp`) with Node/browser parity and inline fallback.
- Refactor `quantize1d` into a dependency-free core (`quantize-1d-core.ts`) plus the existing `DataTable` wrapper.
- Parallelize SOG writer texture pipelines and add CLI tuning via `--max-workers`.
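The "define tasks once, run them on either transport" idea from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `tasks.ts`: the task names `scaleToBytes` and `sum`, their payload shapes, and the helpers `runInline` and `handleTaskMessage` are all hypothetical, and `scaleToBytes` is a simple min/max rescale standing in for the real `quantize1d`.

```typescript
// Hypothetical task table: names and payload shapes are illustrative only,
// not the real quantize1d / encodeWebp protocol.
interface TaskTypes {
    scaleToBytes: { args: { values: Float32Array }; result: Uint8Array };
    sum: { args: { values: Float32Array }; result: number };
}

type TaskName = keyof TaskTypes;
type Handlers = { [K in TaskName]: (args: TaskTypes[K]['args']) => TaskTypes[K]['result'] };

// A single handler table shared by both transports, so results cannot diverge.
const handlers: Handlers = {
    scaleToBytes: ({ values }) => {
        if (values.length === 0) throw new Error('empty input');
        let min = Infinity;
        let max = -Infinity;
        for (let i = 0; i < values.length; i++) {
            min = Math.min(min, values[i]);
            max = Math.max(max, values[i]);
        }
        const range = max - min || 1; // avoid dividing by zero for constant input
        return Uint8Array.from(values, (v) => Math.round(((v - min) / range) * 255));
    },
    sum: ({ values }) => values.reduce((acc, v) => acc + v, 0)
};

// Inline transport: call the handler directly on the calling thread.
function runInline<K extends TaskName>(name: K, args: TaskTypes[K]['args']): TaskTypes[K]['result'] {
    return handlers[name](args);
}

// Message shapes a worker transport might exchange (assumed, not the real protocol).
interface TaskRequest<K extends TaskName> {
    id: number;
    name: K;
    args: TaskTypes[K]['args'];
}

type TaskResponse<K extends TaskName> =
    | { id: number; ok: true; result: TaskTypes[K]['result'] }
    | { id: number; ok: false; error: string };

// Worker-side dispatch goes through the same table and reports failures as data.
function handleTaskMessage<K extends TaskName>(req: TaskRequest<K>): TaskResponse<K> {
    try {
        return { id: req.id, ok: true, result: runInline(req.name, req.args) };
    } catch (err) {
        return { id: req.id, ok: false, error: err instanceof Error ? err.message : String(err) };
    }
}
```

Because both paths end in the same handler function, an inline call and a message-driven call with the same arguments produce the same bytes, which is the property the PR relies on for consistent output.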
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/worker-queue.test.mjs | Adds tests for inline-mode behavior and task correctness/failure paths. |
| src/lib/writers/write-sog.ts | Runs texture pipelines concurrently; routes WebP encode/quantize through worker tasks; attempts to serialize zip writes. |
| src/lib/workers/worker-queue.ts | Implements the worker pool, dispatch, spawning, inline fallback, and teardown. |
| src/lib/workers/worker-entry.ts | Worker-side entrypoint: init wasm URL, run tasks, return results/errors. |
| src/lib/workers/worker-bundled.ts | Build-time flag controlling whether worker transport is available. |
| src/lib/workers/tasks.ts | Defines worker/inline task handlers and message protocol types. |
| src/lib/workers/index.ts | Typed wrappers (runQuantize1d, runEncodeWebp) over WorkerQueue.run. |
| src/lib/utils/webp-codec.ts | Adds resolveWasmUrl() for passing wasm location into workers. |
| src/lib/spatial/quantize-1d.ts | Becomes a thin DataTable wrapper over quantize1dColumns. |
| src/lib/spatial/quantize-1d-core.ts | New dependency-free quantization implementation for worker bundling. |
| src/lib/index.ts | Exports WorkerQueue from the public library entrypoint. |
| src/cli/index.ts | Adds --max-workers to configure worker pool size (including 0 for inline). |
| rollup.config.mjs | Adds worker build (dist/worker.mjs) and build-time flag flipping; updates externals. |
| package.json | Exports ./worker entry to ship dist/worker.mjs. |
| AGENTS.md | Adds contributor/agent guidelines and architecture/build notes. |
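The rows for `worker-bundled.ts`, `worker-queue.ts`, and the CLI describe three conditions that decide whether work goes to a worker or runs inline. A minimal sketch of that decision, assuming hypothetical names throughout (`TransportInputs`, `chooseTransport`, and the field names are not taken from the PR), might look like this:

```typescript
type Transport = 'worker' | 'inline';

// All field names are assumptions made for illustration.
interface TransportInputs {
    workerBundled: boolean;    // build-time flag: was a worker bundle produced?
    workersSupported: boolean; // runtime check: does this environment offer worker threads?
    maxWorkers: number;        // requested pool size; 0 requests inline execution
}

function chooseTransport(inputs: TransportInputs): Transport {
    // Running from source (no bundled worker) falls back to inline.
    if (!inputs.workerBundled) return 'inline';
    // Environments without worker support fall back to inline.
    if (!inputs.workersSupported) return 'inline';
    // A pool size of 0 explicitly requests inline/serial execution.
    if (inputs.maxWorkers <= 0) return 'inline';
    return 'worker';
}
```

Keeping the decision in one pure function makes the fallback paths easy to cover in tests without spawning real workers.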
Speeds up SOG output by moving the two CPU-heavy steps, `quantize1d` and lossless WebP encoding, onto a worker-thread pool, so the independent texture pipelines (means, quats, scales, colors, SH) compress in parallel instead of serially on the main thread. On large scenes this roughly halves writer wall-clock (a 2.2M-splat scene drops from ~6.2s to ~2.9s; mcmc ~2.1s → ~1.0s) with byte-identical output.

Worker queue. Adds a small general-purpose `WorkerQueue` (`src/lib/workers/`) that runs named tasks on worker threads and behaves identically in Node and the browser. Tasks are defined once in `tasks.ts` and run either on a worker or inline on the calling thread, so results are identical regardless of transport. The pool spawns lazily, runs one task per worker, and falls back to inline execution transparently when workers are unavailable (running from source under tsx, unsupported environments, or `--max-workers 0`). It's reusable for other CPU-heavy work later (e.g. per-texture WebP decode on the read path).

Delivery. The worker is built and shipped as `dist/worker.mjs` and exported as `@playcanvas/splat-transform/worker`. Node and bundlers that rewrite `new Worker(new URL('./worker.mjs', import.meta.url))` (Vite, webpack 5) resolve it automatically; other bundlers (e.g. plain Rollup, as SuperSplat uses) set `WorkerQueue.workerUrl` explicitly, mirroring the existing `WebPCodec.wasmUrl` pattern. The host forwards the resolved wasm URL into each worker on init, so WebP wasm resolution is unchanged.

Memory and tuning. Peak memory scales with worker count, since each worker holds its own WebP wasm heap. The pool defaults to `min(4, cores − 1)`, and the CLI adds `--max-workers <n>` to trade speed for peak memory (`0` = inline/serial, which lands back at roughly the old single-threaded memory profile).

Internal refactor. `quantize1d` is split into a dependency-free core (`quantize-1d-core.ts`, raw typed arrays) plus a thin `DataTable` wrapper, keeping the worker bundle lean (no `DataTable`/playcanvas pulled in). The public `quantize1d` signature is unchanged.

Verification. Output is byte-identical across worker counts and vs. the serial baseline (verified per zip entry on deterministic scenes); the full test suite passes, plus a new `test/worker-queue.test.mjs` covering inline-vs-worker equivalence and the fallback paths.
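
The default pool size quoted in this description, `min(4, cores − 1)`, together with the `--max-workers` override, can be expressed as a small pure function. The sketch below is an assumption about how the two combine, not the code in `worker-queue.ts`: the names `defaultPoolSize` and `resolvePoolSize` are hypothetical, clamping a single-core machine to 0 workers is a guess, and treating `--max-workers` as a direct override rather than a cap is also an assumption.

```typescript
// Default from the PR description: min(4, cores - 1).
// Clamping to 0 for single-core machines is an assumption of this sketch.
function defaultPoolSize(cores: number): number {
    return Math.max(0, Math.min(4, cores - 1));
}

// Assumed behaviour: an explicit --max-workers value replaces the default,
// and 0 means inline/serial execution.
function resolvePoolSize(cores: number, maxWorkers?: number): number {
    if (maxWorkers === undefined) return defaultPoolSize(cores);
    if (!Number.isInteger(maxWorkers) || maxWorkers < 0) {
        throw new RangeError(`max workers must be a non-negative integer, got ${maxWorkers}`);
    }
    return maxWorkers;
}
```

Since each worker holds its own WebP wasm heap, a lower value from `resolvePoolSize` trades throughput for a smaller peak memory footprint.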