Fix GPU decimation on large scenes and stop swallowing GPU failures #254
Merged
Pull request overview
This PR addresses GPU decimation failures on large scenes by restructuring GPU-side data layouts to avoid WebGPU per-binding size limits, batching edge uploads to reduce memory pressure, and making GPU failures fail-loud instead of silently producing degenerate output.
Changes:
- Split appearance (SH) coefficients into up to three ≤16-column storage buffers and merge positions+scalars into a single packed buffer for the edge-cost kernel.
- Batch edge uploads per dispatch (instead of binding the full N·k edge list) to reduce VRAM usage and remove edges as a per-binding size cap.
- Centralize GPU error escalation in the Node WebGPU device factory, and make simplifyGaussians throw when decimation cannot make progress (no edges / no finite-cost pairs).
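To make the first change concrete, here is a minimal sketch of how 48 appearance columns could be split into at most three buffers of 16 columns or fewer, with the per-buffer size that results. The helper names `splitColumns` and `chunkBytes` are hypothetical and do not come from the PR, and the 4-bytes-per-value figure assumes 32-bit float storage, which the PR text does not state.

```typescript
// Hypothetical helper (not from the PR): describes one contiguous
// run of appearance columns that would live in its own storage buffer.
interface ColumnChunk {
    start: number; // index of the first column in this chunk
    count: number; // number of columns in this chunk
}

// Split `totalColumns` into contiguous chunks of at most
// `maxColumnsPerBuffer` columns, refusing layouts that would need
// more than `maxBuffers` bindings.
function splitColumns(
    totalColumns: number,
    maxColumnsPerBuffer = 16,
    maxBuffers = 3
): ColumnChunk[] {
    const chunks: ColumnChunk[] = [];
    for (let start = 0; start < totalColumns; start += maxColumnsPerBuffer) {
        chunks.push({
            start,
            count: Math.min(maxColumnsPerBuffer, totalColumns - start)
        });
    }
    if (chunks.length > maxBuffers) {
        throw new Error(
            `${totalColumns} columns need ${chunks.length} buffers, more than ${maxBuffers}`
        );
    }
    return chunks;
}

// Byte size of one chunk. ASSUMPTION: one 32-bit float (4 bytes) per value.
function chunkBytes(numSplats: number, chunk: ColumnChunk): number {
    return numSplats * chunk.count * 4;
}

// 48 SH columns -> three chunks covering columns [0,16), [16,32), [32,48).
const shChunks = splitColumns(48);

// At 11.2M splats a single 48-column buffer would be 2,150,400,000 bytes,
// while each 16-column chunk is 716,800,000 bytes.
const perChunkBytes = chunkBytes(11_200_000, shChunks[0]);
```

Under these assumptions each binding shrinks to a third of the original size. Whether that fits still depends on the limit the adapter actually reports, so a real implementation would compare the result against `device.limits.maxStorageBufferBindingSize`.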
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/lib/gpu/gpu-edge-cost.ts | Updates WGSL + buffer/bindings to use packed geometry, chunked appearance buffers, and batch-wise edge uploads for large-scene robustness. |
| src/lib/data-table/decimate.ts | Updates GPU packing to match the new kernel layout and adds fail-loud guards when GPU decimation produces no usable work. |
| src/cli/node-device.ts | Adds centralized WebGPU uncapturederror / device-lost escalation to avoid silently continuing after GPU failures. |
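As an illustration of the escalation described for src/cli/node-device.ts, the sketch below wires both failure paths to a single fatal callback. It is not the PR's code: `escalateGpuErrors` and `DeviceLike` are hypothetical names, and `DeviceLike` is a minimal structural stand-in for the parts of `GPUDevice` used here (the `uncapturederror` event and the `lost` promise).

```typescript
// Minimal structural view of a WebGPU device (hypothetical type, so the
// sketch compiles without WebGPU type declarations).
interface DeviceLike {
    lost: Promise<{ reason?: string; message: string }>;
    addEventListener(
        type: 'uncapturederror',
        listener: (event: { error: { message: string } }) => void
    ): void;
}

// Route both GPU failure paths to one fatal handler so that no consumer
// keeps running on a broken device.
function escalateGpuErrors(
    device: DeviceLike,
    onFatal: (message: string) => void
): void {
    // Validation / out-of-memory errors not caught by an error scope.
    device.addEventListener('uncapturederror', (event) => {
        onFatal(`uncaptured GPU error: ${event.error.message}`);
    });

    // The `lost` promise resolves (it never rejects) when the device is lost.
    void device.lost.then((info) => {
        onFatal(`GPU device lost (${info.reason ?? 'unknown'}): ${info.message}`);
    });
}

// Possible fatal handler for a Node CLI (an assumption, not the PR's code):
// escalateGpuErrors(device, (msg) => {
//     console.error(msg);
//     process.exitCode = 1;
// });
```

One design point to keep in mind: `lost` also resolves with reason `'destroyed'` after an intentional `device.destroy()`, so a real handler may want to ignore that case during normal shutdown.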
Fixes two related GPU-decimation problems on large scenes, plus a fail-loud robustness change.
Problems
maxStorageBufferBindingSize) at ~11.2M splats with 48 SH columns: decimation crashed with "appearance buffer … exceeds device maxStorageBufferBindingSize".
simplifyGaussians silently returned its input. In the streamed-SOG LOD chain (each level decimated from the previous) this cascaded into several identical full-resolution levels with a clean exit code.
Changes
uncapturederror + device-lost): log the precise cause and escalate to a non-zero exit, so every GPU consumer fails loudly instead of writing degenerate output.
simplifyGaussians now throws if it cannot reach the target (zero k-NN edges or no finite-cost pairs) at any iteration, instead of returning a partial or full-resolution scene.
Verification
.splat on the EC2 instance is the end-to-end confirmation.
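For reference, the fail-loud guard described under Changes (throw on zero k-NN edges or no finite-cost pairs) could look roughly like the sketch below. The function name `assertCanProgress` and its parameters are hypothetical; the actual checks in simplifyGaussians may be structured differently.

```typescript
// Hypothetical guard (not the PR's code): throw instead of returning a
// partial or full-resolution scene when an iteration has no usable work.
function assertCanProgress(
    edgeCount: number,
    costs: ArrayLike<number>,
    iteration: number
): void {
    if (edgeCount === 0) {
        throw new Error(
            `decimation iteration ${iteration}: k-NN produced zero edges`
        );
    }

    // Count candidate pairs whose cost is neither NaN nor +/-Infinity.
    let finiteCount = 0;
    for (let i = 0; i < costs.length; i++) {
        if (Number.isFinite(costs[i])) finiteCount++;
    }

    if (finiteCount === 0) {
        throw new Error(
            `decimation iteration ${iteration}: no finite-cost pairs among ${edgeCount} edges`
        );
    }
}
```

Throwing here, rather than returning the input, is what stops a failed level from propagating down an LOD chain as a silent duplicate of the previous level.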