Fix multi-resolution rendering: key+value sort, sub-frame split, image-relative fade#243
Merged
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR reworks the WebGPU splat renderer to produce correct, resolution-consistent output across large images (multi-resolution correctness), addressing prior key-overflow corruption, tile-edge truncation artifacts, and absolute-pixel fade thresholds.
Changes:
- Replaces the old tile-stream/BVH-driven group pipeline with a new raster pass that globally depth-sorts once, then renders via sub-frame partitioning.
- Updates the GPU rasterizer to a fully GPU-resident tile-binning pipeline (coverage → prefix-sum → emit-pairs → indirect radix sort (key+value) → boundary finding → binned raster).
- Adds golden-image regression tests + a regenerator script, and adds a
pretesthook to ensuredist/exists for tests.
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/render-golden.test.mjs | Adds byte-exact golden WebP renderer regression tests driven by shared cases. |
| test/render-golden.regenerate.mjs | Adds a script to regenerate checked-in golden WebP outputs from current CLI behavior. |
| test/render-golden.cases.mjs | Centralizes golden test case definitions shared by test + regenerator. |
| src/lib/render/tile-stream.ts | Removes the prior BVH + per-tile-group pipelined renderer orchestrator. |
| src/lib/render/raster-pass.ts | Adds the new multi-resolution raster orchestrator (global sort + sub-frame split + stitching). |
| src/lib/render/index.ts | Switches renderSplats export to the new raster pass implementation. |
| src/lib/render/config.ts | Centralizes render constants (sigma cutoff, fade thresholds, tile size, pair-buffer budget, etc.). |
| src/lib/render/bvh-query.ts | Removes BVH frustum-plane query helper (no longer used). |
| src/lib/gpu/index.ts | Stops exporting TILE_SIZE from GPU module (now sourced from render config). |
| src/lib/gpu/gpu-splat-rasterizer.ts | Major rewrite: adds GPU tile-binning pipeline, indirect dispatch handling, key+value radix sorting, and binned rasterization. |
| src/lib/data-table/gaussian-aabb.ts | Uses SIGMA_CUTOFF from centralized render config to keep extents consistent with raster radius. |
| package.json | Adds pretest to build before running tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 14 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/lib/gpu/gpu-splat-rasterizer.ts:8
BUFFERUSAGE_INDIRECTis imported but not used in this file. Please remove the unused import to keep the module clean and avoid introducing new lint noise.
import {
BUFFERUSAGE_COPY_DST,
BUFFERUSAGE_COPY_SRC,
BUFFERUSAGE_INDIRECT,
SHADERLANGUAGE_WGSL,
SHADERSTAGE_COMPUTE,
UNIFORMTYPE_FLOAT,
UNIFORMTYPE_UINT,
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Renderer changes to make output correct and consistent across all resolutions. The pre-fix code worked at 1080p but silently broke at 1920×1200, 1440p, 4K, and 8K — and even where it produced output, the same world-space scene rendered very differently at different resolutions.
What was broken
(tileIdx << 19) | splatIdxinto a single u32. At any image with more than 8192 total tiles (≥ 1920×1200),tileIdx << 19overflowed u32 and aliased tile indices — silent corruption with no error.RADIUS_FADE_START_PX = 1024/RADIUS_FADE_END_PX = 2048were absolute pixel values. World-space splats whose 1080p pixel radius was 256-512 px (well below 1080p's 1024-px fade-start) projected to 1024-2048 px at 8K and were silently faded out. Result: large patches of grass, sky, and foreground content present at 1080p went missing at 8K.What changed
ComputeRadixSort.sortIndirect's key+value mode (initialValuesparam).tileKeysBufferholds tile indices,splatValuesBufferholds splat indices; both get full 32-bit range. DeletesSPLAT_IDX_BITS/SPLAT_IDX_MASKfrom config and shrinks sort passes at 1080p from 8 to ~4 (numBits auto-derived fromnumTiles).maxCoveragePerSplatis set to the sub-frame's full tile area (rounded up to a power of two), which geometrically bounds every splat's bbox-in-sub-frame — so the project shader's coverage clamp never bites and emit-pairs writes every splat's full bbox. No hard tile edges anywhere.RADIUS_FADE_START_PX/RADIUS_FADE_END_PXwithRADIUS_FADE_START_FRAC/RADIUS_FADE_END_FRAC. The project shader computes the per-render threshold asfrac × imageHeight, so the SAME world-space splats fade at every render resolution. After this fix the 8K-downsampled-to-1080p image matches 1080p direct at SSIM 0.93.groupTilesX × groupTilesY × TILE_SIZE²instead ofgroupSizeTiles²(the bounding-square of the larger axis). Saves ~25 MB at 1080p, more at non-square aspect ratios and at higher resolutions.numSlots/Slot[]/ per-slot shader names removed; rasterizer holds a singlePipelineBuffers. The "group" abstraction is retained — it's the hook the sub-frame split uses (beginGroup/dispatchChunk/finishGroupwith a group origin and tile rectangle, the project shader's group-AABB cull, and group-pixel-origin uniforms).AA_DILATION_COV,SIGMA_CUTOFF,DISCRIMINANT_FLOOR,OPACITY_CAP,MIN_ALPHA,MIN_TRANSMITTANCE,JACOBIAN_LIMIT_FACTOR,TILE_SIZE, the radius-fade fractions, the pair-buffer memory budget, and theMAX_COVERAGE_PER_SPLAT_AT_1080Pbaseline now all live insrc/lib/render/config.tswith documentation.gaussian-aabb.tsreadsSIGMA_CUTOFFfrom there too so the BVH extent and the projected splat radius stay in sync.test/render-golden.test.mjswithtiny(320×240) andmid(640×360) fixtures shared between the runner and the regenerator viatest/render-golden.cases.mjs. Byte-exact compares against checked-in webp goldens.Verification
npm run buildclean, no TypeScript warnings.node --test test/render-golden.test.mjs— both fixtures pass byte-exact (tinyandmid).npm run lint— 22 errors, all pre-existing onmain; no new errors introduced.AA_DILATION_COV/DISCRIMINANT_FLOORstill being in absolute pixel units, which is reference-3DGS behaviour).Performance and memory
chunkCapis forced down bymaxStorageBufferBindingSizeoncemaxCovcovers the whole 1080p sub-frame area (16 384 tiles). Peak GPU memory unchanged at 2.20 GB.Caveats / known follow-ups
AA_DILATION_COVandDISCRIMINANT_FLOORare still absolute pixel units. Making them resolution-invariant would close the residual 7 % SSAA gap but is an "exceed reference 3DGS" decision and a separate change.chunkCap; current scan is the placeholder from the rework plan's Open Question 1. Two-level Blelloch is the natural follow-up ifchunkCapever grows past ~1 M.seam-Nregression fixture. The rework plan called for a synthetic scene engineered to span the original 128-px tile-group boundaries; not added. The currenttiny+midfixtures plus byte-exact passing under forced sub-frame split (verified locally with 7×5 split — still byte-exact) cover the boundary case adequately for now.FAR_PLANE_NEAR_FACTORis unused after the BVH removal; left in config for now.Full reasoning trail and design alternatives considered are in the saved review at
~/.claude/plans/review-the-changes-committed-mutable-tiger.md.