Skip to content

Fix infinite hang on degenerate splats: O(N²) KD-tree build + decimate stall guard#272

Merged
slimbuck merged 3 commits into
playcanvas:mainfrom
slimbuck:kdtree-dev
Jun 29, 2026
Merged

Fix infinite hang on degenerate splats: O(N²) KD-tree build + decimate stall guard#272
slimbuck merged 3 commits into
playcanvas:mainfrom
slimbuck:kdtree-dev

Conversation

@slimbuck

@slimbuck slimbuck commented Jun 29, 2026

Copy link
Copy Markdown
Member

Problem

splat-transform --decimate hangs indefinitely on certain inputs, wedging the production GPU pipeline until its 2h timeout. Reproduced on a 9.84M-gaussian PLY where nearly every gaussian is coincident at the origin (degenerate/uninitialized capture).

There are two independent failures, both triggered by the same degenerate input:

1. O(N²) KD-tree build hang

nthElement in src/lib/spatial/kd-tree.ts used a 2-way Lomuto partition that only moves elements strictly less than the pivot. When a coordinate is all-equal, nothing moves and the selection range shrinks by one per pass — each pass O(N) — so the KD-tree build degrades from O(N log N) to O(N²). At ~10M coincident points that never completes. The stall is on the CPU before any GPU work dispatches, which is why it reproduced on both WebGPU/Vulkan and Metal.

Fix: 3-way (Dutch National Flag) partition. Equal elements collapse in a single pass, restoring O(N log N). For distinct coordinates the equal block is one element, so the median index and resulting tree are identical to before — no behavioural change for normal splats, no memory impact (in-place).

2. Decimate makes no progress on degenerate input

With every point coincident, all KNN distances are 0 and the traversal returns the same ~K shared neighbour indices for every query, so the directed-edge graph funnels onto a handful of targets. The greedy disjoint-pair matching can then only merge ~K pairs per pass (17, in the repro) instead of ~N/2, so it crawls toward the target across thousands of passes. The existing degenerate guards only fire on zero progress (edgeCount === 0, pairs.length === 0), so a tiny-but-nonzero matching slips through.

Fix: a no-progress guard. When a pass's matching is exhausted before reaching the target (pairs.length < mergesNeeded) yet removes less than MIN_ITERATION_PROGRESS (5%) of the current splats, throw a clear error rather than grind. A well-formed scene removes ~50% per exhausted pass, so legitimate multi-pass decimation never trips it; it's gated on matching exhaustion, so single-pass decimation can't trip it either.

Testing

  • New test/kd-tree.test.mjs: builds a tree over 100k coincident points (would hang pre-fix) in milliseconds; brute-force KNN correctness on sets with frequent ties.
  • New test/decimate.test.mjs case: a fully-coincident scene now throws the stall error instead of grinding.
  • Full suite green (566 tests) incl. render-golden — no tie-break or behavioural shift on valid scenes. Lint clean.
  • End-to-end on the original 9.84M PLY: previously hung 30+ min (136+ passes, never finishing); now fails in ~10s with a clear message and non-zero exit.

@slimbuck slimbuck requested a review from Copilot June 29, 2026 09:24
@slimbuck slimbuck self-assigned this Jun 29, 2026
@slimbuck slimbuck added the bug Something isn't working label Jun 29, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a pathological KD-tree build-time degradation triggered by highly degenerate point sets (many/all points sharing coordinates) by replacing the nthElement 2-way partition with a 3-way (Dutch National Flag) partition, preventing O(N²) behavior and the observed “Finding nearest neighbors” hang. Adds targeted regression tests to ensure KD-tree construction and KNN queries remain correct and terminate quickly under heavy duplication/ties.

Changes:

  • Update nthElement in src/lib/spatial/kd-tree.ts to use an in-place 3-way partition to collapse equal elements in one pass.
  • Add test/kd-tree.test.mjs covering the prior hang case (large all-identical set) and correctness on duplicate/tied coordinate sets.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
test/kd-tree.test.mjs Adds regression + correctness tests for KD-tree behavior on duplicate/tied coordinates, including the prior hang scenario.
src/lib/spatial/kd-tree.ts Replaces 2-way partition with 3-way partition in nthElement to avoid O(N²) degeneration on all-equal ranges.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@slimbuck slimbuck changed the title Fix O(N²) KD-tree build hang on splats with duplicate coordinates Fix infinite hang on degenerate splats: O(N²) KD-tree build + decimate stall guard Jun 29, 2026
@slimbuck slimbuck requested a review from Copilot June 29, 2026 10:17

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread test/kd-tree.test.mjs Outdated
@slimbuck slimbuck marked this pull request as ready for review June 29, 2026 10:32
@slimbuck slimbuck requested a review from a team June 29, 2026 10:32
@slimbuck slimbuck merged commit e273e02 into playcanvas:main Jun 29, 2026
3 checks passed
@slimbuck slimbuck deleted the kdtree-dev branch June 29, 2026 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants