Tolerate trailing/duplicate whitespace in PLY headers#256
Merged
Conversation
parseHeader tokenized header lines with split(' ') and required element/property
lines to have exactly 3 tokens, so a space-padded vertex count (e.g.
'element vertex 258951032 ', as written by some exporters) was rejected with
'invalid ply header'. Trim and split on whitespace instead, and split lines on
/\r?\n/ so CRLF headers are also accepted. Clean headers parse unchanged.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR aims to make the PLY header parser more tolerant of non-canonical formatting, specifically duplicate/trailing whitespace in header lines and CRLF (\r\n) line endings, so that PLYs written by a wider range of tools load successfully.
Changes:
- Update header line splitting to use
\r?\ninstead of\n. - Update per-line tokenization to
trim().split(/\s+/)to ignore trailing/duplicate whitespace.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…es, add test - Drop the CRLF line-split change (readPly's magic/end_header scan is LF-only, so it was incomplete); revert to splitting on '\n'. Scope is now purely whitespace tolerance. - Derive each header line from its trimmed form: skip whitespace-only lines and read comment text from the trimmed line so leading whitespace no longer truncates it. - Add a formats test covering a space-padded vertex count, duplicate spaces, a whitespace-only line, and a leading-space comment.
Decode the whole encoded buffer and assert the end_header terminator is found, instead of searching only the first 4096 bytes (which would silently break if the header ever exceeded that).
slimbuck
approved these changes
Jun 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
parseHeadertokenized each header line withsplit(' ')and requiredelement/propertylines to have exactly 3 tokens. A space-padded count — e.g.element vertex 258951032(written by some exporters / large-PLY tools) — or duplicate spaces between tokens produced extra empty tokens, pushing the token count past 3, so the header was rejected withinvalid ply header.Fix:
strings[i].split(' ').filter(Boolean), dropping the empty tokens produced by trailing or duplicate spaces.That's the whole change — one line.
Scope (deliberately minimal):
This tolerates only extra spaces between or after tokens. It does not add support for:
comment)Those aren't valid per the PLY spec — every header line is defined as "a carriage-return terminated ASCII string that begins with a keyword" (spec) — so malformed lines like those still throw, as before. Clean headers parse exactly as they did (no behavior change).
Testing:
formats.test.mjscase covering a space-padded vertex count and duplicate spaces between tokens.