Synthesize boundary space when accept would glue the suggestion onto the user's last word#348
Conversation
…nto the user's last word PR #303 moved the word-boundary decision to accept-time but only added the "drop a leading space the field already supplies" half. The inverse case — when the chunk arrives without a leading space (model omitted it, or the normalizer stripped it against a stale prefix snapshot) and the field also doesn't end in whitespace — fell through and accept rendered "Hello"+"World" as "HelloWorld" (issue #324). `SuggestionSessionReconciler.insertionChunk` now prepends a single space when the live preceding text ends in a word character and the chunk starts in a word character. The strip branch's `drop(while:)` predicate also moves to the same horizontal-whitespace definition the guard already uses, so a chunk that legitimately starts with a newline is no longer silently dropped when the field happens to end in a space or tab. Session accounting: rule 1 still advances by the untrimmed acceptedChunk; rule 2 types one character the session doesn't account for, so post-insertion reconciliation falls into the existing tolerate window. A user who then types the next suggestion char will fall out of tolerate and invalidate the session — acceptable, since the next prediction picks up from the corrected field state without any visible glue.
| func test_insertionChunk_addsBoundarySpaceWhenChunkAndFieldWouldGlue() { | ||
| // The reported "no space" regression: the chunk has no leading whitespace (model omitted it, | ||
| // or the normalizer stripped it against a stale prefix snapshot) and the field doesn't have | ||
| // a trailing one either, so we synthesize the word boundary at insert time. | ||
| XCTAssertEqual( | ||
| SuggestionSessionReconciler.insertionChunk(forAcceptedChunk: "World", precedingText: "Hello"), | ||
| " World" | ||
| ) | ||
| } |
There was a problem hiding this comment.
Missing end-to-end tolerate-path test after Rule 2 fires
The docstring explains that Rule 2 types one character the session doesn't account for, so the post-insertion reconciler enters a tolerate loop that persists until the session is explicitly invalidated. Because reconcileConsumedSuggestionText returns early (tolerate) before reaching the sentinel-clearing block at lines 96–101, pendingInsertionConsumedCount is never set to nil, and isAwaitingInsertedTextSync stays true for every subsequent reconcile cycle. There is no test that exercises this: accept a chunk via Rule 2, then calls reconcile to confirm the session remains .valid, then accepts a second chunk and reconciles again. Without it, the "session stays alive for follow-up word-by-word accepts" guarantee is purely documented, not verified.
| func test_insertionChunk_addsBoundarySpaceWhenChunkAndFieldWouldGlue() { | ||
| // The reported "no space" regression: the chunk has no leading whitespace (model omitted it, | ||
| // or the normalizer stripped it against a stale prefix snapshot) and the field doesn't have | ||
| // a trailing one either, so we synthesize the word boundary at insert time. | ||
| XCTAssertEqual( | ||
| SuggestionSessionReconciler.insertionChunk(forAcceptedChunk: "World", precedingText: "Hello"), | ||
| " World" | ||
| ) | ||
| } |
There was a problem hiding this comment.
No test for the partial-word-completion false-positive
The PR description explicitly calls out that "wor" + "ld" inserts as "wor ld" rather than "world" because Rule 2 fires on any word-char/word-char boundary. This tradeoff is documented in the docstring, but there is no test that confirms this exact scenario. Adding one (expected: " ld") would lock in the decision and serve as a reminder that this case is intentional rather than an oversight if someone revisits the heuristic later.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
PR #303 added an accept-time word-boundary reconciler that drops a chunk's leading whitespace when the live preceding text already ends in whitespace, fixing double-space-on-accept. It only covered the strip half: when the chunk arrived without a leading space (model omitted one, or the normalizer stripped it against a stale prefix snapshot) and the field didn't end in whitespace either, accept rendered
"Hello"+"World"as"HelloWorld"(issue #324). The strip branch'sdrop(while:)predicate also usedCharacter.isWhitespacewhile the guard usedCharacterSet.whitespaces— Greptile flagged this on the original PR — so a chunk that legitimately started with a newline could be silently dropped.This adds the inverse rule and tightens the drop predicate in
SuggestionSessionReconciler.insertionChunk:drop(while:)now mirrors the guard's horizontal-whitespace definition, so a chunk whose first character is a newline survives even when the field ends in a space or tab.Session accounting: rule 1 still advances by the untrimmed
acceptedChunk(the whitespace we skip typing is the field's own). Rule 2 types one character the session doesn't account for —fullTexthas no leading space to consume — so post-insertion reconciliation falls into the existing tolerate path and the session stays alive for follow-up word-by-word accepts. A user who then types a character that would have typed-matched the next suggestion char will fall out of tolerate and invalidate the session, which is acceptable: the next prediction picks up from the corrected field state without any visible glue.Validation
Eight new unit tests cover the add-space path and the exclusions: basic
"Hello"+"World"→" World", lowercase new word ("the"+"world"), digit/word boundary ("123"+"abc","Hello"+"1st"), punctuation-leading chunks (.,'s,, more), opening punctuation in the prefix ("Hello ("), newline boundary in the prefix, empty preceding text, and the strip-but-not-newline case ("first "+"\nnext"). The original seveninsertionChunktests from PR #303 still pass unchanged. App-hostedtestonly runs withCODE_SIGNING_ALLOWED=NOhere (local Team ID mismatch, perCLAUDE.md).Linked issues
Fixes #324
Risk / rollout notes
"wor"+"ld"will insert as"wor ld"rather than"world". Cotabby's prompt biases the model toward multi-word continuations rather than single-word completions, so the "new word" interpretation matches what users actually see; this tradeoff is documented in the helper's docstring.SuggestionSessionReconciler.reconcile, so single-Tab accepts and Tab-then-Tab sequences behave normally. A user who types a character matching the next expected suggestion char immediately after a space-added Tab will fall out of tolerate and the session will be invalidated; a fresh prediction takes over without visible glue.\nafter a space-or-tab field tail is now preserved instead of silently dropped (Greptile's review comment on Fix double space on accept by reconciling the word boundary at insert time #303).Greptile Summary
This PR fixes the inverse of the double-space bug addressed in #303: when neither the field nor the suggestion chunk carries a word boundary, acceptance was silently gluing adjacent words (
"Hello"+"World"→"HelloWorld"). It also tightens the existing strip predicate so a chunk whose first character is a newline is no longer silently dropped when the field ends in a space or tab.insertionChunk: ifprecedingTextends in a word character (isLetter || isNumber) andchunkstarts in a word character, a single space is prepended before insertion. The session advances by the original untrimmed chunk, relying on the existing tolerate window to absorb the extra injected character.drop(while:)closure now usesallSatisfy(CharacterSet.whitespaces.contains)instead ofCharacter.isWhitespace, aligning it with the horizontal-onlyCharacterSet.whitespacesguard so newline-leading chunks survive Rule 1.Confidence Score: 4/5
Safe to merge; the fix is well-scoped to a single pure function with no effect on the session model, coordinator, or generation-time normalization.
The two rules in
insertionChunkare mutually exclusive, logically correct, and covered by 15 targeted unit tests. The strip-predicate change is a clean correctness fix. The main unverified path is the reconciler's tolerate-loop behavior after Rule 2 fires (a follow-up Tab-then-Tab accept sequence), which is documented in prose but has no end-to-end test. The acknowledged partial-completion tradeoff also lacks a pinning test. Neither gap can produce incorrect output under normal use, but both are worth closing before the pattern is extended.The tolerate-path integration in
SuggestionSessionReconciler.reconciledeserves a second look alongside the new Rule 2 path — specifically the sentinel-clearing block at lines 96–101, which is bypassed wheneverreconcileConsumedSuggestionTextreturns early.Important Files Changed
insertionChunkto add Rule 2 (synthesize a boundary space when both the field and chunk end/start with word characters), and fixes the Rule 1 strip predicate to useCharacterSet.whitespaces(allSatisfy) instead ofisWhitespace, preserving newlines correctly.insertionChunktests pass unchanged.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["insertionChunk(chunk, precedingText)"] --> B{"precedingText last scalar\nin CharacterSet.whitespaces?"} B -- Yes --> C["drop(while: allSatisfy\nCharacterSet.whitespaces)"] C --> D["Return stripped chunk\n(newline-leading chunks preserved)"] B -- No --> E{"chunk.first\nisAcceptanceWordCharacter?"} E -- No --> F["Return chunk unchanged\n(punctuation-leading chunk hugs prior word)"] E -- Yes --> G{"precedingText.last\nisAcceptanceWordCharacter?"} G -- No --> F G -- Yes --> H["Return ' ' + chunk\n(Rule 2: synthesize boundary space)"]Reviews (1): Last reviewed commit: "Synthesize the boundary space when accep..." | Re-trigger Greptile