Skip to content

Synthesize boundary space when accept would glue the suggestion onto the user's last word#348

Merged
FuJacob merged 1 commit into
mainfrom
fix/accept-missing-space
May 28, 2026
Merged

Synthesize boundary space when accept would glue the suggestion onto the user's last word#348
FuJacob merged 1 commit into
mainfrom
fix/accept-missing-space

Conversation

@FuJacob

@FuJacob FuJacob commented May 28, 2026

Copy link
Copy Markdown
Owner

Summary

PR #303 added an accept-time word-boundary reconciler that drops a chunk's leading whitespace when the live preceding text already ends in whitespace, fixing double-space-on-accept. It only covered the strip half: when the chunk arrived without a leading space (model omitted one, or the normalizer stripped it against a stale prefix snapshot) and the field didn't end in whitespace either, accept rendered "Hello"+"World" as "HelloWorld" (issue #324). The strip branch's drop(while:) predicate also used Character.isWhitespace while the guard used CharacterSet.whitespaces — Greptile flagged this on the original PR — so a chunk that legitimately started with a newline could be silently dropped.

This adds the inverse rule and tightens the drop predicate in SuggestionSessionReconciler.insertionChunk:

  • If the live preceding text ends in a word character and the chunk starts in a word character (no leading whitespace), prepend a single space so accept never glues the suggestion onto the user's last word.
  • The strip branch's drop(while:) now mirrors the guard's horizontal-whitespace definition, so a chunk whose first character is a newline survives even when the field ends in a space or tab.

Session accounting: rule 1 still advances by the untrimmed acceptedChunk (the whitespace we skip typing is the field's own). Rule 2 types one character the session doesn't account for — fullText has no leading space to consume — so post-insertion reconciliation falls into the existing tolerate path and the session stays alive for follow-up word-by-word accepts. A user who then types a character that would have typed-matched the next suggestion char will fall out of tolerate and invalidate the session, which is acceptable: the next prediction picks up from the corrected field state without any visible glue.

Validation

xcodebuild build -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS'
# ** BUILD SUCCEEDED **
xcodebuild build-for-testing -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS'
# ** TEST BUILD SUCCEEDED **
xcodebuild test -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' \
  -only-testing:CotabbyTests/SuggestionSessionReconcilerTests \
  -only-testing:CotabbyTests/SuggestionTextNormalizerTests \
  -only-testing:CotabbyTests/SuggestionInteractionStateTests \
  CODE_SIGNING_ALLOWED=NO
# ** TEST SUCCEEDED **  43 reconciler tests + normalizer + interaction state, 0 failures
swiftlint lint --quiet Cotabby/Support/SuggestionSessionReconciler.swift CotabbyTests/SuggestionSessionReconcilerTests.swift
# exit 0

Eight new unit tests cover the add-space path and the exclusions: basic "Hello"+"World"" World", lowercase new word ("the"+"world"), digit/word boundary ("123"+"abc", "Hello"+"1st"), punctuation-leading chunks (., 's, , more), opening punctuation in the prefix ("Hello ("), newline boundary in the prefix, empty preceding text, and the strip-but-not-newline case ("first "+"\nnext"). The original seven insertionChunk tests from PR #303 still pass unchanged. App-hosted test only runs with CODE_SIGNING_ALLOWED=NO here (local Team ID mismatch, per CLAUDE.md).

Linked issues

Fixes #324

Risk / rollout notes

  • This is intentionally aggressive: rule 2 fires on any word-char/word-char boundary, which means a partial-word completion suggestion like "wor" + "ld" will insert as "wor ld" rather than "world". Cotabby's prompt biases the model toward multi-word continuations rather than single-word completions, so the "new word" interpretation matches what users actually see; this tradeoff is documented in the helper's docstring.
  • The added space rides the existing post-insertion tolerate window in SuggestionSessionReconciler.reconcile, so single-Tab accepts and Tab-then-Tab sequences behave normally. A user who types a character matching the next expected suggestion char immediately after a space-added Tab will fall out of tolerate and the session will be invalidated; a fresh prediction takes over without visible glue.
  • The strip-predicate tightening means a chunk starting with \n after a space-or-tab field tail is now preserved instead of silently dropped (Greptile's review comment on Fix double space on accept by reconciling the word boundary at insert time #303).
  • No changes to the session model, the coordinator, or generation-time normalization.

Greptile Summary

This PR fixes the inverse of the double-space bug addressed in #303: when neither the field nor the suggestion chunk carries a word boundary, acceptance was silently gluing adjacent words ("Hello"+"World""HelloWorld"). It also tightens the existing strip predicate so a chunk whose first character is a newline is no longer silently dropped when the field ends in a space or tab.

  • Rule 2 added to insertionChunk: if precedingText ends in a word character (isLetter || isNumber) and chunk starts in a word character, a single space is prepended before insertion. The session advances by the original untrimmed chunk, relying on the existing tolerate window to absorb the extra injected character.
  • Strip-predicate tightened: the drop(while:) closure now uses allSatisfy(CharacterSet.whitespaces.contains) instead of Character.isWhitespace, aligning it with the horizontal-only CharacterSet.whitespaces guard so newline-leading chunks survive Rule 1.
  • Eight new unit tests cover the add-space path and all documented exclusions (punctuation-leading chunks, post-newline/post-parenthesis preceding text, empty field, and the strip-but-not-newline case).

Confidence Score: 4/5

Safe to merge; the fix is well-scoped to a single pure function with no effect on the session model, coordinator, or generation-time normalization.

The two rules in insertionChunk are mutually exclusive, logically correct, and covered by 15 targeted unit tests. The strip-predicate change is a clean correctness fix. The main unverified path is the reconciler's tolerate-loop behavior after Rule 2 fires (a follow-up Tab-then-Tab accept sequence), which is documented in prose but has no end-to-end test. The acknowledged partial-completion tradeoff also lacks a pinning test. Neither gap can produce incorrect output under normal use, but both are worth closing before the pattern is extended.

The tolerate-path integration in SuggestionSessionReconciler.reconcile deserves a second look alongside the new Rule 2 path — specifically the sentinel-clearing block at lines 96–101, which is bypassed whenever reconcileConsumedSuggestionText returns early.

Important Files Changed

Filename Overview
Cotabby/Support/SuggestionSessionReconciler.swift Refactors insertionChunk to add Rule 2 (synthesize a boundary space when both the field and chunk end/start with word characters), and fixes the Rule 1 strip predicate to use CharacterSet.whitespaces (allSatisfy) instead of isWhitespace, preserving newlines correctly.
CotabbyTests/SuggestionSessionReconcilerTests.swift Adds 8 new unit tests covering Rule 2's add-space path and its exclusions (punctuation-leading chunks, post-newline preceding text, empty preceding text, and the strip-but-not-newline case); all 7 original insertionChunk tests pass unchanged.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["insertionChunk(chunk, precedingText)"] --> B{"precedingText last scalar\nin CharacterSet.whitespaces?"}
    B -- Yes --> C["drop(while: allSatisfy\nCharacterSet.whitespaces)"]
    C --> D["Return stripped chunk\n(newline-leading chunks preserved)"]
    B -- No --> E{"chunk.first\nisAcceptanceWordCharacter?"}
    E -- No --> F["Return chunk unchanged\n(punctuation-leading chunk hugs prior word)"]
    E -- Yes --> G{"precedingText.last\nisAcceptanceWordCharacter?"}
    G -- No --> F
    G -- Yes --> H["Return ' ' + chunk\n(Rule 2: synthesize boundary space)"]
Loading

Fix All in Codex Fix All in Claude Code

Reviews (1): Last reviewed commit: "Synthesize the boundary space when accep..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

…nto the user's last word

PR #303 moved the word-boundary decision to accept-time but only added the
"drop a leading space the field already supplies" half. The inverse case —
when the chunk arrives without a leading space (model omitted it, or the
normalizer stripped it against a stale prefix snapshot) and the field also
doesn't end in whitespace — fell through and accept rendered "Hello"+"World"
as "HelloWorld" (issue #324).

`SuggestionSessionReconciler.insertionChunk` now prepends a single space when
the live preceding text ends in a word character and the chunk starts in a
word character. The strip branch's `drop(while:)` predicate also moves to the
same horizontal-whitespace definition the guard already uses, so a chunk that
legitimately starts with a newline is no longer silently dropped when the
field happens to end in a space or tab.

Session accounting: rule 1 still advances by the untrimmed acceptedChunk;
rule 2 types one character the session doesn't account for, so post-insertion
reconciliation falls into the existing tolerate window. A user who then types
the next suggestion char will fall out of tolerate and invalidate the session
— acceptable, since the next prediction picks up from the corrected field
state without any visible glue.
@FuJacob FuJacob merged commit 8c10979 into main May 28, 2026
4 checks passed
Comment on lines +192 to +200
func test_insertionChunk_addsBoundarySpaceWhenChunkAndFieldWouldGlue() {
// The reported "no space" regression: the chunk has no leading whitespace (model omitted it,
// or the normalizer stripped it against a stale prefix snapshot) and the field doesn't have
// a trailing one either, so we synthesize the word boundary at insert time.
XCTAssertEqual(
SuggestionSessionReconciler.insertionChunk(forAcceptedChunk: "World", precedingText: "Hello"),
" World"
)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing end-to-end tolerate-path test after Rule 2 fires

The docstring explains that Rule 2 types one character the session doesn't account for, so the post-insertion reconciler enters a tolerate loop that persists until the session is explicitly invalidated. Because reconcileConsumedSuggestionText returns early (tolerate) before reaching the sentinel-clearing block at lines 96–101, pendingInsertionConsumedCount is never set to nil, and isAwaitingInsertedTextSync stays true for every subsequent reconcile cycle. There is no test that exercises this: accept a chunk via Rule 2, then calls reconcile to confirm the session remains .valid, then accepts a second chunk and reconciles again. Without it, the "session stays alive for follow-up word-by-word accepts" guarantee is purely documented, not verified.

Fix in Codex Fix in Claude Code

Comment on lines +192 to +200
func test_insertionChunk_addsBoundarySpaceWhenChunkAndFieldWouldGlue() {
// The reported "no space" regression: the chunk has no leading whitespace (model omitted it,
// or the normalizer stripped it against a stale prefix snapshot) and the field doesn't have
// a trailing one either, so we synthesize the word boundary at insert time.
XCTAssertEqual(
SuggestionSessionReconciler.insertionChunk(forAcceptedChunk: "World", precedingText: "Hello"),
" World"
)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No test for the partial-word-completion false-positive

The PR description explicitly calls out that "wor" + "ld" inserts as "wor ld" rather than "world" because Rule 2 fires on any word-char/word-char boundary. This tradeoff is documented in the docstring, but there is no test that confirms this exact scenario. Adding one (expected: " ld") would lock in the decision and serve as a reminder that this case is intentional rather than an oversight if someone revisits the heuristic later.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Codex Fix in Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] No Space added on first shift

1 participant