Stop the constrained decoder at sentence boundaries by FuJacob · Pull Request #517 · FuJacob/cotabby

FuJacob · 2026-06-02T01:04:12Z

Summary

Adds a sentence-boundary stop to the deterministic constrained decoder so a completion ends cleanly
at the end of one sentence instead of running into the next. After committing a token that carried an
ASCII sentence terminator, the decoder decodes the accumulated bytes and stops when they end a
sentence (a terminal ., !, or ?, allowing a trailing run of closing quotes/brackets). The
period case reuses the existing SentenceBoundaryClassifier disambiguation, so decimals ("1.2"),
list numbers ("1."), single-letter initials ("U.S."), and abbreviations ("e.g.", "Mr.") do not stop
generation early.

This affects only the constrained decode path (the cotabbyConstrainedDecoderEnabled developer
flag, default off); the shipping sampler path is unchanged.

Validation

xcodebuild ... test ... CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO \
  -only-testing:CotabbyTests/SentenceBoundaryClassifierTests \
  -only-testing:CotabbyTests/SuggestionSessionReconcilerTests
# ** TEST SUCCEEDED ** — incl. 8 new endsSentence cases (terminal vs abbreviation/decimal/initial,
#   closing-punctuation walk-back, trailing whitespace, empty string)

swiftlint --strict --quiet   # exit 0 (kept runConstrainedDecode under the cyclomatic-complexity
                             # budget by folding the pre-commit stop checks into one helper)

Linked issues

None. Brings the constrained decoder closer to the intended "predict a short continuation, stop at a
sentence boundary" behavior.

Risk / rollout notes

Constrained path only, which is default-off, so there is no change to shipped suggestion
behavior. It is a building block toward making that path good enough to enable by default (which
still needs on-device evaluation).
The boundary decode runs only when a token carried an ASCII ./!/?, so the steady decode path
stays on raw byte accumulation with a single decode at the end. Non-ASCII terminators (CJK 。！？)
are not yet handled and fall through to the existing token-budget stop.

Greptile Summary

Adds sentence-boundary stopping to the constrained decoder (cotabbyConstrainedDecoderEnabled dev flag, default off). After each committed token, if its bytes contain an ASCII terminator (./!/?), the accumulated completion is decoded and SentenceBoundaryClassifier.endsSentence is consulted; the loop breaks with stopReason = \"sentence_boundary\" when a true sentence end is detected. The PR also refactors the pre-commit EOS/newline checks into a single preCommitStopReason helper to stay under the cyclomatic-complexity budget.

SentenceBoundaryClassifier.endsSentence walks back past trailing whitespace and closing punctuation (quotes, brackets) before checking the terminal character, correctly delegating period disambiguation to the existing isTerminalPeriod logic so decimals, initials, and abbreviations do not trigger early stops.
The constrained path is guarded by a developer flag and is not active in shipped builds, so there is no impact on production suggestion behaviour.

Confidence Score: 4/5

Safe to merge — the new logic is gated behind a default-off developer flag and does not touch the shipped sampler path.

The SentenceBoundaryClassifier changes are thoroughly unit-tested and the walk-back logic is correct. The main behavioural gap is in LlamaRuntimeCore: when a tokenizer emits a sentence terminator and a closing quotation mark in consecutive separate tokens, the sentence-boundary check fires after the terminator token and the closing quote is never appended, producing an unclosed quote. This is specific to tokenizer splitting behaviour and affects only the dev-flag constrained path, but it is a real output-quality issue worth addressing before the path is enabled by default.

The token-split edge case is isolated to LlamaRuntimeCore.swift around the completesSentence call site at line 381.

Important Files Changed

Filename	Overview
Cotabby/Services/Runtime/LlamaRuntimeCore.swift	Adds completesSentence, preCommitStopReason, and isSentenceTerminatorByte helpers to stop the constrained decode loop at sentence boundaries; refactors pre-commit EOS/newline checks into a single helper. Logic is correct but has a token-boundary edge case when . and a trailing closing quote arrive in separate tokens.
Cotabby/Support/SentenceBoundaryClassifier.swift	Adds endsSentence and the isSentenceClosingPunctuation Character extension. Walk-back logic (whitespace → closing punctuation → terminator) is sound, and isTerminalPeriod delegation correctly handles decimals, initials, and abbreviations.
CotabbyTests/SentenceBoundaryClassifierTests.swift	Adds 8 well-chosen endsSentence unit tests covering terminal vs. non-terminal periods, !/?, trailing whitespace, closing-punctuation walk-back, and empty string. All expected behaviours are exercised.

Sequence Diagram

sequenceDiagram
    participant DecodeLoop as "Decode Loop"
    participant PreCommit as "preCommitStopReason"
    participant Accept as "engine.acceptToken"
    participant CS as "completesSentence"
    participant SBC as "SentenceBoundaryClassifier"

    DecodeLoop->>PreCommit: tokenID, options, profile
    PreCommit-->>DecodeLoop: nil or eos/single_line (break)
    DecodeLoop->>Accept: acceptToken(sequenceID, tokenID)
    Accept-->>DecodeLoop: ok or error (break)
    DecodeLoop->>CS: completesSentence(generatedBytes, lastTokenBytes)
    CS->>CS: guard lastTokenBytes contains terminator byte
    CS->>SBC: endsSentence(String decoding generatedBytes)
    SBC->>SBC: skip trailing whitespace
    SBC->>SBC: skip closing punctuation
    SBC->>SBC: inspect last visible character
    SBC-->>CS: true or false
    CS-->>DecodeLoop: true means stopReason sentence_boundary and break

_{Reviews (1): Last reviewed commit: "Stop the constrained decoder at sentence..." | Re-trigger Greptile}

Greptile also left 2 inline comments on this PR.

The deterministic constrained decoder ran to its token budget even after finishing a sentence, so it could spill into the next one. After committing a token carrying an ASCII '.'/'!'/'?', it now decodes the accumulated bytes and stops when they end a sentence, reusing SentenceBoundaryClassifier so decimals, list numbers, single-letter initials, and abbreviations don't stop early. Constrained path only (the default-off developer flag); the shipping sampler path is unchanged. Pre-commit stop checks were folded into one helper to keep the decode loop under the cyclomatic-complexity budget.

greptile-apps · 2026-06-02T01:14:23Z

+            if Self.completesSentence(generatedBytes, lastTokenBytes: tokenBytes) {
+                stopReason = "sentence_boundary"
+                break
+            }


Token-split may produce unclosed quotation marks

completesSentence evaluates generatedBytes immediately after the token that carries ., !, or ? is appended. If the tokenizer emits a closing quote as a separate subsequent token (e.g. token N = "stop.", token N+1 = '"'), the check fires after token N when generatedBytes ends with stop. — endsSentence returns true, the loop breaks, and the closing " is never appended. The walk-back logic in endsSentence only helps when the closing punctuation is already in the buffer. For "He said \"Good morning.\"" split across two tokens this would yield He said "Good morning. in the output. Consider deferring the sentence-end break by one additional token when the current check fires, to allow a potential closing-quote token to land first.

greptile-apps · 2026-06-02T01:14:23Z

+    private static func completesSentence(_ generatedBytes: [UInt8], lastTokenBytes: [UInt8]) -> Bool {
+        guard lastTokenBytes.contains(where: isSentenceTerminatorByte) else {
+            return false
+        }
+        // swiftlint:disable:next optional_data_string_conversion
+        let decoded = String(decoding: generatedBytes, as: UTF8.self)
+        return SentenceBoundaryClassifier.endsSentence(decoded)
+    }


Full-buffer decode on every terminator token

String(decoding: generatedBytes, as: UTF8.self) re-decodes the entire accumulated byte array every time a token carries a terminator byte. For the intended short completions under a small maxPredictionTokens budget this is negligible, but if the budget grows or a sentence contains multiple terminator bytes (e.g. U.S. followed by a real period), this is called multiple times and scales with total completion length. A suffix-only decode — e.g. keeping a rolling decodedSoFar: String updated incrementally, or only decoding the last ~50 bytes — would keep per-token cost constant.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

FuJacob merged commit 562bd98 into main Jun 2, 2026
4 checks passed

FuJacob deleted the feat/constrained-sentence-stop branch June 2, 2026 01:07

greptile-apps Bot reviewed Jun 2, 2026

View reviewed changes

FuJacob mentioned this pull request Jun 2, 2026

Document how to evaluate the experimental decoder on device #524

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Stop the constrained decoder at sentence boundaries#517

Stop the constrained decoder at sentence boundaries#517
FuJacob merged 1 commit into
mainfrom
feat/constrained-sentence-stop

FuJacob commented Jun 2, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

Uh oh!

greptile-apps Bot Jun 2, 2026

Uh oh!

greptile-apps Bot Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

FuJacob commented Jun 2, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Linked issues

Risk / rollout notes

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

greptile-apps Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

FuJacob commented Jun 2, 2026 •

edited by greptile-apps Bot

Loading