Tighten token-cap completion-length budgets (#271) #283
Merged
Completion length is governed solely by suggestedPredictionTokenBudget on the local model (the in-prompt word-range cue stays removed). The 50% bump to 17/27/45 let completions overrun the user's word-count setting — ~12 words on the 3-7 preset, often spilling multi-line (#271). Return the caps to 11/18/30 (~1.5x the upper word bound) so the token limit lands close to the selected range.
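The cap arithmetic described above can be checked with a short sketch. This is an illustration only, not the project's code: the function and constant names are hypothetical, and the upper word bounds 12 and 20 are assumptions inferred from the 18 and 30 caps (only the 3-7 preset is named in this PR). It assumes halves round up, which matches the figures quoted.

```swift
// Hypothetical sketch; none of these names come from the PR itself.
// Upper word bounds per preset: 7 is the "3-7" preset named in the PR.
// 12 and 20 are ASSUMED values, chosen because 1.5x of them gives 18 and 30.
let upperWordBounds = [7, 12, 20]

/// Scales a value by 1.5, rounding halves up, using integer arithmetic.
func scaledByOneAndAHalf(_ value: Int) -> Int {
    return (value * 3 + 1) / 2
}

// Restored caps: roughly 1.5x the upper word bound of each preset.
let restoredCaps = upperWordBounds.map(scaledByOneAndAHalf)

// Caps after the earlier 50% bump, i.e. 1.5x applied a second time.
let bumpedCaps = restoredCaps.map(scaledByOneAndAHalf)

print(restoredCaps) // [11, 18, 30]
print(bumpedCaps)   // [17, 27, 45]
```

Under these assumptions the bumped caps sit at about 2.25x the upper word bound, which is consistent with the overrun reported in #271.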
Summary
Completion length runs past the user's word-count setting because, with the token-cap-only experiment (#251), suggestedPredictionTokenBudget is the sole governor of length on the local model, and that experiment also bumped it 50% (11/18/30 → 17/27/45). The looser cap lets the model emit ~12 words on a 3-7 setting and spill onto multiple lines (#271). This keeps the token-cap-only design (the in-prompt word-range cue stays removed on both engines) and just returns the caps to 11/18/30 (~1.5× the upper word bound) so the limit lands close to the selected range.
Validation
App-hosted test execution fails locally only on the known Team ID code-signing mismatch (CotabbyTests bundle won't load); the updated budget assertions compile under build-for-testing.
Linked issues
Fixes #271
Risk / rollout notes
Only suggestedPredictionTokenBudget changes; the token-cap-only behavior from #251 (Stabilize ghost-text size, token-budget completion length, tabby model lineup, full-accept reset) is unchanged (no completion-length cue is reintroduced into either prompt).