Unify token counting between `check` and `tokens count` by chlowell · Pull Request #146 · microsoft/waza

chlowell · 2026-03-19T15:29:03Z

waza check (Token Budgets section) and waza tokens count reported different token counts for the same file because check uses the Skill.Tokens value set by skill.UnmarshalText() as a simple multiple of the input's character count, whereas waza tokens count uses the BPE tokenizer. With this PR, skill.UnmarshalText() uses the BPE tokenizer via tokens.DefaultCounter(), making Skill.Tokens more accurate for all consumers and consistent with tokens count output.

While I was at it, I also stopped normalizing line endings--replacing \r\n with \n--before tokenizing because the BPE algorithm encodes these sequences as a different number of tokens. I added a test helper to prevent this change breaking tests on certain platforms.

Copilot

Pull request overview

This PR makes token counting consistent across waza check and waza tokens count by switching Skill.UnmarshalText() to use the shared BPE tokenizer counter and avoiding line-ending normalization that changes BPE tokenization results.

Changes:

Introduce tokens.DefaultCounter() (init-once shared BPE counter) and tokens.CountLines() utility.
Update skill parsing and checks/CLI token counting to use BPE counting on raw content (no CRLF→LF normalization).
Add/adjust tests for line counting, shared counter behavior, and LRU cache concurrency.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
internal/tokens/tokens.go	Adds `CountLines` and shared `DefaultCounter()` initialization.
internal/tokens/tokens_test.go	Adds test coverage for `CountLines`.
internal/skill/skill.go	Uses shared BPE counter during `UnmarshalText()` token calculation.
internal/checks/token_limits.go	Stops CRLF normalization before token counting.
internal/checks/token_budget_test.go	Adds test asserting token budget uses shared BPE counting on raw content (LF/CRLF).
cmd/waza/tokens/helpers.go	Removes local `countLines` helper (replaced by `tokens.CountLines`).
cmd/waza/tokens/count.go	Stops CRLF normalization and uses `tokens.CountLines`.
cmd/waza/tokens/compare.go	Uses `tokens.CountLines` for line counting.
internal/tokens/bpe/lru.go	Makes LRU cache operations mutex-protected for concurrent access.
internal/tokens/bpe/lru_test.go	Adds a concurrent-access test for the LRU cache.
cmd/waza/dev/helpers_test.go	Adds output-matching helper that masks token counts while still verifying initial token count.
cmd/waza/dev/loop_test.go	Updates dev loop tests to use the new output-matching helper and initial-skill token counts.
cmd/waza/cmd_check_test.go	Adds integration test ensuring `check` token budget matches `tokens count` for the same file.

Copilot

Pull request overview

Aligns token counting across waza check and waza tokens count by switching skill token computation to the shared BPE tokenizer and preserving raw line endings during tokenization to match BPE behavior.

Changes:

Use a shared default BPE counter (tokens.DefaultCounter) for consistent token counts.
Stop CRLF→LF normalization before tokenizing; centralize line counting via tokens.CountLines.
Add/update tests and introduce testutil.StripTokenCounts to make CLI output assertions robust across platforms/tokenization changes.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
internal/tokens/tokens.go	Adds `CountLines` and introduces a shared default counter used by `NewCounter`.
internal/tokens/tokens_test.go	Adds `CountLines` unit tests.
internal/tokens/bpe/lru.go	Makes LRU cache concurrency-safe via a mutex.
internal/tokens/bpe/lru_test.go	Adds a concurrent access test for the LRU cache.
internal/testutil/tokenstrip.go	Adds helper to mask token-count numbers in CLI output tests.
internal/skill/skill.go	Switches `Skill.Tokens` to BPE-based counting and uses `tokens.CountLines`.
internal/checks/token_limits.go	Stops normalizing CRLF to LF before counting tokens.
internal/checks/token_budget_test.go	Adds coverage ensuring token budgeting uses shared default tokenizer and raw content.
cmd/waza/tokens/helpers.go	Removes local `countLines` helper (replaced by `tokens.CountLines`).
cmd/waza/tokens/count.go	Stops normalizing CRLF to LF and uses `tokens.CountLines`.
cmd/waza/tokens/compare.go	Uses `tokens.CountLines` for line counts.
cmd/waza/tokens/count_test.go	Masks token counts in output comparisons.
cmd/waza/tokens/check_test.go	Masks token counts in output comparisons.
cmd/waza/dev/helpers_test.go	Adds `requireOutputMatch` to mask token counts but still validate the displayed token count.
cmd/waza/dev/loop_test.go	Updates assertions to use `requireOutputMatch` and validates token banner value against parsed skill tokens.
cmd/waza/cmd_check_test.go	Adds an integration-ish test ensuring `check` and `tokens count` agree for the same file.

codecov-commenter · 2026-03-19T18:00:23Z

Codecov Report

❌ Patch coverage is 80.64516% with 6 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@b07221f). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
internal/testutil/tokenstrip.go	0.00%	4 Missing ⚠️
internal/skill/skill.go	60.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #146   +/-   ##
=======================================
  Coverage        ?   73.52%           
=======================================
  Files           ?      141           
  Lines           ?    15858           
  Branches        ?        0           
=======================================
  Hits            ?    11660           
  Misses          ?     3351           
  Partials        ?      847

Flag	Coverage Δ
go-implementation	`73.52% <80.64%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copilot

Pull request overview

Aligns token counting across waza check and waza tokens count by switching Skill.Tokens to use the BPE tokenizer (and preserving raw line endings) so both commands report consistent counts.

Changes:

Introduces a shared default BPE token counter and reuses it across call sites.
Stops normalizing CRLF to LF before tokenization; updates line counting to a shared helper.
Updates/strengthens tests to be resilient to platform line endings and token-count variability in formatted output.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
internal/tokens/tokens.go	Adds `CountLines` and a shared `DefaultCounter`; routes `NewCounter` through the shared default for BPE.
internal/tokens/tokens_test.go	Adds unit tests for `CountLines` including CRLF inputs.
internal/tokens/bpe/lru.go	Makes LRU cache protected by a mutex for concurrent access.
internal/tokens/bpe/lru_test.go	Adds a concurrency test for the LRU implementation.
internal/testutil/tokenstrip.go	Adds helper to mask token-related numbers in CLI output assertions.
internal/skill/skill.go	Switches `Skill.Tokens` to BPE counting and uses shared `CountLines`.
internal/checks/token_limits.go	Stops normalizing line endings prior to token counting.
internal/checks/token_budget_test.go	Adds regression test to ensure token budget uses shared tokenizer across LF/CRLF raw content.
cmd/waza/tokens/helpers.go	Removes local `countLines` helper (now using `tokens.CountLines`).
cmd/waza/tokens/count.go	Stops CRLF normalization; uses `tokens.CountLines`.
cmd/waza/tokens/compare.go	Uses `tokens.CountLines` for line deltas.
cmd/waza/tokens/count_test.go	Computes expected token counts via `DefaultCounter` and masks token counts for table output.
cmd/waza/tokens/check_test.go	Masks token counts in output comparisons for stability.
cmd/waza/dev/loop_test.go	Masks token counts and validates displayed token count vs initial skill’s BPE tokens.
cmd/waza/dev/helpers_test.go	Adds `requireOutputMatch` helper using token masking + first token-count validation.
cmd/waza/cmd_check_test.go	Adds end-to-end test that `check` and `tokens count` agree for the same skill file.

richardpark-msft

One request - let's file an issue to move away from having to screen-scrape our own output, but it's fine for now and at last we're consolidating, so it's still an improvement.

Copilot AI review requested due to automatic review settings March 19, 2026 15:29

Copilot AI reviewed Mar 19, 2026

View reviewed changes

Comment thread internal/tokens/bpe/lru_test.go

Comment thread internal/tokens/bpe/lru.go

Comment thread internal/skill/skill.go Outdated

Comment thread internal/skill/skill.go Outdated

Comment thread cmd/waza/dev/helpers_test.go Outdated

chlowell force-pushed the token-counts branch 2 times, most recently from ff20c2c to 7898cb1 Compare March 19, 2026 17:28

Copilot AI review requested due to automatic review settings March 19, 2026 17:28

Copilot AI reviewed Mar 19, 2026

View reviewed changes

Comment thread internal/tokens/tokens.go

Comment thread internal/tokens/bpe/lru.go

Comment thread internal/tokens/bpe/lru.go

Comment thread internal/tokens/bpe/lru_test.go

Comment thread cmd/waza/dev/helpers_test.go

chlowell force-pushed the token-counts branch from 7898cb1 to 5547543 Compare March 19, 2026 18:02

Copilot started reviewing on behalf of chlowell March 19, 2026 18:16 View session

Unify token counting between check and tokens count

2d5e5b1

chlowell force-pushed the token-counts branch from 5547543 to 2d5e5b1 Compare March 19, 2026 18:29

Copilot AI review requested due to automatic review settings March 19, 2026 18:29

Copilot AI reviewed Mar 19, 2026

View reviewed changes

Comment thread internal/tokens/bpe/lru_test.go

Comment thread internal/tokens/bpe/lru.go

Comment thread internal/tokens/bpe/lru.go

Comment thread internal/tokens/tokens.go

Comment thread internal/skill/skill.go

chlowell marked this pull request as ready for review March 19, 2026 18:59

chlowell requested review from richardpark-msft, spboyer and wbreza as code owners March 19, 2026 18:59

chlowell enabled auto-merge (squash) March 19, 2026 19:08

Copilot started reviewing on behalf of chlowell March 19, 2026 19:13 View session

richardpark-msft previously approved these changes Mar 19, 2026

View reviewed changes

Comment thread internal/testutil/tokenstrip.go

Comment thread cmd/waza/cmd_check_test.go Outdated

Comment thread internal/tokens/tokens.go

export count output types

bf54000

chlowell dismissed richardpark-msft’s stale review via bf54000 March 19, 2026 20:50

chlowell requested a review from richardpark-msft March 19, 2026 20:50

richardpark-msft approved these changes Mar 19, 2026

View reviewed changes

chlowell merged commit 52e123f into microsoft:main Mar 19, 2026
11 checks passed

chlowell deleted the token-counts branch March 19, 2026 21:02

spboyer mentioned this pull request Mar 26, 2026

Release v0.24.0 #161

Merged

4 tasks

Uh oh!

Conversation

chlowell commented Mar 19, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

richardpark-msft left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Mar 19, 2026 •

edited

Loading