Fast paths for ASCII #42

Merged
clipperhouse merged 5 commits into master from ascii-optimization
Jan 31, 2026

@clipperhouse clipperhouse commented Jan 20, 2026

graphemes

Printable ASCII bytes are their own graphemes, so there is no need to call the real splitFunc for those. We do have to check the next byte to ensure it's not the start of a combining mark or something else that the real grapheme logic would legitimately join. To do this, graphemes gets its own iterator.

Looks like a 20% perf improvement for multilingual text and 3x improvement for pure ASCII.
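
A minimal sketch of the check described above, not the code merged in this PR. The names `isPrintableASCII` and `asciiGrapheme` are hypothetical. It assumes that "checking the next byte" means requiring the next byte to be ASCII (or the input to end), since any combining mark is encoded with a non-ASCII lead byte.

```go
package main

import "fmt"

// isPrintableASCII reports whether b is in the printable range 0x20-0x7E.
// Hypothetical helper for illustration.
func isPrintableASCII(b byte) bool {
	return b >= 0x20 && b <= 0x7E
}

// asciiGrapheme is a hypothetical helper, not the library's API. It reports
// whether the first byte of data can be emitted as a one-byte grapheme
// without calling the full Unicode segmentation logic.
func asciiGrapheme(data []byte) bool {
	if len(data) == 0 || !isPrintableASCII(data[0]) {
		return false
	}
	// At end of input nothing can join. Otherwise the next byte must be
	// ASCII: a non-ASCII byte may begin a combining mark that extends
	// the cluster, so that case is left to the real grapheme logic.
	return len(data) == 1 || data[1] < 0x80
}

func main() {
	fmt.Println(asciiGrapheme([]byte("ab")))      // true
	fmt.Println(asciiGrapheme([]byte("e\u0301"))) // false: 'e' is followed by a combining acute accent
	fmt.Println(asciiGrapheme([]byte("\r\n")))    // false: CR is not printable ASCII
}
```

When the function returns false, a caller would fall back to the regular split function, so the fast path only ever skips work and never changes results.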

words

Apply a similar approach to words: runs of adjacent ASCII alphanumerics followed by an ASCII space. Looks like 2x for pure ASCII, around 5% for multilingual.
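
A rough sketch of what such a word fast path could look like, under the assumption that it only fires when the run is terminated by an ASCII space and otherwise defers to the full word-break logic. `isASCIIAlnum` and `asciiWord` are hypothetical names, not taken from the PR.

```go
package main

import "fmt"

// isASCIIAlnum reports whether b is an ASCII letter or digit.
// Hypothetical helper for illustration.
func isASCIIAlnum(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || (b >= '0' && b <= '9')
}

// asciiWord is a hypothetical helper, not the library's API. It returns the
// length of a leading run of ASCII alphanumerics if that run is immediately
// followed by an ASCII space, and 0 if the fast path does not apply.
func asciiWord(data []byte) int {
	i := 0
	for i < len(data) && isASCIIAlnum(data[i]) {
		i++
	}
	// Require a non-empty run terminated by a space. Anything else
	// (punctuation, non-ASCII bytes, end of input) is left to the full
	// word-break rules in this sketch.
	if i > 0 && i < len(data) && data[i] == ' ' {
		return i
	}
	return 0
}

func main() {
	fmt.Println(asciiWord([]byte("hello world")))  // 5
	fmt.Println(asciiWord([]byte("h\u00e9llo ")))  // 0: non-ASCII byte interrupts the run
	fmt.Println(asciiWord([]byte("can't stop")))   // 0: apostrophe needs the real rules
}
```

Falling back on end of input is a deliberately conservative choice here; whether the merged code does the same is not stated in this thread.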

phrases

ASCII optimization for runs of alphanumeric or space.

sentences

ASCII optimization for runs of alphanumeric or space.
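
For phrases and sentences, the common idea can be sketched as measuring a leading run of ASCII alphanumerics and spaces, which contains no punctuation and therefore no sentence terminator. This is an illustration only: `asciiRunLen` is a hypothetical name, and how the merged code actually uses such a run is not shown in this thread. The back-off before a non-ASCII byte is an added assumption, mirroring the combining-mark check used for graphemes.

```go
package main

import "fmt"

// isASCIIAlnumOrSpace reports whether b is an ASCII letter, digit, or space.
// Hypothetical helper for illustration.
func isASCIIAlnumOrSpace(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') ||
		(b >= '0' && b <= '9') || b == ' '
}

// asciiRunLen is a hypothetical helper, not the library's API. It returns
// how many leading bytes of data are ASCII alphanumerics or spaces.
func asciiRunLen(data []byte) int {
	i := 0
	for i < len(data) && isASCIIAlnumOrSpace(data[i]) {
		i++
	}
	// Assumption: if a non-ASCII byte follows, it might be a combining
	// mark attached to the last ASCII byte, so give that byte back.
	if i > 0 && i < len(data) && data[i] >= 0x80 {
		i--
	}
	return i
}

func main() {
	fmt.Println(asciiRunLen([]byte("Hello world. Next"))) // 11
	fmt.Println(asciiRunLen([]byte("abc\u0301")))         // 2
}
```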

Also ASCII optimizations for First methods in all packages, with tests.

Copilot AI review requested due to automatic review settings January 20, 2026 01:13

Copilot AI left a comment

Pull request overview

This PR adds an ASCII fast path optimization to grapheme cluster iteration by inlining the iterator implementation directly into the graphemes package. Printable ASCII bytes (0x20-0x7E) are treated as their own graphemes when not followed by non-ASCII bytes, avoiding the overhead of full Unicode grapheme cluster parsing for common cases.

Changes:

  • Inlined iterator implementation with ASCII hot path optimization in graphemes/iterator.go
  • Added ASCII-specific benchmark to measure optimization effectiveness
  • Bumped minimum Go version from 1.18 to 1.20

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| graphemes/iterator.go | Replaced dependency on internal/iterators with inline implementation featuring ASCII fast path for printable ASCII characters |
| graphemes/comparative/comparative_test.go | Added BenchmarkGraphemesASCII to measure performance on pure ASCII text, improved existing benchmark structure |
| go.mod | Bumped minimum Go version requirement from 1.18 to 1.20 |

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.


Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.


Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.


@clipperhouse clipperhouse merged commit 7f9e9f4 into master Jan 31, 2026
20 checks passed
@clipperhouse clipperhouse deleted the ascii-optimization branch January 31, 2026 16:04
@clipperhouse clipperhouse changed the title from "Fast path for ASCII" to "Fast paths for ASCII" Jan 31, 2026