Reduce binary size: run-length encode feature support versions #692

Merged
Boshen merged 3 commits into main from feature-run-encoding on May 29, 2026
@Boshen (Member) commented May 29, 2026

A feature's per-browser y/a list is the set of versions that support it, and browser support is almost always "from version N onward" — so in per-browser version order, ~94% of these lists are a single contiguous run (avg 1.06 runs vs 7.47 in the previous global-index order).

This stores each list as ascending (start, length) runs of local indices into a per-browser version order, instead of one index per version — collapsing ~245k version indices into ~16k run endpoints.
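The run encoding described above can be sketched as follows. This is a minimal illustration, not the crate's actual codegen: the function names `encode_runs` and `decode_runs` are hypothetical, and the input is assumed to be strictly ascending local indices into one browser's version order.

```rust
/// Collapse strictly ascending local indices into (start, length) runs.
/// Hypothetical helper for illustration; not taken from the PR.
fn encode_runs(indices: &[u16]) -> Vec<(u16, u16)> {
    let mut runs: Vec<(u16, u16)> = Vec::new();
    for &i in indices {
        if let Some((start, len)) = runs.last_mut() {
            // Extend the current run when `i` is the next consecutive index.
            if *start + *len == i {
                *len += 1;
                continue;
            }
        }
        // Otherwise open a new run of length 1.
        runs.push((i, 1));
    }
    runs
}

/// Expand (start, length) runs back into the individual indices.
fn decode_runs(runs: &[(u16, u16)]) -> Vec<u16> {
    runs.iter()
        .flat_map(|&(start, len)| start..start + len)
        .collect()
}
```

A "from version N onward" list such as `[3, 4, 5, 6]` becomes the single run `(3, 4)`, which is why the number of stored values drops so sharply once the indices are local to a browser.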

Size

  • caniuse_feature_matching blob 20,578 → 10,611 B; adds a 435 B per-browser version-order table (net feature data −9,532 B).
  • musl inspect example 770,944 → 764,160 B (−6,784).
  • macOS inspect example 635,392 → 618,880 B (−16,512) — the cumulative data reduction now crosses the 16 KB Mach-O page boundary, so the file size moves (README updated 620.5K → 604.4K).

Keeping the decoder small

A naive decode cancelled most of the win (the data shrank ~9.5 KB but .text grew ~7.8 KB), so:

  • The blob is hand-decoded with a small fixed varint reader instead of postcard's generic deserializer for the nested run type (which monomorphized into several KB). This also lets the now-unused generic decode helper be removed.
  • The short per-list index arrays are sorted with a hand-written insertion sort rather than sort_unstable, avoiding the large pattern-defeating-quicksort instantiation (which also inflated .eh_frame).
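A rough sketch of the two techniques named in the bullets above, assuming the blob's integers are unsigned LEB128 varints (which is how postcard encodes `u16` and wider integers). The names `read_varint_u32` and `insertion_sort` are hypothetical and the real decoder may be shaped differently.

```rust
/// Read one unsigned LEB128 varint from `bytes` at `*pos`, advancing `*pos`.
/// Returns `None` if the input is truncated or the value needs more than
/// five bytes. Hypothetical helper for illustration.
fn read_varint_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut value: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        if shift >= 32 {
            return None; // too many continuation bytes for a u32
        }
        // Low 7 bits carry payload, least-significant group first.
        value |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value); // high bit clear: last byte of this varint
        }
        shift += 7;
    }
}

/// Plain insertion sort. Quadratic in the worst case, but tiny in code size
/// and adequate for short slices.
fn insertion_sort(items: &mut [&str]) {
    for i in 1..items.len() {
        let mut j = i;
        while j > 0 && items[j - 1] > items[j] {
            items.swap(j - 1, j);
            j -= 1;
        }
    }
}
```

Both functions are non-generic on purpose in this sketch: a single concrete instantiation keeps the emitted code small, which is the trade-off the bullets describe.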

FeatureSet/supports() are unchanged — create_data() still yields lexicographically-sorted &'static str lists for binary search. Losslessness only needs the per-browser order to be consistent between the table and the run indices (the resolved version strings are re-sorted), so the exact ordering is irrelevant to correctness.
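To make the losslessness argument concrete, here is a small sketch under stated assumptions: `resolve` is a hypothetical helper, `order` stands in for one browser's version-order table, and the standard library's `sort` is used for brevity where the PR uses a hand-written insertion sort.

```rust
/// Resolve (start, length) runs through a per-browser version-order table,
/// then sort the resulting strings lexicographically for binary search.
/// Hypothetical helper; panics if a run points outside `order`.
fn resolve(order: &[&'static str], runs: &[(u16, u16)]) -> Vec<&'static str> {
    let mut out: Vec<&'static str> = Vec::new();
    for &(start, len) in runs {
        let start = usize::from(start);
        out.extend_from_slice(&order[start..start + usize::from(len)]);
    }
    // Re-sorting here is what makes the table's exact order irrelevant to
    // correctness: only consistency between table and run indices matters.
    out.sort();
    out
}

/// Lookup over the resolved, lexicographically sorted list.
fn supports(versions: &[&'static str], version: &str) -> bool {
    versions.binary_search(&version).is_ok()
}
```

With `order = ["8", "9", "10", "11"]` and the run `(1, 3)`, and with the reversed table `["11", "10", "9", "8"]` and the run `(0, 3)`, `resolve` returns the same sorted list, so both encodings answer `supports` identically. The table order affects only how many runs are needed, not the result.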

All tests and the JS-equivalence proptests (proptest_supports) pass.

🤖 Generated with Claude Code

Boshen added 2 commits May 29, 2026 21:36
A feature's per-browser `y`/`a` list is the set of versions that support it,
and browser support is almost always "from version N onward", so in
per-browser version order ~94% of the lists are a single contiguous run. Store
the lists as `(start, length)` runs of local indices into a per-browser version
order, instead of one index per version, collapsing ~245k indices into ~16k run
endpoints.

- caniuse_feature_matching blob 20,578 -> 10,611 B; adds a 435 B per-browser
  version-order table; net feature data -9,532 B.
- Hand-decode the blob with a small fixed varint reader instead of postcard's
  generic deserializer for the nested run type, and insertion-sort the short
  lists rather than `sort_unstable`, to avoid several KB of monomorphized
  `.text`/`.eh_frame` that would otherwise cancel the win. Drops the now-unused
  generic `decode` helper.
- musl `inspect` example -6,784 B (rodata -9,536, text +752).

All tests and the JS-equivalence proptests pass.
macOS `inspect` example 635,392 -> 618,880 bytes; the cumulative data reduction now crosses the 16 KB Mach-O page boundary so the file size finally moves.
@codspeed-hq

codspeed-hq Bot commented May 29, 2026


Merging this PR will not alter performance

✅ 6 untouched benchmarks


Comparing feature-run-encoding (32b12db) with main (f434ade)

@codecov

codecov Bot commented May 29, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.06%. Comparing base (f434ade) to head (32b12db).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #692      +/-   ##
==========================================
+ Coverage   99.04%   99.06%   +0.01%     
==========================================
  Files          47       47              
  Lines        2415     2456      +41     
==========================================
+ Hits         2392     2433      +41     
  Misses         23       23              

…tation

Cleanup from /simplify review (no behavior or output change — regenerated blobs are byte-identical):
- Drop the `v.clone()` tie-break from the per-browser version sort: the input is already lexicographically ordered and the sort is stable, so ties stay deterministic without it.
- Replace the verbose `Vec<(u8, Vec<(u16, u16)>, Vec<(u16, u16)>)>` annotation with `Vec<_>` (inferred), clearing a clippy type-complexity warning.
- Document that the per-browser numeric order is load-bearing for run contiguity, and that the postcard blob layout is hand-decoded by the runtime and must stay LEB128-compatible.
@Boshen Boshen merged commit 14a746a into main May 29, 2026
16 checks passed
@Boshen Boshen deleted the feature-run-encoding branch May 29, 2026 14:10
@oxc-guard oxc-guard Bot mentioned this pull request May 29, 2026
Boshen pushed a commit that referenced this pull request May 29, 2026
## 🤖 New release

* `oxc-browserslist`: 3.0.3 -> 3.0.4 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [3.0.4](oxc-browserslist-v3.0.3...oxc-browserslist-v3.0.4) - 2026-05-29

### Other

- DRY up feature/region codegen with shared table + lookup helpers
([#694](#694))
- Consolidate bundled-data loading behind compression helpers
([#693](#693))
- Reduce binary size: run-length encode feature support versions
([#692](#692))
- Reduce binary size: Zopfli codegen compression + percentage byte-plane
([#691](#691))
- Reduce binary size: byte-plane (stream-split) compression for region
pair indices
([#690](#690))
- Update README binary size to 621K
([#689](#689))
- Reduce binary size of bundled caniuse/electron data
([#688](#688))
- Switch codegen data source from caniuse-db to caniuse-lite
([#687](#687))
- Update browserslist
([#685](#685))
- Update rust crates
([#682](#682))
- Update browserslist
([#679](#679))
- Update browserslist
([#678](#678))
- Update browserslist
([#677](#677))
- Update browserslist
([#673](#673))
- Update browserslist
([#671](#671))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: oxc-guard[bot] <276638029+oxc-guard[bot]@users.noreply.github.com>