[S1/S4] Training metrics (INFO compression) + integration tests by GilboaAWS · Pull Request #40 · ikolomi/valkey

GilboaAWS · 2026-06-18T13:26:08Z

Summary

Training-subsystem observability + integration tests for the S1 dictionary-lifecycle track. Rebased onto unstable (now includes #42 double-count fix and #39 observability counters).

1. Training metrics (INFO compression)

Additive to the last_duration_ms / last_sample_count fields wired by #39:

Field	Type	Purpose
`compression_training_state`	string	idle / scanning / submitted / cooldown
`compression_training_scans_started`	counter	Scans initiated
`compression_training_successes`	counter	Successful dict promotions
`compression_training_failures`	counter	Aborts (insufficient samples) + ZSTD errors
`compression_training_cooldown_until_ms`	gauge	Cooldown expiry timestamp (0 = not cooling)
`compression_training_current_db`	gauge	DB the scan is currently on
`compression_training_dicts_retired`	counter	Dicts that completed the full lifecycle (freed by GC)

A periodic compressionRegistryTryGc() in compressionCron (already added by #39) keeps dicts_retired / known_dicts honest after a drain.

2. Integration tests

tests/integration/compression-training.tcl — real training machinery, no pre-imported dicts. Own start_server blocks, external:skip, skipped under BUILD_ZSTD=no.

First-training trigger fires and produces a usable dict
After training, new writes get compressed (round-trip)
Training does NOT fire below min keys
Training aborts + cooldown on ineligible values
Training scans across multiple databases
Dict retirement lifecycle — compress 20 keys → flip to decompression → read all → dict GC'd

The retirement test is FLUSHALL-free: 200 training keys are SET before the dict exists (stay RAW, hold no refs); only the 20 keys SET after the dict is active get compressed. It asserts compression_compressed_objects == 20 (the regression guard for #42's double-count fix), then == 0 after reads, then dicts_retired increments.

Note on the frame-ref signal: uses the existing compression_compressed_objects counter rather than a per-dict gauge. Each compressed robj holds exactly one dict frame-ref (createCompressedObject +1 / freeCompressedObject+releaseCompressedBuffer −1), so the counter is the global frame-ref total — well-defined across multiple retiring dicts. (An earlier revision had a debug_frame_refs field that only read dicts[0] — removed as misleading.)

Testing

./runtest --single integration/compression-training — 6/6 pass
make test-unit BUILD_ZSTD=yes — 392/392 pass
Clean build with BUILD_ZSTD={yes,no}, no duplicate INFO fields.

compressionInstall took a second dict frame-ref via job->dict_id on top of the IncRef that createCompressedObject already takes (reading dict_id from the frame header's alg_meta). The matching DecRef in freeCompressedObject / releaseCompressedBuffer only fires once, so every compressed value leaked one ref: +2 on install, -1 on free/decompress. The leaked ref kept the dict's frame_refs > 0 forever, so a retiring dict could never satisfy canFree() and was never reclaimed by GC. Root cause: an early design checklist listed an explicit install-time compressionRegistryIncRef as a step. That predated PR #8 moving the IncRef into createCompressedObject. When the real install path landed (S2.7) it followed the stale checklist and added the now-redundant second ref. Fix: remove the redundant IncRef. The object-lifecycle model is the single source of truth — createCompressedObject (sole producer of compressed robjs) takes the ref, freeCompressedObject / releaseCompressedBuffer release it, all keyed on the header alg_meta. Verified by the dict-retirement integration test (PR #40, stacked on this): 20 compressed keys now produce exactly 20 frame_refs (was 40), and the dict is reclaimed after all values decompress.

Adds training-subsystem observability to INFO compression and integration tests exercising the full training + retirement lifecycle. Metrics (additive to the last_duration_ms / last_sample_count fields wired by #39): compression_training_state idle/scanning/submitted/cooldown compression_training_scans_started total scans initiated compression_training_successes successful dict promotions compression_training_failures aborts + ZSTD errors compression_training_cooldown_until_ms compression_training_current_db compression_training_dicts_retired dicts that completed lifecycle (GC'd) compression_training_debug_frame_refs active retiring dict frame_refs Integration tests (tests/integration/compression-training.tcl): - First-training trigger fires and produces a usable dict - After training, new writes get compressed - Training does NOT fire below min keys - Training aborts + cooldown on ineligible values - Training scans across multiple databases - Dict retirement: compress 20 keys -> decompression mode -> read all -> dict GC'd (asserts frame_refs == 20, the regression guard for the double-count fix in #42) The retirement test is FLUSHALL-free: 200 training keys are SET before the dict exists (stay RAW, no refs); only 20 keys SET after the dict is active get compressed. Depends on #42 (install-path double-count fix), now merged to unstable.

The first-training test had a confusing assertion: assert {$dict_after != $dict_before || $dict_before > 0} The '|| dict_before > 0' clause made it vacuously true whenever a dict already existed — the opposite of the intent. It was leftover from shared-server (external) handling, but the test is external:skip and always runs on a fresh instance where dict_before == 0. Replaced with a direct fresh-instance check (active_dict_id == 0 before, > 0 after). Also dropped an unused dict_before capture in the 'does NOT fire' test (only known_dicts is asserted there).

The old order trained a dict in test 1, leaving an active dict for the rest of the shared-instance block. That made the negative-trigger tests (below-min, ineligible-abort) pass vacuously: with a dict already active, first-training can never fire regardless of key count or eligibility (the trigger is gated on !active). Reordered so all no-active-dict checks run first, then training fires once: 1. Below min keys -> no trigger (asserts scans_started == 0) 2. Enough keys but none eligible -> scan fires + aborts (scans_started == 1, failures == 1) — metric-driven, not timing 3. First-training across db0+db1 -> dict promoted (merges the old multi-DB test; only one first-training can fire per active dict) 4. After training -> new writes compressed 5. Active dict blocks a new first-training (new test for the 'no retrain while a dict is active' invariant) Test 2 arms the 30s cooldown; test 3 uses a 60s wait_for_trained_dict to absorb it. Negative tests now assert on scans_started / failures / successes counters instead of sleeping and hoping.

ikolomi

LGTM

github-actions Bot assigned GilboaAWS Jun 18, 2026

GilboaAWS force-pushed the gilboa/s1-training-metrics-and-integ-tests branch 4 times, most recently from a9394f7 to dbf9385 Compare June 21, 2026 12:24

GilboaAWS mentioned this pull request Jun 21, 2026

[S2] Fix install-path frame-ref double-count #42

Merged

GilboaAWS force-pushed the gilboa/s1-training-metrics-and-integ-tests branch from dbf9385 to 2241fa8 Compare June 21, 2026 12:30

GilboaAWS changed the base branch from unstable to gilboa/fix-install-double-count June 21, 2026 12:30

GilboaAWS changed the base branch from gilboa/fix-install-double-count to unstable June 21, 2026 12:52

GilboaAWS force-pushed the gilboa/s1-training-metrics-and-integ-tests branch from 2241fa8 to f2d8cde Compare June 21, 2026 12:52

GilboaAWS force-pushed the gilboa/s1-training-metrics-and-integ-tests branch from f2d8cde to b4d79f2 Compare June 21, 2026 13:37

GilboaAWS added 2 commits June 22, 2026 11:17

GilboaAWS marked this pull request as ready for review June 22, 2026 13:05

GilboaAWS requested a review from ikolomi June 23, 2026 08:45

ikolomi approved these changes Jun 23, 2026

View reviewed changes

ikolomi merged commit 21d7384 into unstable Jun 23, 2026
80 checks passed

GilboaAWS deleted the gilboa/s1-training-metrics-and-integ-tests branch June 23, 2026 11:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[S1/S4] Training metrics (INFO compression) + integration tests#40

[S1/S4] Training metrics (INFO compression) + integration tests#40
ikolomi merged 3 commits into
unstablefrom
gilboa/s1-training-metrics-and-integ-tests

GilboaAWS commented Jun 18, 2026 •

edited

Loading

Uh oh!

ikolomi left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

GilboaAWS commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

1. Training metrics (INFO compression)

2. Integration tests

Testing

Uh oh!

ikolomi left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

GilboaAWS commented Jun 18, 2026 •

edited

Loading