perf: skip empty memtable during query scoring #252

Merged
tjgreen42 merged 1 commit into main from optimize/memtable-search
Mar 3, 2026
@tjgreen42
Collaborator

Summary

  • One-line check: return NULL from tp_memtable_source_create when memtable->total_postings == 0
  • Skips the entire memtable scoring path (dshash_attach x2, hash_create, per-term dshash_find, dshash_detach x2)
  • Both single-term and multi-term BMW callers already handle NULL source gracefully
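
For reviewers, a minimal sketch of the shape of this check and of the caller-side NULL handling. It is not the actual diff: the `Sketch*` types and `sketch_*` functions are hypothetical stand-ins, and only the `total_postings == 0` test and the NULL return come from this PR.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the memtable; the real struct has more fields. */
typedef struct SketchMemtable {
    int64_t total_postings;   /* field name taken from the PR summary */
} SketchMemtable;

/* Hypothetical stand-in for the scoring source built over the memtable. */
typedef struct SketchSource {
    const SketchMemtable *memtable;
} SketchSource;

/* Returns NULL when the memtable has no postings, otherwise a new source. */
static SketchSource *
sketch_memtable_source_create(const SketchMemtable *memtable)
{
    assert(memtable != NULL);

    /* The one-line check: an empty memtable cannot contribute any
     * postings, so skip all setup work. */
    if (memtable->total_postings == 0)
        return NULL;

    /* In the real function, the expensive setup (attaching hash tables,
     * building the doc accumulation table) would happen from here on. */
    SketchSource *source = malloc(sizeof(*source));
    if (source == NULL)
        abort();              /* keep out-of-memory distinct from "empty" */
    source->memtable = memtable;
    return source;
}

/* Caller pattern: a NULL source means "no memtable contribution". */
static int64_t
sketch_score_memtable(const SketchMemtable *memtable)
{
    SketchSource *source = sketch_memtable_source_create(memtable);
    if (source == NULL)
        return 0;

    int64_t postings_seen = source->memtable->total_postings;
    free(source);
    return postings_seen;
}
```

The point of the sketch is that the early return sits before any allocation or attach, so a caller that already tolerates a NULL source needs no changes.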

Motivation

After CREATE INDEX, the memtable is empty -- all data lives in segments. Even so, every query still attached to two dshash tables, created a doc accumulation hash table, and did per-term hash lookups that always returned NULL. Profiling on MS-MARCO v2 (138M rows) showed tp_memtable_search at 16% of CPU.

Test plan

  • All 49 regression tests pass
  • CI passes
  • Benchmark on MS-MARCO v2 to measure latency improvement

When the memtable has no postings (total_postings == 0), return NULL
from tp_memtable_source_create to skip the entire memtable scoring
path. This avoids dshash_attach (x2), hash_create for doc accumulation,
per-term dshash_find lookups, and dshash_detach (x2) -- all of which
were pure overhead after CREATE INDEX since the memtable is empty.

Profiling showed tp_memtable_search at 16% of CPU on 138M rows where
every memtable lookup returned nothing.
tjgreen42 merged commit 0bbc8f8 into main on Mar 3, 2026
15 checks passed
tjgreen42 deleted the optimize/memtable-search branch on March 3, 2026 20:31