hashtable: scan no-duplicate guarantee and add HasPassedKey API#3803
Conversation
Document the no-duplicate guarantee for scan when rehashing is not active: entries are never emitted more than once, entries present throughout the scan are guaranteed to be returned, and entries created/destroyed during scan are never returned twice. Add hashtableScanHasPassedKey(ht, key, cursor) which determines whether a key's bucket has already been visited by the scan cursor. Uses reverse-bit-increment ordering to compare bucket positions. The key does not need to exist in the table. When rehashing is active, the result is best-effort. Add comprehensive unit tests: - Full scan no-duplicates on static table - No-duplicates with deletions mid-scan - No-duplicates with insertions mid-scan - Fuzz test with random seed, 50 iterations, random ops - HasPassedKey correctness verification against actual emission - HasPassedKey with cursor=0 (scan complete) - HasPassedKey for non-existent keys - HasPassedKey fuzz with random seed - HasPassedKey graceful behavior during active rehash Signed-off-by: Rain Valentine <rainval@amazon.com>
Remove redundant '(or not active)' parenthetical -- if rehashing isn't active there's nothing to pause, so 'when rehashing is paused' covers both cases from the caller's perspective. Signed-off-by: Rain Valentine <rainval@amazon.com>
Consistent with all other hashtable predicate functions. Signed-off-by: Rain Valentine <rainval@amazon.com>
The 'doesn't crash' check adds nothing beyond what the fuzz tests already cover. Signed-off-by: Rain Valentine <rainval@amazon.com>
Signed-off-by: Rain Valentine <rainval@amazon.com>
Avoid exposing internal bucket concept in the public API documentation. Signed-off-by: Rain Valentine <rainval@amazon.com>
📝 WalkthroughWalkthroughAdds hashtableScanHasPassedKey(), tightens hashtableScan guarantees for scans where shrinking is blocked, implements the helper, and adds unit tests validating scan uniqueness, cursor semantics, mutation interactions, and fuzz scenarios. ChangesHashtable scan cursor API
🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #3803 +/- ##
============================================
+ Coverage 76.62% 76.74% +0.12%
============================================
Files 162 162
Lines 80674 80939 +265
============================================
+ Hits 61813 62115 +302
+ Misses 18861 18824 -37
🚀 New features to boost your workflow:
|
Signed-off-by: Rain Valentine <rainval@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/hashtable.c (1)
2048-2048: ⚡ Quick winUse a boolean literal in the bool return path
Line 2048 returns
1from aboolAPI. Prefertruefor consistency with the C boolean convention in this repo.Proposed patch
bool hashtableScanHasPassedKey(hashtable *ht, const void *key, size_t cursor) { - if (cursor == 0) return 1; + if (cursor == 0) return true; size_t mask = expToMask(ht->bucket_exp[0]);As per coding guidelines, "Use the boolean type for true/false values in C code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/hashtable.c` at line 2048, The function currently returns the integer literal 1 for a bool API; replace that integer literal with the boolean literal true so the return path uses the bool type consistently (change the "return 1;" after the cursor check to "return true;"), referencing the same cursor variable and return statement in this function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/unit/test_hashtable.cpp`:
- Around line 1418-1420: The test setup loops call hashtableAdd(ht, (void *)j)
but ignore its return value, so failed inserts can silently break test
preconditions; update each setup loop (where hashtableAdd is used with ht and
(void*)j) to check the return and fail the test on error (e.g., replace the bare
call with an assertion that hashtableAdd returned success such as
ASSERT_TRUE/EXPECT_TRUE or equivalent test-framework failure) so any insertion
failure aborts the test and makes the precondition explicit.
- Around line 1413-1434: Wrap each test's global state changes and heap
allocations in RAII-style guards so cleanup always runs even if ASSERT_* aborts:
create a small scoped guard that calls
hashtableSetResizePolicy(HASHTABLE_RESIZE_FORBID) in its constructor and
restores hashtableSetResizePolicy(HASHTABLE_RESIZE_ALLOW) in its destructor, and
use another scoped guard that takes the hashtable* (ht) and calls
hashtableRelease(ht) in its destructor; replace the manual tail reset/release in
the test blocks (which call hashtableSetResizePolicy and hashtableRelease) with
these scope guards around the code in functions that include
hashtableSetResizePolicy, hashtableCreate, and hashtableRelease (e.g., the test
around hashtableSetResizePolicy/HASHTABLE_RESIZE_FORBID, hashtableCreate,
hashtableRelease/hashtableScan), and apply the same pattern to the other test
ranges noted (1438-1464, 1468-1497, 1505-1548, 1555-1593, 1614-1635, 1643-1687).
---
Nitpick comments:
In `@src/hashtable.c`:
- Line 2048: The function currently returns the integer literal 1 for a bool
API; replace that integer literal with the boolean literal true so the return
path uses the bool type consistently (change the "return 1;" after the cursor
check to "return true;"), referencing the same cursor variable and return
statement in this function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 35e763d5-9662-410c-9320-89f0a94d1fc5
📒 Files selected for processing (3)
src/hashtable.csrc/hashtable.hsrc/unit/test_hashtable.cpp
Signed-off-by: Rain Valentine <rainval@amazon.com>
| * - An entry that is created, destroyed, or re-created during the scan may or | ||
| * may not be returned, but will never be returned more than once. |
There was a problem hiding this comment.
Is it possible that key that is deleted and other key taking its spot can return more than once upon re-creation https://github.com/valkey-io/valkey/blob/unstable/src/hashtable.c#L1080-L1082 as we find lowest unused position?
There was a problem hiding this comment.
No. 🙂 As long as rehashing was paused before the scan started and remains paused until the scan finishes, this cannot occur.
When rehashing incrementally the hashtable typically doubles or halves in size so 2 indices in one table correspond to 1 index in the other table, or vice versa. Scan will use the index increment of the table with larger buckets, so it would scan 2 indices of the larger table and 1 index of the smaller table in one scan batch, then stop. Removing an item and adding a new one with the same key can only move it between that group of 3 indices between the two tables, so there is no situation where a key can jump over the cursor position. (This is because our hashtable resolves collisions with chaining, not open indexing.)
The reason we have to pause rehashing is that when we're not rehashing it goes one table index at a time. If rehashing begins between scan calls then the cursor might be left between the two indices of the larger table - i.e. in the middle of one of these groups of three, in which case when we continue scanning we can't be strictly sure which in that group have been covered and which haven't. (We don't actually have to pause rehashing - the key thing to avoid is to not start a rehash! But "pause rehashing" is a much simpler contract to work with.)
There was a problem hiding this comment.
Ah, somehow missed you were talking about filling in holes in the bucket chain. That isn't an issue because scan goes index-by-index, i.e. one bucket chain at a time. The code you're looking at does mean that an item could "move" to a different position within a bucket chain, but it will still be at the same table index. (or in the same group of 3 if a rehash is in progress, as I mentioned before.)
There was a problem hiding this comment.
Thanks @rainsupreme explanation makes sense. I was thinking about movement within the same bucket chain from the generic hashtable API perspective, but I agree that this is not how the current code paths use the scan callback, and your explanation covers the table index/cursor guarantee being documented here.
Cursor 0 is ambiguous (start or end of scan), but the caller always
knows which state they're in. Returning false ('not started') is the
safer default -- the caller handles 'scan complete' themselves.
Signed-off-by: Rain Valentine <rainval@amazon.com>
- Extract populateSequential helper (DRY) - Remove redundant hashtableExpand calls (no-op with FORBID) - Remove redundant ASSERT_FALSE(hashtableIsRehashing) assertions - Simplify deletion pattern to random per-step - Fix fuzz delete to target full range including new inserts - Remove manual resize policy restore (SetUp handles it) Signed-off-by: Rain Valentine <rainval@amazon.com>
The no-duplicate guarantee requires that no shrink is initiated during the scan. This is already ensured by hashtablePauseRehashing (called internally by scan) which also pauses auto-shrink. Expansion and rehashing progress are safe: the scan cursor step size is based on the smallest table and remains valid when a larger table is introduced. Only a shrink is problematic because it introduces a smaller table whose bucket boundaries don't align with the existing cursor. Add a test verifying that shrink is deferred during a scan callback even when the fill factor drops below the threshold. Signed-off-by: Rain Valentine <rainval@amazon.com>
Tests now match the documented contract precisely: block shrinking (not all resizing) for the duration of the scan. Expansion and rehashing are allowed to occur naturally during the tests. Signed-off-by: Rain Valentine <rainval@amazon.com>
Set a good example for anyone using these tests as reference for the pause/resume pattern. Signed-off-by: Rain Valentine <rainval@amazon.com>
- Add completeRehashing helper; ensure table is static before scan starts - Move PauseAutoShrink/ResumeAutoShrink to tightly scope the scan loop - Replace scan_has_passed_key_nonexistent with scan_has_passed_key_deleted: scan partway, pick one emitted and one non-emitted key, delete both, verify HasPassedKey still reflects their original positions - Fix expand test: single initial scan step (no dead cursor==0 branch), use larger table (5000) to avoid premature scan completion, drive rehash to completion during scan and assert it finished Signed-off-by: Rain Valentine <rainval@amazon.com>
Simplify the doc to one clear statement: exact when shrinking is blocked, best-effort if a shrink has been initiated since the cursor was obtained. Removes the confusing 'rehashing active' vs 'shrinking blocked' apparent contradiction. Also use EXPECT_TRUE in populateSequential helper (correct gtest pattern for non-test functions). Signed-off-by: Rain Valentine <rainval@amazon.com>
Assert that rehashing finishes while the scan loop is still in progress, proving the full expand lifecycle (start, progress, complete) is safe mid-scan. Signed-off-by: Rain Valentine <rainval@amazon.com>
476ebe0 to
d91db1e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/unit/test_hashtable.cpp`:
- Around line 1419-1421: The while loops that wait for rehash (using
hashtableIsRehashing and hashtableRehashMicroseconds) are unbounded and must be
guarded: introduce a fail-fast counter/timeout (e.g. max_iters or remaining_us)
initialized before the loop, decrement it on each iteration or subtract the
waited microseconds, and assert/fail with a clear message if the counter reaches
zero so the test fails fast and shows the rehash did not complete; apply this
change to the loop using hashtableIsRehashing/hashtableRehashMicroseconds at
both locations referenced (around lines ~1419 and ~1738) so the test cannot hang
indefinitely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f3113ab0-a5f6-4859-a1da-ad0bbfc85cd5
📒 Files selected for processing (2)
src/hashtable.csrc/unit/test_hashtable.cpp
| while (hashtableIsRehashing(ht)) { | ||
| hashtableRehashMicroseconds(ht, 1000); | ||
| } |
There was a problem hiding this comment.
Add fail-fast bounds to rehash wait loops.
Both loops are unbounded and can hang the test process indefinitely if rehash never progresses/starts. Add a guard counter with assertions so failures are diagnosable instead of timing out CI.
Proposed fix
static void completeRehashing(hashtable *ht) {
+ size_t guard = 0;
while (hashtableIsRehashing(ht)) {
hashtableRehashMicroseconds(ht, 1000);
+ ASSERT_LT(++guard, 100000u) << "Rehash did not complete";
}
}
...
/* Trigger expansion by adding entries until rehashing starts. */
long next = count + 1;
+ size_t start_rehash_guard = 0;
while (!hashtableIsRehashing(ht)) {
- hashtableAdd(ht, (void *)next++);
+ ASSERT_TRUE(hashtableAdd(ht, (void *)next++));
+ ASSERT_LT(++start_rehash_guard, 100000u)
+ << "Expansion rehash did not start";
}Also applies to: 1738-1740
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/unit/test_hashtable.cpp` around lines 1419 - 1421, The while loops that
wait for rehash (using hashtableIsRehashing and hashtableRehashMicroseconds) are
unbounded and must be guarded: introduce a fail-fast counter/timeout (e.g.
max_iters or remaining_us) initialized before the loop, decrement it on each
iteration or subtract the waited microseconds, and assert/fail with a clear
message if the counter reaches zero so the test fails fast and shows the rehash
did not complete; apply this change to the loop using
hashtableIsRehashing/hashtableRehashMicroseconds at both locations referenced
(around lines ~1419 and ~1738) so the test cannot hang indefinitely.
Description:
When shrinking is blocked for the duration of a scan (via hashtablePauseAutoShrink), the hashtable scan guarantees no entry is emitted more than once. Expansion and rehashing may occur freely — only a shrink is problematic because it introduces a smaller table whose bucket boundaries might not align with the existing cursor position. This was already true but undocumented.
New API:
hashtableScanHasPassedKey(ht, key, cursor)Given a scan cursor, determines whether a key's hashtable position has already been visited. This enables callers to definitively answer: "would this key have been returned already?" — useful for background iteration that needs to track which keys have been seen without maintaining a separate visited-set. The key does not need to exist in the table (position is computed from its hash). The result is exact when shrinking is blocked; best-effort otherwise.
Motivated by #2878 (forkless save, to be used by bgiterator).
Tests: 10 unit tests including fuzz tests with random seeds, a test proving expansion + full rehash completion mid-scan produces no duplicates, and a test verifying shrink is correctly deferred during scan callbacks.