-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Fix resize hash table dictionary iterator #12660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@oranagra I've removed the assertions around rehashing (it's causing the flakiness). The remaining validation still covers the expiration skip logic if there are too many empty buckets and then upon resize it gets cleaned up. |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels a bit odd that this test talks so much about rehashing, but doesn't validate any of that.
however, i suppose it still covers it's purpose, and any rehashing checks can add coupling with dict internals.
it looks like the test doesn't depend on rehashing, but rather the luck of rehashing, so maybe that's ok.
i'm gonna merge it now since the CI failures are painful. we can iterate over it in a follow up PR if we want.
|
looks like this test still has issues: |
|
still fails with valgrind, even with my increased timeout. |
|
@hpatro did you look into the new failures of the defrag tests? |
Yes, I'm looking at it. |
|
@madolson Could you please trigger a daily run on this branch ? |
|
Thanks @madolson for the re-run. Recent most run with increased timeout: Valgrind test also succeeded. https://github.com/redis/redis/actions/runs/6567848801 @oranagra Please have a look. |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it wasn't just a test issue..
please have a look to make sure other places with similar pattern don't have this issue.
thanks,
|
i see this test still fails in freebsd CI (which is notoriously slow) https://github.com/redis/redis/actions/runs/6607001072/job/17943822780#step:4:8901 |
|
@oranagra Taking a look. |
|
@oranagra The thought which I've from reading the code is https://github.com/redis/redis/blob/unstable/tests/unit/expire.tcl#L858-L861 If the above delete operation(s) takes longer than 500ms (TTL set on the keys), empty buckets wouldn't have been formed and the large no. of empty bucket skip logic won't be valid and all the keys would get expired as part of the expiry scan. Does freebsd runs on smaller/slower instance due to some reason? |
|
i see the test sometimes fails on valgrind |
|
@oranagra Let me take a look if there are any other potential issue or else will submit a PR with an increased wait. |
|
FYI: it keeps failing on valgrind |
Dictionary iterator logic in the `tryResizeHashTables` method is picking the next (incorrect) dictionary while the cursor is at a given slot. This could lead to some dictionary/slot getting skipped from resizing. Also stabilize the test. problem introduced recently in redis#11695
Dictionary iterator logic in the
tryResizeHashTablesmethod is picking the next (incorrect) dictionary while the cursor is at a given slot. This could lead to some dictionary/slot getting skipped from resizing.Saw failure in recent run: d27c741
On freebsd:
https://github.com/redis/redis/actions/runs/6540680776/job/17761013478
https://pipelinesghubeus22.actions.githubusercontent.com/2oDd4EuUudJqGKlOAB2KXZpKHTtseqbUa63unZUQGqUSgXNthI/_apis/pipelines/1/runs/40953/signedlogcontent/14?urlExpires=2023-10-17T02%3A07%3A01.6196538Z&urlSigningMethod=HMACV1&urlSignature=yOYvJogzR%2FbpT02XI2vr2Zc0DGBprgft3zKjtIIpfAk%3D
problem introduced recently in #11695