fix(cluster): Remove per-call srand in clusterManagerNodePrimaryRandom#3586
Conversation
This is only called in LRUTestMode, right? https://github.com/valkey-io/valkey/blob/unstable/src/valkey-cli.c#L9832-L9838 Overall, it would make sense to just srand once |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3586 +/- ##
============================================
- Coverage 76.66% 76.58% -0.08%
============================================
Files 160 160
Lines 80472 80469 -3
============================================
- Hits 61690 61625 -65
- Misses 18782 18844 +62
🚀 New features to boost your workflow:
|
|
Thanks for the feedback @murphyjacob4 . Addressed the issue and made a single srand call now. this should not have the DCO issue either. |
clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. Signed-off-by: Abhishek Mathur <matshek@amazon.com>
Move the PRNG seeding to a single srand(time(NULL) ^ getpid()) call in main(), removing redundant per-function srand() calls from clusterManagerOptimizeAntiAffinity(), pipeMode(), and LRUTestMode(). This ensures consistent entropy (pid mixing) across all code paths and prevents any future tight-loop reseeding bugs like the one fixed in clusterManagerNodePrimaryRandom(). Signed-off-by: Abhishek Mathur <matshek@amazon.com>
12f27a9 to
749f882
Compare
rainsupreme
left a comment
There was a problem hiding this comment.
LGTM! Good find on the bug :)
roshkhatri
left a comment
There was a problem hiding this comment.
LGTM,
Nit. can you please update the top comment and PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838), we can change it to this approach consolidated all srand calls into main()
valkey-io#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
valkey-io#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
valkey-io#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
valkey-io#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Backport of #3586 to the 9.1 branch. ## Problem `clusterManagerNodePrimaryRandom()` called `srand(time(NULL))` on every invocation, then immediately `rand() % primary_count`. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. The same issue existed in `clusterManagerOptimizeAntiAffinity`, `pipeMode`, and `LRUTestMode`. ## Solution Remove all per-call `srand()` invocations and consolidate into a single `srand(time(NULL) ^ getpid())` at the start of `main()`. This allows `rand()` to advance its state across calls, distributing uncovered slots randomly across available primaries. (cherry picked from commit 7e2a2f7) Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
#3586) clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node. Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries. --------- Signed-off-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Abhishek Mathur <matshek@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
clusterManagerNodePrimaryRandom() called srand(time(NULL)) on every invocation, then immediately rand() % primary_count. When called in a tight loop for uncovered slots, all calls within the same wall-clock second produce the identical seed, causing every uncovered slot to be assigned to the same primary node.
Remove the srand() call since the PRNG is already seeded at startup (srand(time(NULL) ^ getpid()) at line 9838). This allows rand() to advance its state across calls, distributing uncovered slots randomly across available primaries.