Skip to content

Allow dynamic modification of io-threads num#2033

Merged
madolson merged 1 commit into
valkey-io:unstablefrom
ayush933:dynamicThreads
May 28, 2025
Merged

Allow dynamic modification of io-threads num#2033
madolson merged 1 commit into
valkey-io:unstablefrom
ayush933:dynamicThreads

Conversation

@ayush933

@ayush933 ayush933 commented Apr 30, 2025

Copy link
Copy Markdown
Contributor

Item from #761

This PR has the following changes

  1. Bug fix where calling pthread_join() from main thread for an IO thread would hang indefinitely. This is because IOThreadMain() doesn't have a cancellation point.So pthread_cancel() from main thread is not honored.
    Can be reproed by calling shutdownIOThread() from the main thread for any active thread with empty job queue.
    Fixed by adding cancellation point in IOThreadMain().
  2. Makes io-threads config runtime modifiable.

@ayush933

Copy link
Copy Markdown
Contributor Author

UTs pending atm.

@codecov

codecov Bot commented Apr 30, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.29630% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 71.37%. Comparing base (af25a52) to head (3f54996).
⚠️ Report is 639 commits behind head on unstable.

Files with missing lines Patch % Lines
src/io_threads.c 96.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2033      +/-   ##
============================================
+ Coverage     71.02%   71.37%   +0.34%     
============================================
  Files           122      122              
  Lines         66171    66050     -121     
============================================
+ Hits          46999    47142     +143     
+ Misses        19172    18908     -264     
Files with missing lines Coverage Δ
src/config.c 78.52% <ø> (+0.12%) ⬆️
src/networking.c 87.44% <100.00%> (+0.26%) ⬆️
src/server.h 100.00% <ø> (ø)
src/io_threads.c 34.70% <96.00%> (+27.89%) ⬆️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ayush933 ayush933 marked this pull request as ready for review April 30, 2025 20:15

@madolson madolson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Just minor things, looking forward to the tests.

Comment thread src/io_threads.c Outdated
Comment thread src/io_threads.c Outdated
Comment thread src/io_threads.c Outdated
Comment thread src/io_threads.c Outdated
Comment thread src/config.c
@ayush933

ayush933 commented May 1, 2025

Copy link
Copy Markdown
Contributor Author

Hey @uriyage, just curious , can you please explain the safety concerns a bit more i.e. why we would want to drain all jobs and inactivate all threads instead of the min num required?

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 451df91 to cde1b5a Compare May 1, 2025 15:24
@uriyage

uriyage commented May 1, 2025

Copy link
Copy Markdown
Contributor

Hey @uriyage, just curious , can you please explain the safety concerns a bit more i.e. why we would want to drain all jobs and inactivate all threads instead of the min num required?

@ayush933 I don't recall the exact workflow, but I remember attempting something similar, and it affected the logic of dynamically adjusting the threads or the client cur_tid logic. I might be mistaken, but I think it will be cleaner and safer.

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 40b1d3f to c450698 Compare May 1, 2025 19:10
@ayush933 ayush933 requested review from madolson and uriyage May 1, 2025 19:13
Comment thread tests/unit/other.tcl
@ayush933 ayush933 requested a review from uriyage May 5, 2025 14:33
@ayush933

ayush933 commented May 8, 2025

Copy link
Copy Markdown
Contributor Author

Hey @uriyage @madolson , anything else needed to move this PR forward?

@madolson madolson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits, but still would like @uriyage to take sign off as well, he knows more about this code.

Comment thread src/server.h Outdated
Comment thread tests/unit/other.tcl Outdated
Comment thread tests/unit/other.tcl Outdated
Comment thread tests/unit/other.tcl Outdated
Comment thread src/io_threads.c Outdated
@uriyage

uriyage commented May 11, 2025

Copy link
Copy Markdown
Contributor

Just some nits, but still would like @uriyage to take sign off as well, he knows more about this code.

LGTM. I left one small comment about the log verbosity

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 500689b to 5698821 Compare May 11, 2025 15:16
Comment thread tests/unit/other.tcl
Comment thread src/server.h Outdated
@zuiderkwast zuiderkwast added release-notes This issue should get a line item in the release notes to-be-merged Almost ready to merge labels May 16, 2025
Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually approve. I didn't review the code (just looked briefly).

Bug fix where calling pthread_join() from main thread for an IO thread would hang indefinitely

What is the severity of this bug? Which scenario triggers it, from a user's perspective? Do we need to backport the bugfix? (If we need that, may need to split out the bugfix to a separate commit that we can backport, since this PRs is a new feature + a bugfix. We should not backport the new feature.)

@ayush933

Copy link
Copy Markdown
Contributor Author

Before this feature, the only time we called shutdownIoThread() was on server shutdown.
Considering this wasn't noticed until now, I don't think its too severe and needs a backport but lets wait for @uriyage's input on this.

@uriyage

uriyage commented May 19, 2025

Copy link
Copy Markdown
Contributor

Before this feature, the only time we called shutdownIoThread() was on server shutdown. Considering this wasn't noticed until now, I don't think its too severe and needs a backport but lets wait for @uriyage's input on this.

Currently, it is only used by doFastMemoryTest, which is called from printCrashReport after a crash occurs, in this case the threads are suspended by the ThreadsManager_runOnThreads so not an issue.

Comment thread src/config.c
@madolson madolson merged commit 4aaaf88 into valkey-io:unstable May 28, 2025
@madolson

Copy link
Copy Markdown
Member

Comment thread src/io_threads.c
chzhoo pushed a commit to chzhoo/valkey that referenced this pull request Jun 12, 2025
Item from valkey-io#761

This PR has the following changes

1. Bug fix where calling `pthread_join()` from main thread for an IO
thread would hang indefinitely. This is because `IOThreadMain()` doesn't
have a cancellation point.So `pthread_cancel()` from main thread is not
honored.
Can be reproed by calling `shutdownIOThread()` from the main thread for
any active thread with empty job queue.
 Fixed by adding cancellation point in `IOThreadMain()`.
2. Makes `io-threads` config runtime modifiable.

Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>
Signed-off-by: chzhoo <czawyx@163.com>
shanwan1 pushed a commit to shanwan1/valkey that referenced this pull request Jun 13, 2025
Item from valkey-io#761

This PR has the following changes

1. Bug fix where calling `pthread_join()` from main thread for an IO
thread would hang indefinitely. This is because `IOThreadMain()` doesn't
have a cancellation point.So `pthread_cancel()` from main thread is not
honored.
Can be reproed by calling `shutdownIOThread()` from the main thread for
any active thread with empty job queue.
 Fixed by adding cancellation point in `IOThreadMain()`.
2. Makes `io-threads` config runtime modifiable.

Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>
Signed-off-by: shanwan1 <shanwan1@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes to-be-merged Almost ready to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants