Skip to content

Cumulative metrics for active I/O threads usage#2463

Merged
zuiderkwast merged 15 commits into
valkey-io:unstablefrom
deepakrn:unstable
Jan 27, 2026
Merged

Cumulative metrics for active I/O threads usage#2463
zuiderkwast merged 15 commits into
valkey-io:unstablefrom
deepakrn:unstable

Conversation

@deepakrn

@deepakrn deepakrn commented Aug 10, 2025

Copy link
Copy Markdown
Contributor

I/O threads when active do busy polling for work. This puts the CPU time near 100% per thread even when they aren't fully utilized. Tracking the time an IO thread spends doing useful work vs waiting for work helps understand the utilization of individual threads.

INFO fields:

used_active_time_io_thread_%d

... where %d is the IO thread ID number (between 1 and the number of I/O threads).

To get a percentage over a period of time, compare the delta of this metric against a delta of the server's uptime or the INFO field server_time_usec.

@deepakrn

Copy link
Copy Markdown
Contributor Author

This change aims to address #2065 except for tracking the main_thread utilization.

@madolson madolson requested a review from ranshid August 12, 2025 19:45
@codecov

codecov Bot commented Aug 12, 2025

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.77%. Comparing base (0b1f99d) to head (b6822b1).
⚠️ Report is 20 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2463      +/-   ##
============================================
+ Coverage     74.33%   74.77%   +0.44%     
============================================
  Files           129      129              
  Lines         71009    71137     +128     
============================================
+ Hits          52784    53193     +409     
+ Misses        18225    17944     -281     
Files with missing lines Coverage Δ
src/io_threads.c 81.98% <100.00%> (+46.41%) ⬆️
src/server.c 89.52% <100.00%> (+0.04%) ⬆️

... and 38 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@zuiderkwast

zuiderkwast commented Sep 5, 2025

Copy link
Copy Markdown
Contributor

@deepakrn Good, I hope we can also find a way to track main thread usefulness.

Can you link the PR to the issue where this was discussed?

@deepakrn deepakrn force-pushed the unstable branch 2 times, most recently from a6ed7ed to 19154c6 Compare September 15, 2025 01:52
Comment thread src/io_threads.c Outdated
Comment thread src/io_threads.h Outdated
Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>
Comment thread src/io_threads.h Outdated
@madolson madolson moved this to Todo in Valkey 9.1 Nov 10, 2025
@dvkashapov

Copy link
Copy Markdown
Member

Do we have any concerns against this approach? The only concern for me is that we would print times for all io-threads, should we cap number of printed thread times to 16 for example?

@deepakrn

Copy link
Copy Markdown
Contributor Author

Thanks for the question @dvkashapov.

See #2065 (comment) for relevant discussion. Since setups with thread count > 16 are rare, we thought we could start without that optimization.

@zuiderkwast

zuiderkwast commented Jan 13, 2026

Copy link
Copy Markdown
Contributor

@deepakrn Finally, we have made a decision about these INFO fields. Sorry for the delay. What we decided is written in the issue #2065:

  • Use field names used_cpu_active_io_thread_1, _2, etc.
  • Don't introduce the thread uptime field. (Users can just compare the active CPU time between two different INFO calls.)
  • Make sure that we don't reset the number to zero if the number of IO threads decrease and then increase again when the configured number of I/O threads is changed during runtime. (We should add a test case for this scenario.)

Do you want to update this PR to match the above?

Now, we also have the implementation for the main thread CPU utilization #2931 by Daniil, so I'm excited to get both of these PRs released in 9.1.

@deepakrn

Copy link
Copy Markdown
Contributor Author

Hi @zuiderkwast , Thanks for the update and the idea makes sense to me. I'll work on making these changes.

@deepakrn

Copy link
Copy Markdown
Contributor Author

@zuiderkwast - couple of follow ups.

Don't introduce the thread uptime field. (Users can just compare the active CPU time between two different INFO calls.)

If we are getting rid of thread uptime field, there wouldn't be an easy way for the user to know if the thread was running but was busy looping waiting for work? Wouldn't that be a gap in observability? The user could still track total time elapsed and look at delta of used_active_cpu_time but that wouldn't tell for how long the IO thread was enabled in that duration?

Make sure that we don't reset the number to zero if the number of IO threads decrease and then increase again when the configured number of I/O threads is changed during runtime. (We should add a test case for this scenario.)

What shouldn't be reset to zero? Are you referring to the active cpu time? If yes, isn't that already the case? Also, I am assuming that INFO should display the active CPU time of all IO threads with value > 0?

@dvkashapov

Copy link
Copy Markdown
Member

The user could still track total time elapsed and look at delta of used_active_cpu_time but that wouldn't tell for how long the IO thread was enabled in that duration?

From relevant issue:

Even without it, the caller can compare the the delta between two points in time and use this delta to compare the I/O thread's active time delta between two INFO calls. To calculate the exact time between two calls to INFO, the delta of the INFO field for the server's uptime can be used.

What shouldn't be reset to zero? Are you referring to the active cpu time? If yes, isn't that already the case?

Looks like you're right about not resetting cpu time currently. The case we want to make sure works is not resetting metric when io thread becomes active->not active->active again.

@zuiderkwast

Copy link
Copy Markdown
Contributor

If we are getting rid of thread uptime field, there wouldn't be an easy way for the user to know if the thread was running but was busy looping waiting for work? Wouldn't that be a gap in observability? The user could still track total time elapsed and look at delta of used_active_cpu_time but that wouldn't tell for how long the IO thread was enabled in that duration?

I don't think it's important to know whether the thread was disabled or just waiting during some time.

To get a percentage between two points in time, you can for example take the delta of the active CPU time for an IO thread and divide by the delta of the server_time_usec INFO field.

What shouldn't be reset to zero? Are you referring to the active cpu time? If yes, isn't that already the case? Also, I am assuming that INFO should display the active CPU time of all IO threads with value > 0?

I believe that's already the case yes, but I'm not completely sure. I haven't verified it. If it's not too much work to write a test case for this, it would be nice to have.

INFO should display the active CPU time of the IO threads that are currently configured in the config io-threads config, even if some of them report zero. If we have had more I/O threads configured in the past, these threads will not run and will not change their CPU time. We don't need to report them in INFO every time. A server can run for years.

Comment thread src/io_threads.c Outdated
@zuiderkwast

zuiderkwast commented Jan 15, 2026

Copy link
Copy Markdown
Contributor

Update.

I have been discussing back and forth with various people weather these metrics should measure the CPU-time during active work or just monotonic time of active work.

Currently, the preference is to use the monotonic clock, not the CPU-time clock. The reason is that a low CPU usage doesn't necessarily mean that the thread has spare capacity to handle more work.

If we instead measure monotonic time and see that 90% of the time, the thread is doing work, it means that the remaining 10%, the thread is waiting for more work. This is better information, because the purpose of these metrics is to be able to tell how much more work the I/O threads can handle.

For a longer expanation, see this #2931 (comment).

@deepakrn do you still think CPU-time is better? If there are any more aspects to consider, I'm interested so we can be sure we make the right decision.

If we decide to use the monotonic clock, we shouldn't include "cpu" in the field names, because it can be misleading. We can rename them to used_active_time_io_thread_1.

@deepakrn

deepakrn commented Jan 16, 2026

Copy link
Copy Markdown
Contributor Author

@zuiderkwast - Clarifying my thought process below:

If we do monotonic clock, we won't have visibility into whether the thread was context switched or was actually doing work in the Valkey process.

If we do CPU clock, it tells us the amount of time that was spent running the process excluding blocking syscalls, preemptions etc.

CPU clock time < monotonic clock time < total time

Understanding the metric with an example:

If (CPU_clock_time_delta/total_time) for a io_thread is 80%, it means 80% of the time was spent doing real work in Valkey and 20% comprises - context_swtich, blocking syscall, busy polling.

if (monotic_time_delta/total_time) for a io_thread is 80%, it means 80% comprises of (context_swith + blocking syscall + active work) and 20% comprises primarily of (busy polling + some more context switch potentially).

So, (100 - monotonic_time_delta/total_time) could be closer to the time spent busy polling (and hence potential free capacity) than (100 - cpu_time_delta/total_time).

If my understanding is correct, it seems like monotonic_clock would better tell us about free capacity than cpu_time. WDYT?

@zuiderkwast

Copy link
Copy Markdown
Contributor

Yes, exactly, good explanation. We came to the same conclusion.

  • Active time = time trying to work = (cpu-time + context switch + blocking syscalls)
  • Not active time = time waiting for work

So let's change to getMonotonicUs and the name used_active_time_io_thread_%d. When you have time.

Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>
Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to merge this and just put some minor comments, but I realize now that we don't stop the counter when an IO thread is inactivated. We need to fix that.

While the thread is inactivated, it hangs in pthread_mutex_lock, waiting for the mutex. We need to exclude that from the active time. I don't have an exact idea about how we can do it.

(after thinking)

  1. Maybe it's enough to just set work_start_time = getMonotonicUs(); after pthread_mutex_unlock. Idea 2 is better.

  2. Another idea, if we want to make sure the clock doesn't tick at all when the thread has no work, we can add a condition on the number of jobs where we increment the active time, like

        if (jobs_to_process != 0) {
            atomic_fetch_add_explicit(&used_active_time_io_thread[id],
                                      work_start_time - prev_work_start_time,
                                      memory_order_relaxed);
        }

Intuitively, I think this seems better and Ithen we can also simplify some other code. We don't need to reset the work_start_time in multiple places, for example.

Comment thread src/io_threads.c Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Comment thread src/io_threads.c Outdated
@deepakrn deepakrn force-pushed the unstable branch 2 times, most recently from 71cfbdf to 030e68a Compare January 25, 2026 04:03
Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the behavior seems very good now. It was a long way to get here. Thanks for your patience!

I have only a few minor style suggestions for the test code.

Comment thread tests/unit/io-threads.tcl Outdated
Comment thread tests/unit/io-threads.tcl Outdated
Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>
@zuiderkwast zuiderkwast added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jan 26, 2026
@zuiderkwast zuiderkwast changed the title Instrumenting active and useful CPU time metric for IO threads Instrumenting used active time metric for I/O threads Jan 26, 2026
@github-actions github-actions Bot removed the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jan 26, 2026
@zuiderkwast zuiderkwast changed the title Instrumenting used active time metric for I/O threads Instrumenting used active time metrics for I/O threads Jan 27, 2026
@zuiderkwast zuiderkwast merged commit 4ad7628 into valkey-io:unstable Jan 27, 2026
64 of 65 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in Valkey 9.1 Jan 27, 2026
@zuiderkwast zuiderkwast added release-notes This issue should get a line item in the release notes needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. labels Jan 27, 2026
@zuiderkwast zuiderkwast changed the title Instrumenting used active time metrics for I/O threads Cumulative metrics for active I/O threads usage Jan 27, 2026
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
I/O threads when active do busy polling for work. This puts the CPU time
near 100% per thread even when they aren't fully utilized. Tracking the
time an IO thread spends doing useful work vs waiting for work helps
understand the utilization of individual threads.

INFO fields:

    used_active_time_io_thread_%d

... where %d is the IO thread ID number (between 1 and the number of I/O
threads).

To get a percentage over a period of time, compare the delta of this
metric against a delta of the server's uptime or the INFO field
`server_time_usec`.

---------

Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>
hpatro pushed a commit to hpatro/valkey that referenced this pull request Mar 5, 2026
I/O threads when active do busy polling for work. This puts the CPU time
near 100% per thread even when they aren't fully utilized. Tracking the
time an IO thread spends doing useful work vs waiting for work helps
understand the utilization of individual threads.

INFO fields:

    used_active_time_io_thread_%d

... where %d is the IO thread ID number (between 1 and the number of I/O
threads).

To get a percentage over a period of time, compare the delta of this
metric against a delta of the server's uptime or the INFO field
`server_time_usec`.

---------

Signed-off-by: Deepak Nandihalli <deepak.nandihalli@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
zuiderkwast added a commit to valkey-io/valkey-doc that referenced this pull request May 8, 2026
Documents the following features:

* valkey-io/valkey#2931
* valkey-io/valkey#2463

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. release-notes This issue should get a line item in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants