CPU scheduling for workloads#77595

Merged
serxa merged 58 commits into master from cpu-scheduler on Apr 6, 2025

Conversation

@serxa
Member

@serxa serxa commented Mar 13, 2025

This PR implements concurrency control on top of the resource scheduler and introduces a new type of RESOURCE and a new WORKLOAD setting to limit the number of concurrent threads for a specific workload.

Usage example:

CREATE RESOURCE cpu (MASTER THREAD, WORKER THREAD)
CREATE WORKLOAD all
CREATE WORKLOAD admin IN all SETTINGS max_concurrent_threads = 10
CREATE WORKLOAD production IN all SETTINGS max_concurrent_threads = 100
CREATE WORKLOAD analytics IN production SETTINGS max_concurrent_threads = 60, weight = 9
CREATE WORKLOAD ingestion IN production

This example configuration provides independent CPU slot pools for admin and production. The production pool is shared between analytics and ingestion. Furthermore, if the production pool is overloaded, 9 of every 10 released slots will be rescheduled to analytics queries if necessary, and ingestion queries will receive only 1 of every 10 slots during overload periods. This might improve the latency of user-facing queries. Analytics has its own limit of 60 concurrent threads, always leaving at least 40 threads to support ingestion. When there is no overload, ingestion can use all 100 threads.
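
The arithmetic behind these numbers can be checked with a small toy model. This is not the scheduler code from this PR. It is a minimal sketch that assumes slots are granted one at a time to the eligible child with the smallest weighted virtual time. The function names `grant_slots` and `production_children` and the demand figures are hypothetical and exist only for illustration.

```python
from fractions import Fraction


def grant_slots(pool_limit, children):
    """Grant up to pool_limit slots, one at a time, among competing children.

    children maps a workload name to a dict with keys:
      weight - relative share under contention
      limit  - the child's own max_concurrent_threads, or None if unset
      demand - number of threads its queries are asking for (hypothetical)
    """
    granted = {name: 0 for name in children}
    vtime = {name: Fraction(0) for name in children}
    for _ in range(pool_limit):
        # A child is eligible while it still wants slots and is under its own limit.
        eligible = [
            name for name, c in children.items()
            if granted[name] < c["demand"]
            and (c["limit"] is None or granted[name] < c["limit"])
        ]
        if not eligible:
            break
        # Pick the child that is furthest behind relative to its weight.
        name = min(eligible, key=lambda n: vtime[n])
        granted[name] += 1
        vtime[name] += Fraction(1, children[name]["weight"])
    return granted


def production_children(analytics_demand, ingestion_demand):
    """Children of the production workload from the usage example."""
    return {
        "analytics": {"weight": 9, "limit": 60, "demand": analytics_demand},
        "ingestion": {"weight": 1, "limit": None, "demand": ingestion_demand},
    }


# Overload: 10 released slots are split 9 to 1.
print(grant_slots(10, production_children(1000, 1000)))
# Full pool under overload: analytics stops at its limit of 60, ingestion gets 40.
print(grant_slots(100, production_children(1000, 1000)))
# No analytics load: ingestion can use all 100 threads.
print(grant_slots(100, production_children(0, 1000)))
```

The model ignores slot release timing, the MASTER THREAD and WORKER THREAD distinction, and preemption, so it only illustrates the limits and the 9:1 ratio described above.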

WARNING: Slot scheduling provides a way to control query concurrency but does not guarantee fair CPU time allocation yet. Fair allocation requires further development of CPU slot preemption and will be supported in a follow-up PR.

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added CPU slot scheduling for workloads, see https://clickhouse.com/docs/operations/workload-scheduling#cpu_scheduling for details

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 13, 2025

Workflow [PR], commit [cb60d51]

@clickhouse-gh clickhouse-gh bot added the pr-feature Pull request with new product feature label Mar 13, 2025
@serxa serxa marked this pull request as ready for review March 20, 2025 13:49
Co-authored-by: Bharat Nallan <bharatnc@gmail.com>
@alexkats alexkats self-assigned this Mar 20, 2025
@alexey-milovidov alexey-milovidov mentioned this pull request Mar 21, 2025
76 tasks
@serxa
Member Author

serxa commented Mar 27, 2025

Stateless tests (tsan, s3 storage, 3/3) — Server died, fail: 13, passed: 692, skipped: 13

#77600

@serxa
Member Author

serxa commented Mar 27, 2025

Stateless tests (debug, s3 storage) — fail: 1, passed: 7459, skipped: 326

#78366

@serxa serxa mentioned this pull request Mar 28, 2025
1 task
Contributor

@alexkats alexkats left a comment

LGTM, thanks

@serxa serxa enabled auto-merge April 6, 2025 16:01
@serxa serxa added this pull request to the merge queue Apr 6, 2025
Merged via the queue into master with commit fc16f72 Apr 6, 2025
115 of 120 checks passed
@serxa serxa deleted the cpu-scheduler branch April 6, 2025 18:41
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 6, 2025
@serxa serxa mentioned this pull request Aug 3, 2025
29 tasks
Labels

pr-feature: Pull request with new product feature
pr-synced-to-cloud: The PR is synced to the cloud repo

4 participants