flow-control: fix compaction slowdown caused by increased flow-control thresholds. #18710

Merged
ti-chi-bot[bot] merged 5 commits into tikv:master from hhwyt:decouple-l0-file-threshold
Sep 22, 2025

@hhwyt (Contributor) commented Jul 15, 2025

What is changed and how it works?

Issue Number: Close #18708

What's Changed:

This PR addresses performance stability issues caused by increasing
storage.flow-control.l0-file-threshold and
storage.flow-control.soft-pending-compaction-bytes-limit. Previously,
raising these values could reduce the effectiveness of RocksDB’s compaction
speed-up mechanism, because the RocksDB internal thresholds
(level0-slowdown-writes-trigger and soft-pending-compaction-bytes-limit)
would be overridden, delaying compaction acceleration.

Key improvements:
1. Conditional override of RocksDB thresholds:
  - level0-slowdown-writes-trigger is overridden by l0-file-threshold
only if it is larger than l0-file-threshold.
  - soft-pending-compaction-bytes-limit is overridden only if it is
larger than storage.flow-control.soft-pending-compaction-bytes-limit.
This ensures that increasing flow-control settings does not weaken
compaction acceleration, while user-configured RocksDB thresholds that
are larger than the flow-control limits are still overridden, allowing
compaction speed-up to trigger before write flow control.
2. Updated write stall check:
  - ingest_maybe_slowdown_writes now uses level0-stop-writes-trigger
instead of level0-slowdown-writes-trigger to determine whether ingest
may trigger a write stall.
  - This keeps the original behavior, since `l0-file-threshold` overrides
`level0-stop-writes-trigger`, just like the previous behavior with
`level0-slowdown-writes-trigger`. Ideally, flow-control settings would
be used directly to determine write stalls, but
`ingest_maybe_slowdown_writes` cannot access the flow-control module
configuration because this function resides inside the Engine module.
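
The conditional override in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FlowControlConfig`, `CfOptions`, and `apply_flow_control_overrides` are hypothetical, simplified stand-ins, and only the rule itself ("lower the RocksDB threshold only when it is larger than the flow-control threshold") comes from the description above.

```rust
/// Hypothetical, simplified stand-in for the flow-control settings.
#[derive(Debug, Clone, PartialEq)]
struct FlowControlConfig {
    enable: bool,
    l0_files_threshold: u64,
    soft_pending_compaction_bytes_limit: u64,
}

/// Hypothetical, simplified stand-in for the per-CF RocksDB options.
#[derive(Debug, Clone, PartialEq)]
struct CfOptions {
    level0_slowdown_writes_trigger: i32,
    soft_pending_compaction_bytes_limit: u64,
}

/// Lower the RocksDB thresholds to the flow-control thresholds only when the
/// RocksDB values are larger. Smaller RocksDB values are left untouched, so
/// raising the flow-control thresholds does not delay compaction speed-up.
fn apply_flow_control_overrides(cf: &mut CfOptions, fc: &FlowControlConfig) {
    if !fc.enable {
        return;
    }
    // Saturate instead of wrapping when the threshold does not fit in i32.
    let l0_threshold = fc.l0_files_threshold.min(i32::MAX as u64) as i32;
    if cf.level0_slowdown_writes_trigger > l0_threshold {
        cf.level0_slowdown_writes_trigger = l0_threshold;
    }
    if cf.soft_pending_compaction_bytes_limit > fc.soft_pending_compaction_bytes_limit {
        cf.soft_pending_compaction_bytes_limit = fc.soft_pending_compaction_bytes_limit;
    }
}
```

With a flow-control `l0_files_threshold` of 50, a RocksDB trigger of 20 stays at 20, while a RocksDB trigger of 80 is lowered to 50.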

After this change, write control effectively has three stages:
1. Compaction acceleration: triggered when RocksDB thresholds are reached.
2. Flow control: triggered at storage.flow-control.l0-file-threshold and
storage.flow-control.soft-pending-compaction-bytes-limit.
3. Stop writes: triggered at storage.flow-control.hard-pending-compaction-bytes-limit.
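
As a rough mental model of the three stages, the sketch below classifies a column family's state from its L0 file count and pending compaction bytes. Everything here is illustrative: the `Thresholds` struct, the `classify` function, and the use of `>=` comparisons are assumptions, and the real flow controller throttles gradually rather than switching between discrete states.

```rust
/// Hypothetical labels for the stages listed above.
#[derive(Debug, PartialEq, Eq)]
enum WriteControlStage {
    Normal,
    CompactionAcceleration,
    FlowControl,
    StopWrites,
}

/// Hypothetical bundle of the thresholds named in this PR.
struct Thresholds {
    rocksdb_l0_slowdown_trigger: u64,
    rocksdb_soft_pending_bytes: u64,
    fc_l0_files_threshold: u64,
    fc_soft_pending_bytes: u64,
    fc_hard_pending_bytes: u64,
}

/// Check the most severe stage first and fall through to the mildest one.
fn classify(t: &Thresholds, l0_files: u64, pending_bytes: u64) -> WriteControlStage {
    if pending_bytes >= t.fc_hard_pending_bytes {
        WriteControlStage::StopWrites
    } else if l0_files >= t.fc_l0_files_threshold || pending_bytes >= t.fc_soft_pending_bytes {
        WriteControlStage::FlowControl
    } else if l0_files >= t.rocksdb_l0_slowdown_trigger
        || pending_bytes >= t.rocksdb_soft_pending_bytes
    {
        WriteControlStage::CompactionAcceleration
    } else {
        WriteControlStage::Normal
    }
}
```

Because the RocksDB thresholds are never larger than the flow-control thresholds after this change, the compaction-acceleration stage is reached no later than flow control.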

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
    TODO
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

When storage.flow-control.enable is set to true, rocksdb.level0_slowdown_writes_trigger is overridden to storage.flow-control.l0-file-threshold only when it is larger than l0-file-threshold, and rocksdb.soft-pending-compaction-bytes-limit is overridden to storage.flow-control.soft-pending-compaction-bytes-limit only when it is larger than storage.flow-control.soft-pending-compaction-bytes-limit.

@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has signed the dco. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 15, 2025
@hhwyt hhwyt force-pushed the decouple-l0-file-threshold branch from 7de2145 to 3be7d14 Compare July 16, 2025 07:18
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 17, 2025
@hhwyt hhwyt changed the title flow-control: remove override for level0-slowdown-writes-trigger and … flow-control: fix flow-control override to enhance compaction stability Jul 17, 2025
@hhwyt hhwyt changed the title flow-control: fix flow-control override to enhance compaction stability flow-control: fix compaction slowdown caused by increased flow-control thresholds. Jul 18, 2025
@hhwyt hhwyt requested review from Connor1996 and glorv July 18, 2025 03:36
@hhwyt (Contributor, Author) commented Jul 18, 2025

/retest

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jul 18, 2025
@hhwyt (Contributor, Author) commented Jul 18, 2025

/hold

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 18, 2025
);
$cf_opts.level0_slowdown_writes_trigger = $cfg.l0_files_threshold as i32;
}
// Keep override for level0_stop_writes_trigger to ensure ingest_maybe_stall
Member:

nit: this may highlight their differences?

Suggested change:
- // Keep override for level0_stop_writes_trigger to ensure ingest_maybe_stall
+ // If unset, `level0_stop_writes_trigger` defaults to `l0_files_threshold` (unlike `level0_slowdown_writes_trigger`, which defaults to 20) to ensure ingest_maybe_stall

Contributor, Author:

done

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jul 18, 2025
@ti-chi-bot (Contributor) commented Jul 18, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-07-18 07:02:34.891451316 +0000 UTC m=+2847207.614630284: ☑️ agreed by v01dstar.
  • 2025-07-18 12:31:38.812580063 +0000 UTC m=+10969.814575899: ☑️ agreed by hbisheng.

// Use level0_stop_writes_trigger instead of level0_slowdown_writes_trigger
// to integrate with flow control while preserving RocksDB's compaction speed-up
// mechanism
let stop_trigger = options.get_level_zero_stop_writes_trigger();
Member:

this is confusing, why not use flow_control.l0-threshold directly?

Contributor:

+1. flow control parameters should be used here, rocksdb parameters should only be used to control the rocks internal compaction scheduling.

@hhwyt (Contributor, Author), Jul 21, 2025:

@Connor1996 @glorv
Absolutely, flow-control.l0-file-threshold is a perfect fit here. But the problem is: how can we access the flow-control configuration from the engine_rocks module? There isn't a proper way to do that, since flow-control lives in the upper storage module.

Ideally, flow-control should be an independent module that can be accessed by both the storage module and engine_rocks, but that's not an easy thing.

I guess that's why the original code uses level0-slowdown-writes-trigger instead of l0-file-threshold, right?
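
To illustrate the layering constraint discussed in this thread: the engine layer can only read RocksDB options, so the storage layer has to push the flow-control threshold down into `level0_stop_writes_trigger` ahead of time. The sketch below is a simplified assumption of how that could look. `CfSnapshot`, `stop_trigger_from_flow_control`, and `ingest_may_stall` are hypothetical names, and the plain `>=` comparison is not necessarily what `ingest_maybe_slowdown_writes` does.

```rust
/// Hypothetical view of what the engine layer can observe for one CF.
/// It contains RocksDB options only, never the flow-control config.
struct CfSnapshot {
    l0_file_count: u64,
    level0_stop_writes_trigger: u32,
}

/// Storage layer (knows the flow-control config): derive the value that is
/// written into the RocksDB option before the engine is opened.
fn stop_trigger_from_flow_control(l0_files_threshold: u64) -> u32 {
    // Saturate instead of wrapping when the threshold does not fit in u32.
    l0_files_threshold.min(u64::from(u32::MAX)) as u32
}

/// Engine layer: decide whether an ingest may hit a write stall, using only
/// the RocksDB option that the storage layer filled in.
fn ingest_may_stall(cf: &CfSnapshot) -> bool {
    cf.l0_file_count >= u64::from(cf.level0_stop_writes_trigger)
}
```

This keeps the engine crate free of any dependency on the storage module, at the cost of encoding a flow-control setting in a RocksDB option.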

  pub target_file_size_base: Option<ReadableSize>,
  pub level0_file_num_compaction_trigger: i32,
- pub level0_slowdown_writes_trigger: Option<i32>,
+ pub level0_slowdown_writes_trigger: i32,
Contributor:

Why only change slowdown but not stop_writes? Is stop_writes still useful when write_stall is disabled?

Contributor, Author:

@glorv

> Why only change slowdown but not stop_writes?

The default stop_writes (36) is larger than l0-file-threshold (20). If we use this default, we cannot identify and log a warning when the user manually sets stop_writes to a larger value, as we did in the previous logic.

> Is stop_writes still useful when write_stall is disabled?

No.
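
The point about keeping `level0_stop_writes_trigger` as an `Option` can be sketched like this: `None` means the user never set it, so it can silently default to `l0_files_threshold`, while an explicit larger value can be detected and reported. The function name, the warning text, and the handling of explicit values that are not larger than the threshold are assumptions for illustration only; the thread does not spell them out.

```rust
/// Resolve the effective stop-writes trigger and an optional warning.
/// `user_value` is `None` when the user did not configure the option.
fn resolve_stop_writes_trigger(
    user_value: Option<i32>,
    l0_files_threshold: u64,
) -> (i32, Option<String>) {
    // Saturate instead of wrapping when the threshold does not fit in i32.
    let threshold = l0_files_threshold.min(i32::MAX as u64) as i32;
    match user_value {
        // Unset: default to the flow-control threshold, nothing to report.
        None => (threshold, None),
        // Explicitly set to a larger value: override and warn.
        Some(v) if v > threshold => (
            threshold,
            Some(format!(
                "level0_stop_writes_trigger {} overridden by l0_files_threshold {}",
                v, threshold
            )),
        ),
        // Assumption: an explicit value that is not larger is kept as-is.
        Some(v) => (v, None),
    }
}
```

With a plain `i32` field defaulting to 36, the first two cases would be indistinguishable, which is the problem described in the reply above.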

Contributor:

Ok. Please add a comment on level0_stop_writes_trigger as it is useless when disable_write_stall is on.

Contributor, Author:

Sorry for the wrong reply.
When disable_write_stall is on, level0_slowdown_writes_trigger is used for the compaction speed-up mechanism, and level0_stop_writes_trigger is used to determine whether to trigger a write stall.

@hhwyt (Contributor, Author), Sep 21, 2025:

Comments are already in fill_cf_opt!

@hhwyt (Contributor, Author) commented Sep 20, 2025

/retest

Signed-off-by: hhwyt <hhwyt1@gmail.com>

address comments

Signed-off-by: hhwyt <hhwyt1@gmail.com>

set default values

Signed-off-by: hhwyt <hhwyt1@gmail.com>

flow-control: remove override for level0-slowdown-writes-trigger and soft-pending-compaction-bytes-limit

Signed-off-by: hhwyt <hhwyt1@gmail.com>
Signed-off-by: hhwyt <hhwyt1@gmail.com>
@hhwyt hhwyt force-pushed the decouple-l0-file-threshold branch from 77278e6 to af7a83a Compare September 21, 2025 05:19
Signed-off-by: hhwyt <hhwyt1@gmail.com>
Signed-off-by: hhwyt <hhwyt1@gmail.com>
Signed-off-by: hhwyt <hhwyt1@gmail.com>
@hhwyt (Contributor, Author) commented Sep 21, 2025

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 21, 2025
@hhwyt (Contributor, Author) commented Sep 21, 2025

/cc @hbisheng @glorv need a LGTM, PTAL again, thx~

@glorv (Contributor) commented Sep 22, 2025

/cc @zhangjinpeng87 @cfzjywxk PTAL for the config change

@ti-chi-bot (Contributor) commented Sep 22, 2025

@glorv: GitHub didn't allow me to request PR reviews from the following users: the, config, change, PTAL, for.

Note that only tikv members and repo collaborators can review this PR, and authors cannot review their own PRs.

@zhangjinpeng87 (Member) left a comment

LGTM

@ti-chi-bot (Contributor) commented Sep 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hbisheng, v01dstar, zhangjinpeng87

@ti-chi-bot ti-chi-bot bot added approved needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. labels Sep 22, 2025
@ti-chi-bot ti-chi-bot bot merged commit 9c443de into tikv:master Sep 22, 2025
9 checks passed
@ti-chi-bot ti-chi-bot bot added this to the Pool milestone Sep 22, 2025
@ti-chi-bot (Member) commented

In response to a cherrypick label: new pull request created to branch release-8.5: #18994.

3AceShowHand pushed a commit to 3AceShowHand/tikv that referenced this pull request Oct 13, 2025
…l thresholds. (tikv#18710)

close tikv#18708

Signed-off-by: hhwyt <hhwyt1@gmail.com>
Signed-off-by: 3AceShowHand <jinl1037@hotmail.com>
@ti-chi-bot ti-chi-bot bot removed the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Dec 3, 2025
ti-chi-bot bot pushed a commit that referenced this pull request Dec 4, 2025
…l thresholds. (#18710) (#18994)

close #18708

Signed-off-by: hhwyt <hhwyt1@gmail.com>

Co-authored-by: hhwyt <hhwyt1@gmail.com>
Labels

approved dco-signoff: yes Indicates the PR's author has signed the dco. lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Decouple l0-file-threshold and level0-slowdown-writes-trigger

7 participants