import: relax memory check constraints by 3pointer · Pull Request #18248 · tikv/tikv

3pointer · 2025-02-24T10:16:40Z

What is changed and how it works?

Issue Number: ref #18124

What's Changed:
The memory check implemented previously (#18192) was too conservative, causing unnecessary rejections even when memory was still available.

For example, on an 8GB node, the memory_usage_limit was calculated as 6GB, and applying the 0.9 threshold effectively resulted in a 5.4GB limit, which is only 67.5% of the total memory. This led to premature rejection of import requests, even when the system still had usable memory.
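The arithmetic in the example above can be reproduced with a small sketch. The function names and the `GIB` constant are hypothetical and exist only for this illustration; they are not taken from the TiKV code base.

```rust
/// One gibibyte in bytes (hypothetical helper constant for this sketch).
const GIB: u64 = 1024 * 1024 * 1024;

/// Limit that remains after applying the threshold on top of the
/// configured memory usage limit (for example 6 GiB * 0.9).
fn effective_limit_bytes(memory_usage_limit: u64, threshold: f64) -> u64 {
    (memory_usage_limit as f64 * threshold) as u64
}

/// Share of the total memory that the effective limit represents.
fn effective_share_of_total(total: u64, memory_usage_limit: u64, threshold: f64) -> f64 {
    effective_limit_bytes(memory_usage_limit, threshold) as f64 / total as f64
}
```

With `total = 8 * GIB`, `memory_usage_limit = 6 * GIB` and `threshold = 0.9`, the effective limit comes out at roughly 5.4 GiB, or about 67.5% of the node's memory, matching the numbers quoted above.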

import: make memory check more aggressive.

Related changes

PR to update pingcap/docs/pingcap/docs-cn:
Need to cherry-pick to the release branch

Check List

Tests

[] Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Release note

None

Signed-off-by: 3pointer <luancheng@pingcap.com>

ti-chi-bot · 2025-02-24T10:16:45Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Signed-off-by: 3pointer <luancheng@pingcap.com>

Tristan1900

Thanks for the change! Just a couple of comments from my side.

Tristan1900 · 2025-02-24T16:23:53Z

src/import/sst_service.rs

+        SysQuota::memory_limit_in_bytes()
+    })();
+
+    if mem_limit == 0 || mem_limit < usage {


It seems mem_limit could be -1 when the cgroup memory limit is unlimited. Maybe we could add better logging for that case?

3pointer (Feb 26, 2025): mem_limit is a u64, so I think it cannot be -1.
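To illustrate the point about unsigned arithmetic, here is a minimal sketch of why the `mem_limit == 0 || mem_limit < usage` guard matters before computing `mem_limit - usage`. Both function names are hypothetical. Whether `SysQuota::memory_limit_in_bytes()` can ever report an "unlimited" sentinel is not stated in this thread, so the second helper only shows what a signed -1 would become if it were cast to `u64`.

```rust
/// Returns the free bytes, or `None` when the limit is unknown (0) or
/// already exceeded. In both cases `mem_limit - usage` would be
/// meaningless, and for `usage > mem_limit` it would underflow.
fn free_memory(mem_limit: u64, usage: u64) -> Option<u64> {
    if mem_limit == 0 {
        return None;
    }
    // `checked_sub` yields `None` instead of underflowing.
    mem_limit.checked_sub(usage)
}

/// Hypothetical illustration: a signed "-1 means unlimited" value cast
/// with `as` turns into the largest `u64`, never a negative number.
fn sentinel_as_u64(raw: i64) -> u64 {
    raw as u64
}
```

Under this sketch an "unlimited" sentinel would appear as a very large limit rather than as -1, so the check would simply see plenty of free memory.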

Tristan1900 · 2025-02-24T16:29:56Z

src/import/sst_service.rs

                Ok(()) => (),
                Err(e) => {
+                    // in case of immediate retry from client side
+                    tokio::time::sleep(Duration::from_secs(1)).await;


Should we adjust the backoff strategy on the client side instead of injecting a delay on the server side?
Holding a bunch of calls in memory for 1 second could increase memory pressure.
Also, sleeping for 1s may not solve the burst issue; it only delays the burst by 1s. To smooth it out we would need to add jitter to the delay, and I think jitter can already be configured in the client-side backoff strategy.

3pointer (Feb 25, 2025): Actually, the client side already has a backoff strategy, so I'm not sure that is the root cause. Anyway, I'll test it.

3pointer (Feb 26, 2025): Anyway, adding jitter is more robust. I'll add it and test it.
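A jittered delay along the lines discussed above could look like the following sketch. It is not the code that landed in the PR. `jittered_delay` is a hypothetical helper, and using `RandomState` as a cheap randomness source is an assumption made here to keep the example free of external crates.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::time::Duration;

/// Returns `base` plus a pseudo-random extra delay in `[0, max_jitter]`
/// (millisecond granularity).
fn jittered_delay(base: Duration, max_jitter: Duration) -> Duration {
    // Each `RandomState` carries randomly seeded keys, so finishing an
    // empty hasher gives a value that varies per call. This is adequate
    // for spreading out retries, not for anything security related.
    let random = RandomState::new().build_hasher().finish();
    let max_ms = max_jitter.as_millis() as u64;
    // `saturating_add` keeps the modulus non-zero and avoids overflow.
    let jitter_ms = random % max_ms.saturating_add(1);
    base + Duration::from_millis(jitter_ms)
}
```

In the handler, the fixed `Duration::from_secs(1)` passed to `tokio::time::sleep` could then be replaced with something like `jittered_delay(Duration::from_secs(1), Duration::from_millis(500))`, so that rejected clients do not all retry at the same instant. The 500 ms bound is an arbitrary value chosen for illustration.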

v01dstar · 2025-02-25T06:26:23Z

I think the PR's title contradicts what this PR is actually doing. "Relax memory check constraints" may be a better name.

v01dstar · 2025-02-25T06:52:40Z

src/import/sst_service.rs

+    // Reject ONLY if BOTH:
+    // - Available memory is below REJECT_SERVE_MEMORY_USAGE
+    // - Memory usage ratio is 90%+
+    if mem_limit - usage < REJECT_SERVE_MEMORY_USAGE


nit: I feel the following version is more readable.

let free_memory = mem_limit - usage;
let min_required_memory = std::cmp::min(
    REJECT_SERVE_MEMORY_USAGE,
    ((1.0 - HIGH_IMPORT_MEMORY_WATER_RATIO) * mem_limit as f64) as u64
);
if free_memory < min_required_memory {
    let usage_ratio = usage as f64 / mem_limit as f64;
    return Err(Error::ResourceNotEnough(format!(
        "Memory usage too high, usage: {} bytes, mem limit {} bytes",
        usage, mem_limit
    )));
}
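For readers who want to try the relaxed check in isolation, here is a self-contained sketch of the same logic. The constant values are assumptions: `HIGH_IMPORT_MEMORY_WATER_RATIO = 0.9` is inferred from the "90%+" comment in the diff, and the 1 GiB value for `REJECT_SERVE_MEMORY_USAGE` is purely hypothetical since the thread does not state it. The function name `check_import_memory` and the `String` error are also stand-ins for the real function and `Error::ResourceNotEnough`.

```rust
const GIB: u64 = 1024 * 1024 * 1024;
// Hypothetical value; the actual constant is not shown in this thread.
const REJECT_SERVE_MEMORY_USAGE: u64 = GIB;
// Assumed from the "Memory usage ratio is 90%+" comment in the diff.
const HIGH_IMPORT_MEMORY_WATER_RATIO: f64 = 0.9;

/// Rejects only when free memory is below BOTH the absolute reserve and
/// the ratio-based reserve, which is what taking the `min` expresses.
fn check_import_memory(mem_limit: u64, usage: u64) -> Result<(), String> {
    // `saturating_sub` avoids underflow here; the PR handles
    // `mem_limit == 0 || mem_limit < usage` with a separate guard whose
    // body is not visible in the excerpt.
    let free_memory = mem_limit.saturating_sub(usage);
    let min_required_memory = std::cmp::min(
        REJECT_SERVE_MEMORY_USAGE,
        ((1.0 - HIGH_IMPORT_MEMORY_WATER_RATIO) * mem_limit as f64) as u64,
    );
    if free_memory < min_required_memory {
        return Err(format!(
            "Memory usage too high, usage: {} bytes, mem limit {} bytes",
            usage, mem_limit
        ));
    }
    Ok(())
}
```

Under these assumed constants, a node with an 8 GiB limit keeps accepting requests until less than about 0.8 GiB is free, while a node with a 64 GiB limit keeps accepting until less than 1 GiB is free, even though its usage ratio passed 90% much earlier.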

Signed-off-by: 3pointer <luancheng@pingcap.com>

YuJuncen

Not quite sure whether a delay on the server side is a good idea... Otherwise LGTM.

ti-chi-bot · 2025-02-26T09:04:44Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Leavrth, YuJuncen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [YuJuncen]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2025-02-26T09:04:47Z

[LGTM Timeline notifier]

Timeline:

2025-02-26 07:05:37.412773494 +0000 UTC m=+425885.365931745: ☑️ agreed by YuJuncen.
2025-02-26 09:04:46.01383 +0000 UTC m=+433033.966988264: ☑️ agreed by Leavrth.

This reverts commit 71aecc2.

ref tikv#18124 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot · 2025-12-09T07:00:18Z

In response to a cherrypick label: new pull request created to branch release-8.5: #19189.
But this PR has conflicts, please resolve them!

3pointer added 3 commits February 24, 2025 17:39

make sst_service memory check more aggressive

39d09cd

Signed-off-by: 3pointer <luancheng@pingcap.com>

update test cases

a9b21c9

Signed-off-by: 3pointer <luancheng@pingcap.com>

fix useless code

f65ec92

Signed-off-by: 3pointer <luancheng@pingcap.com>

ti-chi-bot bot added the do-not-merge/needs-linked-issue label Feb 24, 2025

3pointer force-pushed the aggressive_memory_check branch from 8884a89 to a9b21c9 Compare February 24, 2025 10:17

ti-chi-bot bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) and removed the size/M (denotes a PR that changes 30-99 lines, ignoring generated files) and do-not-merge/needs-linked-issue labels Feb 24, 2025

3pointer marked this pull request as ready for review February 24, 2025 10:19

ti-chi-bot bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Feb 24, 2025

add 1s sleep before return error

92cd421

Signed-off-by: 3pointer <luancheng@pingcap.com>

Tristan1900 reviewed Feb 24, 2025

v01dstar reviewed Feb 25, 2025

address comments

00554ea

Signed-off-by: 3pointer <luancheng@pingcap.com>

YuJuncen approved these changes Feb 26, 2025

ti-chi-bot bot added the needs-1-more-lgtm (indicates a PR needs 1 more LGTM) and approved labels Feb 26, 2025

3pointer changed the title ~~import: make memory check more aggressive.~~ import: relax memory check constraints Feb 26, 2025

Leavrth approved these changes Feb 26, 2025

ti-chi-bot bot added the lgtm label and removed the needs-1-more-lgtm label (indicates a PR needs 1 more LGTM) Feb 26, 2025

ti-chi-bot bot merged commit 71aecc2 into tikv:master Feb 26, 2025
8 checks passed

ti-chi-bot bot added this to the Pool milestone Feb 26, 2025

owlsome2501 mentioned this pull request Mar 18, 2025

There is a 15.4% performance regression in ycsb after PR#18248 #18317

Closed

3pointer added a commit to 3pointer/tikv that referenced this pull request Mar 20, 2025

Revert "import: relax memory check constraints (tikv#18248)"

f9a118a

This reverts commit 71aecc2.

YuJuncen mentioned this pull request Dec 9, 2025

backup/restore: pick necessary fixes targetting master only #19187

Closed

YuJuncen added the needs-cherry-pick-release-8.5 label (should cherry pick this PR to release-8.5 branch) Dec 9, 2025

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Dec 9, 2025

This is an automated cherry-pick of tikv#18248

904eb7e

ref tikv#18124 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot mentioned this pull request Dec 9, 2025

import: relax memory check constraints (#18248) #19189

Closed

8 tasks

6 participants
