
Batch (raw) get requests to same region across batch commands#5598

Merged
Little-Wallace merged 45 commits into tikv:master from tabokie:batch-get
Oct 29, 2019

Conversation

@tabokie
Member

@tabokie tabokie commented Oct 9, 2019

What have you changed?

  • Batch get / raw-get requests with the same priority to the same region.

To avoid request timeouts caused by overly large batches or timeslice starvation, add a dedicated interval-timer thread that periodically issues a forced submit.

This change reduces readpool tasking pressure and, in turn, CPU usage. The effect depends on the gRPC connection count (which determines how many requests contribute to the QPS of one stream) and the batch wait duration.
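As a rough sketch of the batching idea described above (simplified, assumed types — `GetRequest` and `MiniBatcher` here are illustrative stand-ins, not TiKV's real kvproto types or API):

```rust
use std::collections::HashMap;

// Hypothetical simplified request type for illustration; TiKV's real
// request types come from kvproto and carry much more state.
#[derive(Debug, Clone)]
struct GetRequest {
    region_id: u64,
    priority: u8,
    key: Vec<u8>,
}

/// Groups pending get requests by (region_id, priority) so that each
/// group can be submitted to the readpool as a single task.
#[derive(Default)]
struct MiniBatcher {
    pending: HashMap<(u64, u8), Vec<GetRequest>>,
}

impl MiniBatcher {
    fn push(&mut self, req: GetRequest) {
        self.pending
            .entry((req.region_id, req.priority))
            .or_default()
            .push(req);
    }

    /// Forced submit: drain every group regardless of size. In the PR,
    /// a dedicated timer thread triggers this periodically so no batch
    /// is held longer than the configured wait duration.
    fn submit_all(&mut self) -> Vec<Vec<GetRequest>> {
        self.pending.drain().map(|(_, reqs)| reqs).collect()
    }
}
```

Each drained group corresponds to one readpool task serving several gets, which is where the reduced tasking pressure comes from.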

What is the type of the changes?

  • Improvement (a change which is an improvement to an existing feature)

How is the PR tested?

  • Unit test

Does this PR affect documentation (docs) or should it be mentioned in the release notes?

No

Does this PR affect tidb-ansible?

Will do

Refer to a related PR or issue link (optional)

This PR is split from #5382

Benchmark result if necessary (optional)

Improves QPS by 17% for the oltp_point_select benchmark (grpc-connection-count=3, request-batch-wait-duration="1ms") without compromising other workloads.

Improves QPS by 20% for oltp_point_select (grpc-connection-count=1 or 2, request-batch-wait-duration="1ms"), with a notable side effect on other workloads.

tikv master has seen some major performance speed-ups (#5363, #5654) since this PR was first opened. Request batching now pays off only under higher pressure.

Here is a result from sysbench oltp_point_select with 1024 threads, with 1 TiDB and 1 TiKV, each on a separate 40-core server (TiDB point-get gRPC connection count of 4).

before batch, tikv cpu ~ 900%

SQL statistics:
queries performed:
read: 82163942
write: 0
other: 0
total: 82163942
transactions: 82163942 (228121.25 per sec.)
queries: 82163942 (228121.25 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)

General statistics:
total time: 360.1744s
total number of events: 82163942

Latency (ms):
min: 0.23
avg: 4.49
max: 49.41
95th percentile: 7.17
sum: 368591486.94

Threads fairness:
events (avg/stddev): 80238.2246/61.32
execution time (avg/stddev): 359.9526/0.00

after batch, tikv cpu ~ 550%

SQL statistics:
queries performed:
read: 91619058
write: 0
other: 0
total: 91619058
transactions: 91619058 (254370.46 per sec.)
queries: 91619058 (254370.46 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)

General statistics:
total time: 360.1777s
total number of events: 91619058

Latency (ms):
min: 0.33
avg: 4.02
max: 45.34
95th percentile: 7.30
sum: 368587682.66

Threads fairness:
events (avg/stddev): 89471.7363/113.70
execution time (avg/stddev): 359.9489/0.00

Any examples? (optional)

Signed-off-by: tabokie <xy.tao@outlook.com>
@tabokie tabokie requested a review from zhangjinpeng87 October 9, 2019 06:09
@tabokie tabokie added the component/performance Component: Performance label Oct 9, 2019
Signed-off-by: tabokie <xy.tao@outlook.com>
Signed-off-by: tabokie <xy.tao@outlook.com>
@tabokie
Member Author

tabokie commented Oct 9, 2019

/run-all-tests/bench

@mahjonp
Contributor

mahjonp commented Oct 9, 2019

/bench tidb=pr/12569

    Err(e) => return future::err(e),
};
for get in gets {
    callback(
Member


Can we use just one callback for these get requests to reduce the cost?

Member Author

@tabokie tabokie Oct 9, 2019


The update roughly achieves -50% gRPC poll CPU and +9% qps in oltp_point_select.
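The single-callback idea discussed in this thread could look roughly like the sketch below (illustrative names and signatures, not TiKV's actual API): one closure handles the whole batched response and fans results back out by request id, instead of allocating one boxed callback per get.

```rust
use std::sync::mpsc;

/// Builds one callback for a whole group of batched gets. Results are
/// assumed to come back in submission order, so each value is paired
/// with the request id captured at submit time.
fn make_batch_callback(
    ids: Vec<u64>,
    sink: mpsc::Sender<(u64, Option<Vec<u8>>)>,
) -> impl FnOnce(Vec<Option<Vec<u8>>>) {
    move |results| {
        // Fan each result back out to its requester by position.
        for (id, value) in ids.into_iter().zip(results) {
            let _ = sink.send((id, value));
        }
    }
}
```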

let minibatcher = MiniBatcher::new(tx.clone(), self.minibatch_wait_millis);
let stopped = Arc::new(AtomicBool::new(false));
let minibatcher = Arc::new(Mutex::new(minibatcher));
if self.minibatch_wait_millis > 0 {
Member


@hicqu @BusyJay PTAL here

@BusyJay
Member

BusyJay commented Oct 9, 2019

Can you add some comments to help reviewers better understand the code?

Signed-off-by: tabokie <xy.tao@outlook.com>
Member

@BusyJay BusyJay left a comment


Why do the batching inside the kv service instead of in storage?

@zhangjinpeng87
Member

/bench tidb=pr/12569

@tabokie
Member Author

tabokie commented Oct 9, 2019

Why do the batching inside the kv service instead of in storage?

@BusyJay Currently the batch is collected within the lifetime of one batch_commands stream, and the batched response depends on the liveness of that stream. To batch at a lower level (and across threads), each command would have to hold its own reference to the message channel, and we would need a thread-safe batcher, which I think is a bit costly. The downside of the current approach is that the batching effect depends heavily on the gRPC connection count (a high connection count leads to scattered batches).
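The trade-off described above can be sketched as follows: a batcher owned by a single batch_commands stream is plain mutable state touched only by that stream's future, so it needs no Arc<Mutex<_>>; it can also flush eagerly once a size threshold is hit, leaving the rest to the timer's forced submit. (The names, the u64 stand-in for commands, and the threshold logic below are assumptions for illustration, not TiKV's actual implementation.)

```rust
/// A per-stream batcher: owned by one batch_commands stream, so no
/// cross-thread synchronization is needed.
struct StreamBatcher {
    buf: Vec<u64>, // request ids, standing in for real commands
    max_batch: usize,
}

impl StreamBatcher {
    fn new(max_batch: usize) -> Self {
        StreamBatcher { buf: Vec::new(), max_batch }
    }

    /// Returns a full batch to submit eagerly once the size threshold
    /// is reached; otherwise the periodic forced submit picks up
    /// whatever remains, bounding the wait time.
    fn push(&mut self, id: u64) -> Option<Vec<u64>> {
        self.buf.push(id);
        if self.buf.len() >= self.max_batch {
            Some(std::mem::take(&mut self.buf))
        } else {
            None
        }
    }
}
```

Batching in storage instead would require every command to clone a channel handle and the batcher to live behind a lock, which is the cost the comment above refers to.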

Signed-off-by: tabokie <xy.tao@outlook.com>
@sre-bot
Contributor

sre-bot commented Oct 9, 2019

tidb.grpc-connection-count=1, batch.timeout=0ms

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: be6163c823285eb7bd048c8d0377c7b05dfd464e
+++ tidb: a387fbf23e26f8f0dc6b140489d04860bc7ba942
--- tikv: 00b0234fecbe97f53700b125ce1f5793ba88531b
+++ tikv: 4eb0c21f0c91f1eefad69bbcc6fc743e3c3bf7a3
pd: 4acaa8c715d40a07f521dd85ef7dcd4118064289
================================================================================
test-1: < oltp_point_select >
    * QPS : 64453.41 ± 2.2211% (std=921.42) delta: -18.37% (p=0.000)
    * AvgMs : 3.97 ± 2.2670% (std=0.06) delta: 22.46%
    * PercentileMs99 : 8.28 ± 0.0000% (std=0.00) delta: 24.14%
            
test-2: < oltp_read_write >
    * QPS : 25687.45 ± 0.8507% (std=150.98) delta: -32.83% (p=0.000)
    * AvgMs : 200.13 ± 0.8644% (std=1.19) delta: 48.86%
    * PercentileMs99 : 312.21 ± 1.0781% (std=2.75) delta: 21.03%
            
test-3: < oltp_insert >
    * QPS : 9036.30 ± 1.8150% (std=119.24) delta: -59.13% (p=0.000)
    * AvgMs : 28.33 ± 1.8002% (std=0.37) delta: 144.70%
    * PercentileMs99 : 46.91 ± 1.1938% (std=0.40) delta: 97.28%
            
test-4: < oltp_update_index >
    * QPS : 7239.76 ± 2.6838% (std=156.80) delta: -58.45% (p=0.000)
    * AvgMs : 35.32 ± 2.8537% (std=0.72) delta: 143.50%
    * PercentileMs99 : 48.78 ± 0.8919% (std=0.43) delta: 53.81%
            
test-5: < oltp_update_non_index >
    * QPS : 12779.03 ± 1.0768% (std=100.24) delta: -57.16% (p=0.000)
    * AvgMs : 20.03 ± 1.0815% (std=0.16) delta: 133.54%
    * PercentileMs99 : 29.19 ± 0.0000% (std=0.00) delta: 38.28%
            

https://perf.pingcap.com

@tabokie
Member Author

tabokie commented Oct 9, 2019

/run-all-tests

@siddontang
Contributor

Oh, it seems the perf dropped. @tabokie

Signed-off-by: tabokie <xy.tao@outlook.com>
@tabokie
Member Author

tabokie commented Oct 9, 2019

Oh, it seems the perf dropped. @tabokie

@siddontang Don't sweat it, this PR only batches get requests, so we should focus on oltp_point_select. For that run the batch timeout was left at the default 0ms, and with the gRPC connection count set to 1, I don't expect a performance gain.

@tabokie
Member Author

tabokie commented Oct 9, 2019

/release make dist_release/bench tidb=pr/12569

@sre-bot
Contributor

sre-bot commented Oct 9, 2019

Member

@breezewish breezewish left a comment


The rest looks fine to me

Signed-off-by: tabokie <xy.tao@outlook.com>
Little-Wallace
Little-Wallace previously approved these changes Oct 25, 2019
Contributor

@Little-Wallace Little-Wallace left a comment


LGTM

zhangjinpeng87 and others added 2 commits October 25, 2019 20:38
Signed-off-by: tabokie <xy.tao@outlook.com>
Signed-off-by: tabokie <xy.tao@outlook.com>
Signed-off-by: tabokie <xy.tao@outlook.com>
Signed-off-by: tabokie <xy.tao@outlook.com>
Signed-off-by: tabokie <xy.tao@outlook.com>
zhangjinpeng87 and others added 2 commits October 29, 2019 14:06
Signed-off-by: tabokie <xy.tao@outlook.com>
Member

@zhangjinpeng87 zhangjinpeng87 left a comment


LGTM

Contributor

@Little-Wallace Little-Wallace left a comment


LGTM


Labels

component/performance Component: Performance component/server Component: Server component/storage Component: Storage, Scheduler, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants