store,kv: snapshot doesn't cache the non-exists kv entries lead to poor 'insert ignore' performance#12872
Conversation
…or 'insert ignore' performance Snapshot should cache all the BatchGet() kv entries, including the non-exist ones. This commit fix a bug that non-exist kv entries are not cached, causing `insert ignore` to access TiKV everytime and the performance is poor.
|
PTAL @jackysp @crazycs520 |
|
Could you add benchamrk result here? |
I'm making a comparison... if those I create a table like this: and fill the table with 100w rows: and create another table, set This result is under the TiDB binary of this commit. |
|
By the way, I mock a 1ms network latency (for both) because I'm testing on my laptop (no so obvious penalty for networking). I guess at least 20mins have passed away. |
|
The result is here, use master branch @shenli |
Codecov Report
@@ Coverage Diff @@
## master #12872 +/- ##
===============================================
+ Coverage 80.0693% 80.456% +0.3866%
===============================================
Files 465 465
Lines 107262 109333 +2071
===============================================
+ Hits 85884 87965 +2081
+ Misses 14934 14920 -14
- Partials 6444 6448 +4 |
|
/run-all-tests |
|
PTAL @crazycs520 @coocood |
|
@tiancaiamao |
jackysp
left a comment
There was a problem hiding this comment.
could we add a unit test case, e.g.trace fomat='row' insert ignore?
That's exactly the bug! |
The result is difficult to check. @jackysp |
It's cached, after this operation, |
3fcb401 to
e5c36eb
Compare
|
LGTM |
We could check if it send the kv request in the result set. |
|
/run-all-tests
There are always SendRequest in the result set, before or after this PR! |
Any idea about how to write that test? |
64ea7eb to
e85de23
Compare
|
PTAL @jackysp |
|
LGTM |
|
/run-all-tests |
|
cherry pick to release-3.0 failed |
|
cherry pick to release-3.1 failed |
…or 'insert ignore' performance (pingcap#12872)
…ect/tidb into feature-add-udf-support * 'feature-add-udf-support' of https://github.com/JustProject/tidb: (26 commits) *: fix bug that the kill command doesn't work when the killed session is waiting for the pessimistic lock (pingcap#12852) executor: fix the projection upon the indexLookUp in indexLookUpJoin can't get result. (pingcap#12889) planner, executor: support create view on union (pingcap#12595) planner/cascades: introduce TransformationID in cascades planner (pingcap#12879) executor: fix data race in test (pingcap#12910) executor: reuse chunk row for insert on duplicate update (pingcap#12847) ddl: speed up tests (pingcap#12888) executor: speed up test (pingcap#12896) expression: implement vectorized evaluation for `builtinSecondSig` (pingcap#12886) expression: implement vectorized evaluation for `builtinJSONObjectSig` (pingcap#12663) expression: speed up builtinRepeatSig by using MergeNulls (pingcap#12674) expression: speed up unit tests under the expression package (pingcap#12887) store,kv: snapshot doesn't cache the non-exists kv entries lead to poor 'insert ignore' performance (pingcap#12872) executor: fix data race in `GetDirtyTable()` (pingcap#12767) domain: increase TTL to reduce the occurrence of reporting min startTS errors (pingcap#12578) executor: split test for speed up (pingcap#12881) executor: fix inconsistent of grants privileges with MySQL when executing `grant all on ...` (pingcap#12330) expression: implement vectorized evaluation for `builtinJSONUnquoteSig` (pingcap#12841) tune grpc connection count between tidb and tikv (pingcap#12884) Makefile: change test parallel to 8 (pingcap#12885) ...
…or 'insert ignore' performance pingcap#12872
…or 'insert ignore' performance (pingcap#12872)
What problem does this PR solve?
The Snapshot should cache all the BatchGet() KV entries, including the non-exist ones.
This commit fixes a bug (start from #12108) that non-exist KV entries are not cached, causing
insert ignoreto access TiKV every time and the performance is poor.What is changed and how it works?
The confusing API convention leads to this bug. Some places use len(value) = 0 in map[key]value to represent non-exist, some places use
_, ok := map[key]value; !okto represent non-exist ... and in the end, we forget to cache the non-exist information.After this change, we can see that the
tikvSnapshot.getdoesn't call rpcSendRequestanymore.See the tracing result to verify that:
Check List
Tests
This is not a bug related to correctness.
Side effects
Related changes
Release note