executor: fix load data losing connection when batch_dml_size is set #22724

Merged
ti-srebot merged 7 commits into pingcap:master from guo-shaoge:master
Feb 5, 2021

Conversation

@guo-shaoge (Collaborator) commented Feb 4, 2021

What problem does this PR solve?

Issue Number: Fix #22540

Problem Summary:
When tidb_dml_batch_size is set to a relatively small value (e.g. 64 or 128), running LOAD DATA into a table that has an auto_random column may fail with a lost connection error.

What is changed and how it works?

What's Changed:

  1. add InsertValues::isLoadData member
  2. add InsertValues::txnInUse (a mutex) member
  3. lock txnInUse when commit routine needs to refresh txn ctx
  4. lock txnInUse when process stream routine needs to generate auto random value

How it Works:

LOAD DATA uses two routines:

  1. process stream routine: parses the file and generates batch insert tasks
  2. commit routine: reads tasks from a channel and commits them

The commit routine invalidates the txn after a task is committed. Before a new txn is generated, the process stream routine may use that invalid txn to generate an auto_random value. So we add a lock to protect the txn, making sure the process stream routine always uses a valid txn.

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
    • TestLoadDataAutoRandom
  • Manual test (add detailed scripts or steps below)
    1. generate a CSV file (50,000 rows)
    2. set @@session.tidb_dml_batch_size = 128;
    3. drop table if exists t;
    4. create table t(c1 bigint auto_random primary key, c2 bigint, c3 bigint);
    5. load data local infile %q into table t (c2, c3);
  • No code

Side effects

  • Performance regression
    • A lock is now taken when committing a task and when generating auto_random values, so lock contention may slow down LOAD DATA.

Release note

  • Fix the lost connection error when running LOAD DATA on tables with an auto_random column

@guo-shaoge guo-shaoge requested a review from a team as a code owner February 4, 2021 06:04
@guo-shaoge guo-shaoge requested review from XuHuaiyu and removed request for a team February 4, 2021 06:04
@ti-srebot ti-srebot added the first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. label Feb 4, 2021
@CLAassistant commented Feb 4, 2021

CLA assistant check
All committers have signed the CLA.

@guo-shaoge (Collaborator, Author)

/rebuild

@github-actions github-actions bot added sig/execution SIG execution sig/sql-infra SIG: SQL Infra labels Feb 4, 2021
@guo-shaoge (Collaborator, Author)

/run-all-tests

@guo-shaoge (Collaborator, Author)

@wshwsh12

@guo-shaoge (Collaborator, Author)

@AilinKid

@XuHuaiyu XuHuaiyu added the type/bugfix This PR fixes a bug. label Feb 4, 2021
@XuHuaiyu XuHuaiyu requested review from AilinKid and wshwsh12 February 4, 2021 08:00
dbt.mustExec("create table t(c1 bigint auto_random primary key, c2 bigint, c3 bigint)")
dbt.mustExec(fmt.Sprintf("load data local infile %q into table t (c2, c3)", path))
rows := dbt.mustQuery("select count(*) from t")
cli.checkRows(c, rows, "50000")
Contributor:

We also need to check the correctness of the data.

Maybe we can use select bit_xor(c1), bit_xor(c2), bit_xor(c3) from t

Collaborator, Author:

Done. Only c2 and c3 are verified; c1 is an auto_random column, so its values cannot be known in advance.

ts.runTestLoadDataForListColumnPartition2(c)
}

func (ts *tidbTestSerialSuite) TestLoadDataBatchDML(c *C) {
Contributor:

Add a comment for this test, e.g.

// fix issue22540

Collaborator, Author:

done

@XuHuaiyu XuHuaiyu changed the title from "executor: load data lost connection when batch_dml_size is set (#22540)" to "executor: fix load data losing connection when batch_dml_size is set" Feb 4, 2021
@ichn-hu ichn-hu mentioned this pull request Feb 4, 2021
@guo-shaoge (Collaborator, Author)

/run-all-tests

@AilinKid (Contributor) left a comment

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 4, 2021
@XuHuaiyu (Contributor) left a comment

LGTM

@ti-srebot ti-srebot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Feb 5, 2021
@XuHuaiyu (Contributor) commented Feb 5, 2021

Have we tested the performance regression?

@XuHuaiyu (Contributor) commented Feb 5, 2021

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Feb 5, 2021
@ti-srebot (Contributor)

/run-all-tests

@ti-srebot (Contributor)

cherry pick to release-4.0 in PR #22736

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Feb 5, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot (Contributor)

cherry pick to release-5.0-rc in PR #22737

guo-shaoge added a commit to ti-srebot/tidb that referenced this pull request Feb 5, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

Labels

  • first-time-contributor: Indicates that the PR was contributed by an external member and is a first-time contributor.
  • sig/execution: SIG execution
  • sig/sql-infra: SIG: SQL Infra
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • type/bugfix: This PR fixes a bug.

Development

Successfully merging this pull request may close these issues.

load data with auto-random may panic if too many rows

6 participants