Skip to content

ttl: implement ttl job schedule framework#39472

Merged
ti-chi-bot merged 2 commits intopingcap:masterfrom
YangKeao:add-ttl-job-schedule
Dec 12, 2022
Merged

ttl: implement ttl job schedule framework#39472
ti-chi-bot merged 2 commits intopingcap:masterfrom
YangKeao:add-ttl-job-schedule

Conversation

@YangKeao
Copy link
Member

@YangKeao YangKeao commented Nov 29, 2022

What problem does this PR solve?

Issue Number: close #39471

Problem Summary:

Implement a ttl job schedule framework. It reads information from the info schema and the mysql.tidb_ttl_table_status, gets the job to run and schedule them to a scan worker.

The implementation of variable settings, scanWorker and delWorker are left for future development.

What is changed and how it works?

For most parts, the job manager is split from the original demo: https://github.com/pingcap/tidb/pull/39320/files. It implements the following feature:

  1. The most fundamental one. Let the job run on the worker.
  2. Support graceful shutdown with the context from domain.
  3. Allow collecting the errors from the tasks, and storing them into the table when the job is finished.
  4. Allow cancelling the running jobs.

This pr left a lot of TODO for system variables. They are left in #39727.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

@YangKeao YangKeao added the release-note-none Denotes a PR that doesn't merit a release note. label Nov 29, 2022
@YangKeao YangKeao requested a review from a team as a code owner November 29, 2022 14:19
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Nov 29, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • hawkingrei
  • lcwangchao

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 29, 2022
@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch 2 times, most recently from 15885a8 to 8b275b1 Compare November 30, 2022 04:57
@YangKeao YangKeao requested a review from lcwangchao November 30, 2022 05:06
@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch 4 times, most recently from a359fc7 to 1d0d7c2 Compare November 30, 2022 05:31
@YangKeao YangKeao requested a review from hawkingrei November 30, 2022 05:35
@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch from 1d0d7c2 to 8eb9302 Compare November 30, 2022 05:38
@hawkingrei
Copy link
Member

@YangKeao

ttl/ttlworker/del.go:35:2: should use for range instead of for { select {} } (S1000)

ttl/ttlworker/job_manager.go:423:9: ineffectual assignment to err (ineffassign)

@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch from 8eb9302 to 1d5d30d Compare December 1, 2022 03:50
@YangKeao YangKeao marked this pull request as draft December 1, 2022 08:39
@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 1, 2022
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 1, 2022
@YangKeao YangKeao added the affects-6.5 This bug affects the 6.5.x(LTS) versions. label Dec 2, 2022
@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch from 1d5d30d to c9fc7f9 Compare December 5, 2022 13:22
@ti-chi-bot ti-chi-bot added the needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. label Dec 5, 2022
@YangKeao YangKeao removed the request for review from a team December 6, 2022 04:30
@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch 4 times, most recently from 4b46c4a to c1effd3 Compare December 6, 2022 10:46
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 6, 2022
@YangKeao YangKeao marked this pull request as ready for review December 6, 2022 10:49
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 6, 2022
@YangKeao YangKeao requested a review from lcwangchao December 9, 2022 06:29
@hawkingrei
Copy link
Member

/run-mysql-test

@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch 3 times, most recently from e0be974 to c091e49 Compare December 9, 2022 07:31
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems when a job's hbtime stops to update, a new job with a new ttl expire will be created. I'm not sure whether we should make a little more complex strategy to make the job can be "resume" instead a new job. This maybe useful in some cases, for example, if we have a bug that a job will crash after running for 1hour, it still can guarantee a job will be finished finally and make sure the "cleared watermark" can be pushed forward.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to implement the "resume", another thing we should concern is that if a very old job failed, we should not resume it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tableStatus.CurrentJobOwnerID != m.id means this job is take over by other worker, so is it necessary to reset the job's status in this node?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reset the job's status in this node

Remove it from the running job and cancelling all running tasks are the only "status" local . The manager cannot get the job anymore 🤔, so I don't think other states matter.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 9, 2022
@lcwangchao
Copy link
Collaborator

Tough we still have some details to discuss, but I think we can merge it first to do some tests.

@YangKeao YangKeao force-pushed the add-ttl-job-schedule branch 3 times, most recently from 6c68a80 to ab70f7f Compare December 9, 2022 07:56
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Dec 12, 2022
@YangKeao
Copy link
Member Author

@hawkingrei @lcwangchao Could you help me to merge this PR?

@YangKeao
Copy link
Member Author

/hold

I found a bug 🤦

Signed-off-by: YangKeao <yangkeao@chunibyo.icu>
@YangKeao
Copy link
Member Author

/unhold

@lcwangchao
Copy link
Collaborator

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

DetailsCommit hash: 41326e0

@YangKeao
Copy link
Member Author

/run-unit-test.

@wuhuizuo
Copy link
Contributor

/run-unit-test

1 similar comment
@wuhuizuo
Copy link
Contributor

/run-unit-test

@ti-chi-bot
Copy link
Member

hello from wuhuizuo

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #39830.

@sre-bot
Copy link
Contributor

sre-bot commented Dec 12, 2022

TiDB MergeCI notify

✅ Well Done! New fixed [1] after this pr merged.

CI Name Result Duration Compare with Parent commit
idc-jenkins-ci-tidb/common-test 🔴 failed 1, success 10, total 11 11 min Existing failure
idc-jenkins-ci-tidb/integration-compatibility-test ✅ all 1 tests passed 3 min 22 sec Fixed
idc-jenkins-ci-tidb/integration-common-test 🟢 all 17 tests passed 22 min Existing passed
idc-jenkins-ci/integration-cdc-test 🟢 all 40 tests passed 19 min Existing passed
idc-jenkins-ci-tidb/tics-test 🟢 all 1 tests passed 6 min 20 sec Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2 🟢 all 28 tests passed 6 min 15 sec Existing passed
idc-jenkins-ci-tidb/integration-ddl-test 🟢 all 6 tests passed 4 min 51 sec Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 🟢 all 26 tests passed 4 min 20 sec Existing passed
idc-jenkins-ci-tidb/mybatis-test 🟢 all 1 tests passed 3 min 49 sec Existing passed
idc-jenkins-ci-tidb/plugin-test 🟢 build success, plugin test success 4min Existing passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects-6.5 This bug affects the 6.5.x(LTS) versions. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TTL job manager to schedule the pending jobs periodically

6 participants