Skip to content

sql: rewrite sql stats compaction job to avoid scanning mvcc garbage#80341

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
Azhng:sqlstats-faster-prune
Apr 28, 2022
Merged

sql: rewrite sql stats compaction job to avoid scanning mvcc garbage#80341
craig[bot] merged 1 commit intocockroachdb:masterfrom
Azhng:sqlstats-faster-prune

Conversation

@Azhng
Copy link
Copy Markdown
Contributor

@Azhng Azhng commented Apr 21, 2022

Related to #79548

Previously, SQL Stats cleanup job's performance could severely degrade
and cannot keep up with the write load, since its DELETE statements
often needed to scan over large range of MVCC garbage.
This commit addresses this issue by further constraining the scans of
the DELETE statements to reduce how much MVCC garbage it scans over.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@Azhng Azhng force-pushed the sqlstats-faster-prune branch from 23988fb to 5db1ea6 Compare April 22, 2022 20:14
@Azhng Azhng requested a review from a team April 22, 2022 22:06
@Azhng Azhng marked this pull request as ready for review April 22, 2022 22:06
@Azhng Azhng requested review from ajwerner and mgartner April 22, 2022 22:07
@Azhng Azhng mentioned this pull request Apr 26, 2022
5 tasks
Copy link
Copy Markdown
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 5 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @mgartner)


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 27 at r1 (raw file):

//  value.
var (
	// N.B.: statement and transaction statistics system tables has hash sharded

nit: have


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 69 at r1 (raw file):

				(
		      aggregated_ts,
					fingerprint_id,

nit: alignment


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 73 at r1 (raw file):

		      plan_hash,
		      app_name,
		      node_id                   

nit: remove trailing space


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 89 at r1 (raw file):

				(
		      aggregated_ts,
					fingerprint_id,

nit: alignment


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 91 at r1 (raw file):

					fingerprint_id,
		      app_name,
		      node_id                   

nit: remove trailing space


pkg/sql/sqlstats/persistedsqlstats/compaction_test.go line 216 at r1 (raw file):

				// The two interceptors (kvInterceptor and cleanupInterceptor) are
				// injected into kvserver and StatsCompactor respectively.
				// The cleanupInterceptor calculates number of expected "wide scan"

nit: calculates the number


pkg/sql/sqlstats/persistedsqlstats/compaction_test.go line 218 at r1 (raw file):

				// The cleanupInterceptor calculates number of expected "wide scan"
				// that should be issued by the StatsCompactor.
				// The kvInterceptor counts number of actual "wide scan" KV Request

nit: counts the number

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @Azhng)


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 168 at r1 (raw file):

		for remainToBeRemoved := rowsToRemove; remainToBeRemoved > 0; {
			if len(lastDeletedRow) > 0 {
				stmt = ops.subsequentDeleteStmt

nit: It might be clearer to create a method of cleanupOperations that returns the stmt given lastDeletedRow - if lastDeletedRow was empty it could return the wide version, otherwise the constrained version.


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 224 at r1 (raw file):

func getQargs(shardIdx, limit int64, lastDeletedRow tree.Datums) []interface{} {
	qargs := make([]interface{}, 0, len(lastDeletedRow)+2)

nit: you could make this a scratch field in StatsCompactor to reduce allocations.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 117 at r1 (raw file):

		}
	case "system.transaction_statistics":
		return &cleanupOperations{

You could defined these structs are singletons that can be accessed directly without the need of this cleanupOpForTable helper. I think it'd be nice to have the constrained and unconstrained queries side by side for reference.

@Azhng Azhng force-pushed the sqlstats-faster-prune branch 3 times, most recently from 68fc6e1 to b5f6d14 Compare April 26, 2022 22:30
Copy link
Copy Markdown
Contributor Author

@Azhng Azhng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also moved query_utils.go back into compaction_exec.go

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @maryliag, and @mgartner)


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 168 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: It might be clearer to create a method of cleanupOperations that returns the stmt given lastDeletedRow - if lastDeletedRow was empty it could return the wide version, otherwise the constrained version.

Done.


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 224 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: you could make this a scratch field in StatsCompactor to reduce allocations.

Done.


pkg/sql/sqlstats/persistedsqlstats/compaction_test.go line 216 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: calculates the number

Done.


pkg/sql/sqlstats/persistedsqlstats/compaction_test.go line 218 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: counts the number

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 27 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: have

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 69 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: alignment

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 73 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: remove trailing space

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 89 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: alignment

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 91 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: remove trailing space

Done.


pkg/sql/sqlstats/persistedsqlstats/query_utils.go line 117 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

You could defined these structs are singletons that can be accessed directly without the need of this cleanupOpForTable helper. I think it'd be nice to have the constrained and unconstrained queries side by side for reference.

Done.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @Azhng)


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 165 at r2 (raw file):

) error {
	var err error
	lastDeletedRow := tree.Datums{}

nit: var lastDeletedRow tree.Datums should be sufficient here.


pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 284 at r2 (raw file):

	deleteFromStmtStatsWithWideScan = `
    DELETE FROM system.statement_statistics

nit: you could move these strings into the structs defined below to avoid a reader having to follow a trail to make sense of this.

Copy link
Copy Markdown
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @ajwerner and @Azhng)

Related to cockroachdb#79548

Previously, SQL Stats cleanup job's performance could severely degrade
and cannot keep up with the write load, since its `DELETE` statements
often needed to scan over large range of MVCC garbage.
This commit addresses this issue by further constraining the scans of
the `DELETE` statements to reduce how much MVCC garbage it scans over.

Release note: None
@Azhng
Copy link
Copy Markdown
Contributor Author

Azhng commented Apr 28, 2022

TFTR!

bors r=maryliag,mgartner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 28, 2022

🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

@rickystewart
Copy link
Copy Markdown
Collaborator

bors r=maryliag,mgartner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 28, 2022

Stopped waiting for PR status (Github check) without running due to duplicate requests to run. You may check Bors to see that this PR is included in a batch by one of the other requests.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 28, 2022

Build succeeded:

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Apr 28, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from b01d45e to blathers/backport-release-21.2-80341: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants