Skip to content

sql: create sql stats compaction scheduled job#68401

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
Azhng:sqlstats-gc
Aug 27, 2021
Merged

sql: create sql stats compaction scheduled job#68401
craig[bot] merged 1 commit intocockroachdb:masterfrom
Azhng:sqlstats-gc

Conversation

@Azhng
Copy link
Copy Markdown
Contributor

@Azhng Azhng commented Aug 3, 2021

Depends on #68434, #69346

sql: create sql stats compaction scheduled job

This commit introduces SQL Stats Compaction Scheduled job, where it runs
in an hourly schedule and invokes the SQL Stats Compaction Job that was
created in the previous commit.
This commit also introduces crdb_internal.schedule_sql_stats_compaction()
builtin as a way to manually create SQL Stats compaction schedule.
SHOW SCHEDULES command'syntax is also extended to support
SHOW SCHEDUELS FOR SQL STATISTICS.

Release justification: category 4

Release note (sql change): introduce
crdb_internal.schedule_sql_stats_compaction() to manually create SQL Stats
compaction schedule. Extend SHOW SCHEDULES command to support
SHOW SCHEDULES FOR SQL STATISTICS.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@Azhng Azhng changed the title [draft][wip] sql: implement sql stats compaction [draft][wip] sql: implement barebone sql stats clean up Aug 3, 2021
@Azhng Azhng force-pushed the sqlstats-gc branch 5 times, most recently from 3a27499 to 069d2af Compare August 4, 2021 17:49
@Azhng Azhng changed the title [draft][wip] sql: implement barebone sql stats clean up sql: implement barebone sql stats clean up Aug 4, 2021
@Azhng Azhng changed the title sql: implement barebone sql stats clean up sql: create sql stats compaction scheduled job Aug 4, 2021
@Azhng Azhng force-pushed the sqlstats-gc branch 2 times, most recently from 665575d to 3b685ee Compare August 4, 2021 20:17
@Azhng Azhng requested a review from a team August 4, 2021 20:20
@Azhng Azhng marked this pull request as ready for review August 4, 2021 20:20
@Azhng Azhng requested a review from a team August 4, 2021 20:20
@Azhng Azhng force-pushed the sqlstats-gc branch 2 times, most recently from dcbb44f to e5ad236 Compare August 11, 2021 13:53
@ajwerner
Copy link
Copy Markdown
Contributor

@cameronnunez you should look at this PR. It seems to elegantly use the scheduled jobs infrastructure to periodically run a task.

@cameronnunez
Copy link
Copy Markdown
Contributor

@cameronnunez you should look at this PR. It seems to elegantly use the scheduled jobs infrastructure to periodically run a task.

@ajwerner thanks for the mention!

@Azhng Azhng requested review from a team and nihalpednekar and removed request for a team August 12, 2021 02:10
@Azhng Azhng force-pushed the sqlstats-gc branch 3 times, most recently from 9effc03 to 1b8ce35 Compare August 13, 2021 00:04
@Azhng Azhng requested a review from a team as a code owner August 13, 2021 00:04
@Azhng Azhng requested a review from a team August 13, 2021 00:04
Copy link
Copy Markdown
Contributor Author

@Azhng Azhng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @miretskiy)


pkg/sql/delegate/show_schedules.go, line 61 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

i don't think we have tests for this delegate... I assume you've verified you can correctly display schedule information.

Hmm I dig through the parser code, and I think this is actually dead code. Since I would need to introduce a new keyword in the parser grammer to represent the sql stats compaction ExecutorType.

As of now, SHOW SCHEDULES and SHOW SCHEDULE <schedule> work without problem.

I'm not too sure what's our guideline in introducing new SQL keywords in our grammar. @jordanlewis do you know if we have any guideline for this?


pkg/sql/sqlstats/persistedsqlstats/scheduled_job_monitor.go, line 173 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

does this need to be exported?

It's mostly because the testing code lives in the *_test package, which makes accessing this method inconvenient. I'm slightly leaning against simply stubbing out the input *jobs.ScheduledJob since then we would not have the opportunity to end-to-end test the use case where user pause the job or change the schedule expr through cluster setting.


pkg/sql/sqlstats/persistedsqlstats/scheduled_job_monitor.go, line 181 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I wonder if the error messages returned here should be more descriptive... Maybe say something to the effect of having these anomalizes.. (e.g. "disable stats compaction may cause performance to be bad"... etc)..

added descriptive log message in the top level logging code.


pkg/sql/sqlstats/persistedsqlstats/scheduled_sql_stats_compaction_test.go, line 180 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

it would be nice to stick in a "show schedule $1" -- just to check we don't blow up.
Extra points if you can validate correctness.

Done.


pkg/sql/sqlstats/persistedsqlstats/scheduled_sql_stats_compaction_test.go, line 193 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

perhaps also load schedule itself; verify it's recurrence is what you expect.

Done.


pkg/sql/sqlstats/persistedsqlstats/scheduled_sql_stats_compaction_test.go, line 206 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

perhaps add a test case when schedule next run is set via direct update system.scheduled_jobs -- you should still detect anomaly.

Hmm. It seems like if user directly manipulates the schedule_expr column in the system.scheduled_jobs table, the next_run column will not be updated as effect.

This feels very dangerous and I'm not sure if this is something we should discourage user from doing at all.

@miretskiy miretskiy requested a review from jordanlewis August 25, 2021 11:40
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 28 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @jordanlewis, and @miretskiy)


pkg/sql/sqlstats/persistedsqlstats/scheduled_job_monitor.go, line 99 at r6 (raw file):

			}
			if err = CheckScheduleAnomaly(sj); err != nil {
				log.Warningf(ctx, "schedule anomaly detected, disable sql stats compaction may cause performance impact: %s", err)

nit: s/disable/disabled/


pkg/sql/sqlstats/persistedsqlstats/scheduled_sql_stats_compaction_test.go, line 206 at r3 (raw file):

Previously, Azhng (Archer Zhang) wrote…

Hmm. It seems like if user directly manipulates the schedule_expr column in the system.scheduled_jobs table, the next_run column will not be updated as effect.

This feels very dangerous and I'm not sure if this is something we should discourage user from doing at all.

Well, for better or worse (well, for worse for sure), we don't have "update schedule" command. We do have pause schedule ; but the users can also update table and set next run = null. So... I think, yes, it's discouraged; and in general our users don't go updating system tables directly, but it's possible. And I believe your anomaly code detects this just fine. The NextRun() returned from the schedule will reflect the value in the next_run column, so, a direct update should be detected by your code.

@miretskiy miretskiy self-requested a review August 25, 2021 11:41
Copy link
Copy Markdown
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 22 of 28 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @jordanlewis, and @miretskiy)


pkg/clusterversion/cockroach_versions.go, line 529 at r6 (raw file):

	{
		Key:     SQLStatsCompactionScheduledJob,
		Version: roachpb.Version{Major: 21, Minor: 1, Internal: 150},

shouldn't be minor:2 ?


pkg/sql/compact_sql_stats.go, line 103 at r6 (raw file):

// maybeNotifyJobTerminated will notify the job termination
// (with termination status)

nit: period


pkg/sql/compact_sql_stats.go, line 153 at r6 (raw file):

var _ metric.Struct = &sqlStatsCompactionMetrics{}

// MetricStruct implements metric.Struct interface

nit: period


pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 64 at r6 (raw file):

).WithPublic()

// SQLStatsCleanupRecurrence is the cron-tab string specifying the the recurrence

nit: typo the the


pkg/sql/sqlstats/persistedsqlstats/provider.go, line 79 at r6 (raw file):

	lastFlushStarted time.Time

nit: remove blank line

Copy link
Copy Markdown
Contributor Author

@Azhng Azhng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @maryliag, and @miretskiy)


pkg/clusterversion/cockroach_versions.go, line 529 at r6 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

shouldn't be minor:2 ?

I think for some historical reasons we just continue using 21.1 for the versioning.


pkg/sql/compact_sql_stats.go, line 103 at r6 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: period

Done.


pkg/sql/compact_sql_stats.go, line 153 at r6 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: period

Done.


pkg/sql/sqlstats/persistedsqlstats/cluster_settings.go, line 64 at r6 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: typo the the

Done.


pkg/sql/sqlstats/persistedsqlstats/provider.go, line 79 at r6 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: remove blank line

Done.

@Azhng
Copy link
Copy Markdown
Contributor Author

Azhng commented Aug 25, 2021

@miretskiy for some reason reviewable is not picking up your comment.

nit: s/disable/disabled/

Done.

Well, for better or worse (well, for worse for sure), we don't have "update schedule" command. We do have pause schedule ; but the users can also update table and set next run = null. So... I think, yes, it's discouraged; and in general our users don't go updating system tables directly, but it's possible. And I believe your anomaly code detects this just fine. The NextRun() returned from the schedule will reflect the value in the next_run column, so, a direct update should be detected by your code.

Updated the test.

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 28 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @jordanlewis, @maryliag, and @miretskiy)

@Azhng Azhng force-pushed the sqlstats-gc branch 2 times, most recently from 2687616 to 4baf092 Compare August 26, 2021 04:51
Copy link
Copy Markdown
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 10 files at r7, 16 of 28 files at r9, 13 of 15 files at r10, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @jordanlewis, @maryliag, and @miretskiy)

@Azhng
Copy link
Copy Markdown
Contributor Author

Azhng commented Aug 26, 2021

Discussed with @jordanlewis offline. Extended SHOW SCHEDULES command to support SHOW SCHEDULES FOR SQL STATISTICS in addition to SHOW SCHEDULES FOR BACKUP.

root@:26257/defaultdb> show schedules for sql statistics;
          id         |        label         | schedule_status |        next_run        |   state   | recurrence | jobsrunning | owner |            created
---------------------+----------------------+-----------------+------------------------+-----------+------------+-------------+-------+--------------------------------
  687722807522623489 | sql-stats-compaction | ACTIVE          | 2021-08-26 15:00:00+00 | succeeded | @hourly    |           0 | node  | 2021-08-26 02:59:04.841398+00
(1 row)

@maryliag can you take another look:

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 15 files at r10, 5 of 12 files at r11.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Azhng, @jordanlewis, and @maryliag)


pkg/sql/delegate/show_schedules.go, line 60 at r11 (raw file):

	case tree.ScheduledSQLStatsCompactionExecutor:
		whereExprs = append(whereExprs, fmt.Sprintf(
			"executor_type = '%s'", tree.ScheduledSQLStatsCompactionExecutor.InternalName()))

It's okay to not include "command" here -- I guess there is only 1.
But I wonder, would you ever want to show some additional information? Perhaps some stats about stats compactor job run? You can, for example, update this schedule upon completion of scheduled job with some interesting data.

Anyway, this is just a suggestion; and definitely does not need to happen now.


pkg/sql/parser/sql.y, line 5368 at r11 (raw file):

  {
    $$.val = tree.ScheduledSQLStatsCompactionExecutor
  }

You should probably update pkg/sql/parser/testdata/control_job to include this new flavor.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Azhng, @jordanlewis, and @maryliag)


pkg/sql/delegate/show_schedules.go, line 55 at r11 (raw file):

	case tree.ScheduledBackupExecutor:
		whereExprs = append(whereExprs, fmt.Sprintf(
			"executor_type = '%s'", tree.ScheduledBackupExecutor.InternalName()))

Throughout, when inserting a SQL string inside SQL syntax, use lex.EncodeSQLString.

Copy link
Copy Markdown
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new changes introducing SHOW SCHEDULES FOR SQL STATISTICS look good. You should add some test to them similar to the ones on create_scheduled_backup_test

Reviewed 1 of 28 files at r9, 1 of 15 files at r10, 7 of 12 files at r11, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Azhng and @jordanlewis)

Copy link
Copy Markdown
Contributor Author

@Azhng Azhng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Azhng, @jordanlewis, @knz, @maryliag, and @miretskiy)


pkg/sql/delegate/show_schedules.go, line 55 at r11 (raw file):

Previously, knz (kena) wrote…

Throughout, when inserting a SQL string inside SQL syntax, use lex.EncodeSQLString.

Discussed offline and filed #69428


pkg/sql/delegate/show_schedules.go, line 60 at r11 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

It's okay to not include "command" here -- I guess there is only 1.
But I wonder, would you ever want to show some additional information? Perhaps some stats about stats compactor job run? You can, for example, update this schedule upon completion of scheduled job with some interesting data.

Anyway, this is just a suggestion; and definitely does not need to happen now.

Good point, filed #69429


pkg/sql/parser/sql.y, line 5368 at r11 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

You should probably update pkg/sql/parser/testdata/control_job to include this new flavor.

Done.

This commit introduces SQL Stats Compaction Scheduled job, where it runs
in an hourly schedule and invokes the SQL Stats Compaction Job that was
created in the previous commit.
This commit also introduces `crdb_internal.schedule_sql_stats_compaction()`
builtin as a way to manually create SQL Stats compaction schedule.
`SHOW SCHEDULES` command'syntax is also extended to support
`SHOW SCHEDUELS FOR SQL STATISTICS`.

Release justification: category 4

Release note (sql change): introduce
crdb_internal.schedule_sql_stats_compaction() to manually create SQL Stats
compaction schedule. Extend SHOW SCHEDULES command to support
SHOW SCHEDULES FOR SQL STATISTICS.
@Azhng
Copy link
Copy Markdown
Contributor Author

Azhng commented Aug 27, 2021

TFTR!

bors r=miretskiy,maryliag

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 27, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants