Skip to content

sql: set default sample_rate to 0.1#61760

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
yuzefovich:sample-rate
Mar 10, 2021
Merged

sql: set default sample_rate to 0.1#61760
craig[bot] merged 2 commits intocockroachdb:masterfrom
yuzefovich:sample-rate

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Mar 10, 2021

sql: do not reset planner if PREPARE is executed as a stmt

Previously, when executing a PREPARE command, we would always reset the
planner. This is not necessary when the origin of the command is SQL
because in that case we properly reset the planner before starting the
statement's execution. This change fixes a problem when PREPARE
statements are chosen to be sampled for execution statistics (without
the fix, we would encounter a nil pointer).

Release note: None

sql: set default sample_rate to 0.1

Fixes: #59379.

Release note (sql change): Default value for sql.txn_stats.sample_rate
cluster setting has been increased from 0 to 0.1. This means that from
now on every statement has 10% probability of being sampled for the
purposes of execution statistics. Note that no other criteria for
sampling (like the query latencies) are currently being utilized to
decide whether to sample a statement or not.

@yuzefovich yuzefovich requested review from a team and asubiotto March 10, 2021 00:51
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)

@yuzefovich
Copy link
Copy Markdown
Member Author

There was a problem with resetPlanner resetting the instrumentation which could lead to nil pointers, so I had to adjust the commit, so it needs another look. Also @RaduBerinde I'd appreciate if you took a look as well.

Another option to going around the issue that I see is introducing resetInstrumentation bool argument to resetPlanner signature, and set that to true only when calling from execStmtInOpenState. Not sure if it would be better. Also curious to hear other ideas.

The problem is that we first set up the instrumentation and decide that it needs to be finished, but then we might call resetPlanner pretty much right away which will overwrite whatever needs to be finished.

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @yuzefovich)


pkg/sql/conn_executor.go, line 2274 at r2 (raw file):

	p.stmt = Statement{}
	// Note that we deliberately do not reset p.instrumentation since we need to
	// have more fine-grained control over resetting that field.

This is surprising.. What field exactly was problematic? I'm worried we may hang on to various things here (e.g. traceMetadata, explainPlan).

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)


pkg/sql/conn_executor.go, line 2274 at r2 (raw file):

Previously, RaduBerinde wrote…

This is surprising.. What field exactly was problematic? I'm worried we may hang on to various things here (e.g. traceMetadata, explainPlan).

instrumentation.sp - now in Finish we have an assumption that sp is non-nil. This is always the case after Setup is called and returns needFinish == true, yet because of the resetting behavior in resetPlanner we would overwrite sp to nil. This was introduced in #61532 as a perf optimization.

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)


pkg/sql/conn_executor.go, line 2274 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

instrumentation.sp - now in Finish we have an assumption that sp is non-nil. This is always the case after Setup is called and returns needFinish == true, yet because of the resetting behavior in resetPlanner we would overwrite sp to nil. This was introduced in #61532 as a perf optimization.

Also because of the current resetting behavior tracing of PREPARE is effectively a noop, and I'm not sure it was the intention.

@yuzefovich
Copy link
Copy Markdown
Member Author

Updated the PR based on the conversation with Radu.

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @RaduBerinde, and @yuzefovich)


pkg/sql/conn_executor_prepare.go, line 204 at r3 (raw file):

		ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
		p := &ex.planner
		if origin != PreparedStatementOriginSQL {

nit: I think this probably deserves a "why" comment

Previously, when executing a PREPARE command, we would always reset the
planner. This is not necessary when the origin of the command is SQL
because in that case we properly reset the planner before starting the
statement's execution. This change fixes a problem when PREPARE
statements are chosen to be sampled for execution statistics (without
the fix, we would encounter a nil pointer).

Release note: None
Release note (sql change): Default value for `sql.txn_stats.sample_rate`
cluster setting has been increased from 0 to 0.1. This means that from
now on every statement has 10% probability of being sampled for the
purposes of execution statistics. Note that no other criteria for
sampling (like the query latencies) are currently being utilized to
decide whether to sample a statement or not.
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @RaduBerinde)


pkg/sql/conn_executor_prepare.go, line 204 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

nit: I think this probably deserves a "why" comment

Good point, done.

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @RaduBerinde)

@yuzefovich
Copy link
Copy Markdown
Member Author

The CI seems to be red because of unrelated flakes in KV, so I'll go ahead and merge this.

TFTRs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 10, 2021

Build succeeded:

@craig craig bot merged commit 7a26119 into cockroachdb:master Mar 10, 2021
@yuzefovich yuzefovich deleted the sample-rate branch March 10, 2021 22:50
@rafiss rafiss added this to the 21.1 milestone Apr 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: evaluate sql.txn_stats.sample_rate default

5 participants