Skip to content

sql: fix the sql.trace.stmt.enable_threshold cluster setting #75739

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
andreimatei:tracing.fix-stmt-threshold
Feb 1, 2022
Merged

sql: fix the sql.trace.stmt.enable_threshold cluster setting #75739
craig[bot] merged 2 commits intocockroachdb:masterfrom
andreimatei:tracing.fix-stmt-threshold

Conversation

@andreimatei
Copy link
Copy Markdown
Contributor

This setting had rotted, leading to a span use-after-Finish.

This was caught by the recently-unskipped follower reads roachtests
(roachtests panic on use-after-finish), accounting for all the issues
below. The last entry in these issues refers to the nodes crashing
because of the bug fixed here. Some of these issues have been open for a
long time for more relevant failures - those have been fixed in #72296.

Fixes #75716
Fixes #75715
Fixes #75714
Fixes #71244
Fixes #71126
Fixes #70818
Fixes #70191
Fixes #70011
Fixes #70010
Fixes #69952
Fixes #69951

Release note: None

@andreimatei andreimatei requested a review from rafiss January 31, 2022 16:48
@andreimatei andreimatei requested a review from a team as a code owner January 31, 2022 16:48
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good overall, just had small comments

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @rafiss)


pkg/sql/conn_executor_exec.go, line 722 at r1 (raw file):

		}

		ctx = origCtx

instead of needing to remember to reset back to origCtx (and add a new {} block), can we replace the whole block with

		spanCtx := ctx
		if !alreadyRecording && stmtTraceThreshold > 0 {
			spanCtx, stmtThresholdSpan = createRootOrChildSpan(ctx, "trace-stmt-threshold", ex.transitionCtx.tracer, tracing.WithRecording(tracing.RecordingVerbose))
		}

		if err := ex.dispatchToExecutionEngine(spanCtx, p, res); err != nil {
			stmtThresholdSpan.Finish()
			return nil, nil, err
		}


pkg/sql/trace_test.go, line 646 at r1 (raw file):

	ctx := context.Background()
	settings := cluster.MakeTestingClusterSettings()
	sql.TraceStmtThreshold.Override(ctx, &settings.SV, 1*time.Nanosecond)

if this test were in sql instead of sql_test, could we avoid exporting TraceStmtThreshold? (is it better to keep that private?)

@andreimatei andreimatei force-pushed the tracing.fix-stmt-threshold branch from e43fb09 to 095e3a5 Compare January 31, 2022 17:47
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jan 31, 2022

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Jan 31, 2022
Copy link
Copy Markdown
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/conn_executor_exec.go, line 722 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

instead of needing to remember to reset back to origCtx (and add a new {} block), can we replace the whole block with

		spanCtx := ctx
		if !alreadyRecording && stmtTraceThreshold > 0 {
			spanCtx, stmtThresholdSpan = createRootOrChildSpan(ctx, "trace-stmt-threshold", ex.transitionCtx.tracer, tracing.WithRecording(tracing.RecordingVerbose))
		}

		if err := ex.dispatchToExecutionEngine(spanCtx, p, res); err != nil {
			stmtThresholdSpan.Finish()
			return nil, nil, err
		}

good idea, done


pkg/sql/trace_test.go, line 646 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

if this test were in sql instead of sql_test, could we avoid exporting TraceStmtThreshold? (is it better to keep that private?)

I thought about doing that, but I think it's fine for cluster settings to be exported, to benefit tests that want to set them through this mechanism. Just like anyone can run the set cluster setting command, similarly I think anyone should be able to do what we're doing here.

@andreimatei andreimatei removed the O-community Originated from the community label Jan 31, 2022
Copy link
Copy Markdown
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @rafiss)


pkg/sql/conn_executor_exec.go, line 710 at r3 (raw file):

	alreadyRecording := ex.transitionCtx.sessionTracing.Enabled()
	stmtTraceThreshold := TraceStmtThreshold.Get(&ex.planner.execCfg.Settings.SV)
	var stmtCtx context.Context

could initialize as stmtCtx := ctx to avoid the hanging else


pkg/sql/trace_test.go, line 646 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I thought about doing that, but I think it's fine for cluster settings to be exported, to benefit tests that want to set them through this mechanism. Just like anyone can run the set cluster setting command, similarly I think anyone should be able to do what we're doing here.

makes sense

This setting had rotted, leading to a span use-after-Finish.

This was caught by the recently-unskipped follower reads roachtests
(roachtests panic on use-after-finish), accounting for all the issues
below. The last entry in these issues refers to the nodes crashing
because of the bug fixed here. Some of these issues have been open for a
long time for more relevant failures - those have been fixed in cockroachdb#72296.

Fixes cockroachdb#75716
Fixes cockroachdb#75715
Fixes cockroachdb#75714
Fixes cockroachdb#71244
Fixes cockroachdb#71126
Fixes cockroachdb#70818
Fixes cockroachdb#70191
Fixes cockroachdb#70011
Fixes cockroachdb#70010
Fixes cockroachdb#69952
Fixes cockroachdb#69951

Release note: None
@andreimatei andreimatei force-pushed the tracing.fix-stmt-threshold branch from 095e3a5 to c8069ad Compare January 31, 2022 22:02
Copy link
Copy Markdown
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @rafiss)


pkg/sql/conn_executor_exec.go, line 710 at r3 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

could initialize as stmtCtx := ctx to avoid the hanging else

in this case, I preferred this more explicit way of writing it. By not assigning an initial value to stmtCtx, the code suggests that something funky. Whenever mucking with contexts, I found it good to highlight it.

Copy link
Copy Markdown
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @rafiss)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 1, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 1, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 1, 2022

Build succeeded:

@craig craig bot merged commit 19a5d53 into cockroachdb:master Feb 1, 2022
@andreimatei andreimatei deleted the tracing.fix-stmt-threshold branch February 8, 2022 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment