Skip to content

util/tracing,sql: add builtin to set trace spans' verbosity#61353

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
angelapwen:add-verbosity-toggle-builtin
Mar 6, 2021
Merged

util/tracing,sql: add builtin to set trace spans' verbosity#61353
craig[bot] merged 1 commit intocockroachdb:masterfrom
angelapwen:add-verbosity-toggle-builtin

Conversation

@angelapwen
Copy link
Copy Markdown

@angelapwen angelapwen commented Mar 2, 2021

Resolves #61180.

Previously there was no way to change a span's verbosity via the
SQL shell. We want to be able to set a specific long-running
span's verbosity on to retrieve its recordings. This patch adds a
builtin, crdb_internal.set_trace_verbose that takes in a
trace ID and a bool representing verbose or not
verbose. It sets the verbosity of all spans in this trace.

Note that we would prefer to toggle individual span verbosity,
but this would require a registry of Span objects that
is not added to the 21.1 release. If this Span registry were added
in the future, we could access a Span given its span ID.

Release justification: Adds a crdb_internal tool meant for on-call
engineers, TSEs, etc to debug.

Release note (sql change): Adds a new builtin that sets the verbosity
of all spans in a given trace. Syntax:
crdb_internal.set_trace_verbose($traceID,$verbosityAsBool).

@angelapwen angelapwen self-assigned this Mar 2, 2021
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch 2 times, most recently from 314412d to 0ff746d Compare March 2, 2021 18:04
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 7 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angelapwen, @irfansharif, @knz, and @tbg)


pkg/sql/tests/tracing_sql_test.go, line 38 at r1 (raw file):

Quoted 5 lines of code…
r.Exec(
		t,
		"SELECT * FROM crdb_internal.set_verbosity_of_root_span_and_descendants($1, true)",
		rootID,
	)

It would be nice to check the return value here, both for a hit and a miss (we sort of do this in the SQL logic test, but only for the negative case).

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 0ff746d to eb11088 Compare March 2, 2021 18:38
@angelapwen
Copy link
Copy Markdown
Author


pkg/sql/tests/tracing_sql_test.go, line 38 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…
r.Exec(
		t,
		"SELECT * FROM crdb_internal.set_verbosity_of_root_span_and_descendants($1, true)",
		rootID,
	)

It would be nice to check the return value here, both for a hit and a miss (we sort of do this in the SQL logic test, but only for the negative case).

Done. Thanks for the suggestion! It looked like SQLRunner.Exec didn't surface the right return value (just LastInsertId and RowsAffected) so played around with SQLRunner.CheckQueryResults which does what we want! 😺

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from eb11088 to 3ee0d2d Compare March 2, 2021 18:43
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @angelapwen, @irfansharif, @knz, and @tbg)


pkg/sql/tests/tracing_sql_test.go, line 38 at r1 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

Done. Thanks for the suggestion! It looked like SQLRunner.Exec didn't surface the right return value (just LastInsertId and RowsAffected) so played around with SQLRunner.CheckQueryResults which does what we want! 😺

Awesome, thanks!

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch 2 times, most recently from 26b6e18 to 1e3e468 Compare March 2, 2021 21:11
Copy link
Copy Markdown
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @angelapwen, @erikgrinaker, @knz, and @tbg)


pkg/sql/sem/builtins/builtins.go, line 3638 at r2 (raw file):

	),

	// TODO(angelapwen): Does it make more sense for this builtin to take a trace ID?

It does, and we should do it. It avoids the UX clarification we're making here around asking the user to specify the root span ID rather than a span ID. The mapping from a trace ID to span ID is a bit manual, and the DB is well suited to do it directly. Especially given the vtable schema associates a trace ID per span. The way we have it now, we're asking the user to walk up the list of parent span IDs to get to the root, which is unnecessary.


pkg/sql/sem/builtins/builtins.go, line 3647 at r2 (raw file):

				{"verbosity", types.Bool},
			},
			ReturnType: tree.FixedReturnType(types.Bool),

Is an erroneous call (for a missing span/trace) best represented by a false return? Can we not do better for built ins? (like, erroring out informatively?)


pkg/util/tracing/tracer.go, line 670 at r2 (raw file):

func (t *Tracer) setAllChildrenSpanVerbosity(rootCrdbSpan *crdbSpan, to bool) {
	// TODO(angelapwen): Do we need to check that root is not a noop the way we do in `SetVerbose`?
	for _, child := range rootCrdbSpan.mu.recording.children {

We're not grabbing the right locks here; this is unsafe (for when children are added and removed from the parent span's list).

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 1e3e468 to 002b1a1 Compare March 2, 2021 21:38
@blathers-crl blathers-crl bot requested a review from irfansharif March 2, 2021 21:38
Copy link
Copy Markdown
Author

@angelapwen angelapwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @angelapwen, @erikgrinaker, @irfansharif, @knz, and @tbg)


pkg/sql/sem/builtins/builtins.go, line 3638 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

It does, and we should do it. It avoids the UX clarification we're making here around asking the user to specify the root span ID rather than a span ID. The mapping from a trace ID to span ID is a bit manual, and the DB is well suited to do it directly. Especially given the vtable schema associates a trace ID per span. The way we have it now, we're asking the user to walk up the list of parent span IDs to get to the root, which is unnecessary.

👍 that makes sense. I'll use this PR as a starting point and modify to take in a trace ID.

I'm thinking about how I would surface a particular trace ID via the test, to pass it in. Perhaps using the crdb_internal.trace_id builtin would surface the correct trace, but I'm not 100% sure on that.


pkg/sql/sem/builtins/builtins.go, line 3647 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Is an erroneous call (for a missing span/trace) best represented by a false return? Can we not do better for built ins? (like, erroring out informatively?)

Yes, we can throw a pgerror. I remembered though that @knz mentioned in the trace_id builtin PR that when there was no active trace span, we should return null instead of throwing an error so the transaction could continue link to comment and so I thought this would be the equivalent.

It is a bit more confusing as the return value is a boolean, so I cannot return null (only true or false) — I meant the boolean to represent "success" which I believe is a common pattern.


pkg/util/tracing/tracer.go, line 670 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

We're not grabbing the right locks here; this is unsafe (for when children are added and removed from the parent span's list).

Done, thanks for catching this!

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 002b1a1 to 4f478cc Compare March 3, 2021 10:41
Copy link
Copy Markdown
Author

@angelapwen angelapwen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @angelapwen, @erikgrinaker, @irfansharif, @knz, and @tbg)


pkg/sql/sem/builtins/builtins.go, line 3638 at r2 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

👍 that makes sense. I'll use this PR as a starting point and modify to take in a trace ID.

I'm thinking about how I would surface a particular trace ID via the test, to pass it in. Perhaps using the crdb_internal.trace_id builtin would surface the correct trace, but I'm not 100% sure on that.

Done. The builtin now takes a trace ID and queries for the root span ID internally.


pkg/util/tracing/tracer.go, line 670 at r2 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

Done, thanks for catching this!

I just realized that adding the locks here resulted in deadlock because child.enableRecording also calls parent.addChild which acquires the lock on the parent span, and that's the span that's locked here.

I don't think we can lock the parent here as is considering we are doing two conflicting things: iterating through its children list and in each iteration, potentially adding a child to the children list.

(As a side note, is there something wrong with the logic here? It seems that for every child in the children list, we are calling enableRecording with a non-nil parent, and the child is redundantly appended to the parent's children list. I think I need to add a check here that only calls enableRecording when verbosity was false originally, and only calls disableRecording() if it was originally true.

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 4f478cc to edf0b5d Compare March 3, 2021 10:59
@angelapwen
Copy link
Copy Markdown
Author


pkg/util/tracing/tracer.go, line 670 at r2 (raw file):

Previously, angelapwen (Angela P Wen) wrote…

I just realized that adding the locks here resulted in deadlock because child.enableRecording also calls parent.addChild which acquires the lock on the parent span, and that's the span that's locked here.

I don't think we can lock the parent here as is considering we are doing two conflicting things: iterating through its children list and in each iteration, potentially adding a child to the children list.

(As a side note, is there something wrong with the logic here? It seems that for every child in the children list, we are calling enableRecording with a non-nil parent, and the child is redundantly appended to the parent's children list. I think I need to add a check here that only calls enableRecording when verbosity was false originally, and only calls disableRecording() if it was originally true.

Ok, I think I have worked this out now — indeed because the child is already recorded as part of this parent, we should pass in nil rather than the parent span in enableRecording — this prevents the child from being redundantly appended, and also prevents the deadlock because the parent's lock no longer needs to be acquired here! 🧠 💥

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch 3 times, most recently from 856cd02 to 12ae9f3 Compare March 3, 2021 11:15
@angelapwen angelapwen changed the title util/tracing,sql: add builtin to toggle span verbosity util/tracing,sql: add builtin to set trace spans' verbosity Mar 3, 2021
@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch 3 times, most recently from 8045dc3 to a640ce1 Compare March 3, 2021 15:58
@angelapwen
Copy link
Copy Markdown
Author


pkg/util/tracing/tracer.go, line 669 at r2 (raw file):

// children.
func (t *Tracer) setAllChildrenSpanVerbosity(rootCrdbSpan *crdbSpan, to bool) {
	// TODO(angelapwen): Do we need to check that root is not a noop the way we do in `SetVerbose`?

I convinced myself that we do not need to check whether the root or the children crdbSpans here are no-ops. If a span has a crdbSpan, it is a real span (and only real spans are recorded in the children list of the recording).

On the other hand, there is a concern around whether the real root span (passed into the SetAllDescendantsSpanVerbosity) is a no-op span. If it is, then rootCrdbSpan may be nil and acquiring its mutex here would crash. So I've added a check as to whether that rootCrdbSpan is nil in SetAllDescendantsSpanVerbosity and am returning false to the builtin if it is.

@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 1778358 to 2f7c05a Compare March 5, 2021 11:33
@blathers-crl blathers-crl bot requested a review from irfansharif March 5, 2021 11:33
@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch 3 times, most recently from 6a97c1a to 31f0227 Compare March 5, 2021 12:38
Copy link
Copy Markdown
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though I think you'll need a @tbg stamp given we're in stability.

Previously there was no way to change a span's verbosity via the
SQL shell. We want to be able to set a specific long-running
span's verbosity on to retrieve its recordings. This patch adds a
builtin, `crdb_internal.set_trace_verbose` that takes in a
trace ID and a bool representing verbose or not
verbose. It sets the verbosity of all spans in this trace.

Note that we would prefer to toggle individual span verbosity,
but this would require a registry of Span objects that
is not added to the 21.1 release. If this Span registry were added
in the future, we could access a Span given its span ID.

Release justification: Adds a crdb_internal tool meant for on-call
engineers, TSEs, etc to debug.

Release note (sql change): Adds a new builtin that sets the verbosity
of all spans in a given trace. Syntax:
crdb_internal.set_trace_verbose($traceID,$verbosityAsBool).
@angelapwen angelapwen force-pushed the add-verbosity-toggle-builtin branch from 31f0227 to d4f9b02 Compare March 5, 2021 16:53
@angelapwen
Copy link
Copy Markdown
Author

bors r=irfansharif,erikgrinaker

@irfansharif
Copy link
Copy Markdown
Contributor

@angelapwen, because of the new span inner file, this will merge conflict with #61359. I'd recommend rebasing on top of that, or just waiting for that to merge first. Otherwise the entire batch will get cancelled. You'll hate me for this, but I'll cancel the bors batch for now.

bors r-

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 5, 2021

Canceled.

@irfansharif
Copy link
Copy Markdown
Contributor

(to make up for it I'm happy to rebase it forward for you - enjoy your weekend!)

@irfansharif irfansharif self-assigned this Mar 5, 2021
@angelapwen
Copy link
Copy Markdown
Author

@irfansharif Hahaha — I'm happy to rebase it when the other PR has merged tomorrow too. Thanks for the head's up 😸

@irfansharif
Copy link
Copy Markdown
Contributor

(you should be good to go)

@angelapwen
Copy link
Copy Markdown
Author

bors r=irfansharif,erikgrinaker

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 6, 2021

Build succeeded:

@craig craig bot merged commit 15a1856 into cockroachdb:master Mar 6, 2021
@angelapwen angelapwen deleted the add-verbosity-toggle-builtin branch March 6, 2021 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: add builtin to toggle span verbosity

5 participants