Skip to content

log,debug: fix logspy, new HTTP API to change vmodule#61378

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
miretskiy:log_intercept
Jun 10, 2021
Merged

log,debug: fix logspy, new HTTP API to change vmodule#61378
craig[bot] merged 2 commits intocockroachdb:masterfrom
miretskiy:log_intercept

Conversation

@miretskiy
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy commented Mar 2, 2021

Fixes #61335.
First 3 commits from #66096.

tldr: the logspy endpoint does not enable maximum verbosity any more, and now returns JSON entries. Use the new vmodule endpoint to configure verbosity if needed.

See the commit messages of the last 2 commits for details and rationales.

Relevant user-facing changes:

Release note (security update): All the logging output to files
or network sinks was previously disabled temporarily while an operator
was using the /debug/logspy HTTP API, resulting in lost entries
and a breach of auditability guarantees. This behavior has been corrected.

Release note (bug fix): Log entries are not lost any more while the
/debug/logspy HTTP API is being used. This bug had existed since
CockroachDB v1.1.

Release note (api change): The /debug/logspy HTTP API has changed.
The endpoint now returns JSON data by default.
This change is motivated as follows:

  • the previous format, crdb-v1, cannot be parsed reliably.
  • using JSON entries guarantees that the text of each entry
    all fits on a single line of output (newline characters
    inside the messages are escaped). This makes filtering
    easier and more reliable.
  • using JSON enables the user to apply jq on the output, for
    example via curl -s .../debug/logspy | jq ...

If the previous format is desired, the user can pass the query
argument &flatten=1 to the logspy URL to obtain the previous flat
text format (crdb-v1) instead.

Release note (api change): The /debug/logspy API does not any more
enable maximum logging verbosity automatically. To change the
verbosity, use the new /debug/vmodule endpoint or pass the
&vmodule= query parameter to the /debug/logspy endpoint.

For example, suppose you wish to run a 20s logspy session:

  • Before:

    curl 'https://.../debug/logspy?duration=20s&...'
    
  • Now:

    curl 'https://.../debug/logspy?duration=20s&vmodule=...'
    

    OR

    curl 'https://.../debug/vmodule?duration=22s&vmodule=...'
    curl 'https://.../debug/logspy?duration=20s'
    

As for the regular vmodule command-line flag, the maximum verbosity
across all the source code can be selected with the pattern *=4.

Note: at most one in-flight HTTP API request is allowed to modify the
vmodule parameter. This maintain the invariant that the
configuration restored at the end of each request is the same as when
the request started.

Release note (api change): The new /debug/vmodule API makes it
possible for an operator to configure the logging verbosity in a
similar way as the SQL built-in function
crdb_internal.set_vmodule(), or to query the current configuration
as in crdb_internal.get_vmodule(). Additionally, any configuration
change performed via this API can be automatically reverted after a
configurable delay. The API forms are:

  • /debug/vmodule - retrieve the current configuration.
  • /debug/vmodule?set=[vmodule config]&duration=[duration] - change
    the configuration to [vmodule config] . The previous configuration
    at the time the /debug/vmodule request started is restored after
    [duration]. This duration, if not specified, defaults to twice the
    default duration of a logspy request (currently, the logspy
    default duration is 5s, so the vmodule default duration is 10s).
    If the duration is zero or negative, the previous configuration
    is never restored.

@miretskiy miretskiy requested a review from knz March 2, 2021 23:54
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@miretskiy miretskiy marked this pull request as draft March 2, 2021 23:54
Copy link
Copy Markdown
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! I left some comments from my initial read through.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @miretskiy)


pkg/util/log/intercept.go, line 59 at r1 (raw file):

type InterceptMessageFn func(message []byte)

// interceptorsPtr stores a pointer to the list of currently active interceptors.

Should this comment be on the line below?


pkg/util/log/intercept.go, line 72 at r1 (raw file):

// interceptorID returns the ID for the specified interceptor.
func interceptorID(interceptor *InterceptMessageFn) InterceptorID {
	return InterceptorID(unsafe.Pointer(interceptor))

I will likely expose some ignorance of Go internals here, but if I'm reading the documentation right, comparing the uintptr's works today for heap-allocated objects but isn't guaranteed by the language spec (and I think wouldn't work for stack-allocated objects: https://play.golang.org/p/0_f3MAp96JL).

Not recommending a change, just talking through my understanding. I believe this will work on our current version of Go.


pkg/util/log/intercept.go, line 99 at r1 (raw file):

	}
	if yank == -1 {
		panic("cannot remove interceptor")

Might be worth a comment on why we want this to be a hard panic vs a no-op (perhaps just returning l)?


pkg/util/log/intercept.go, line 129 at r1 (raw file):

			newInterceptorsPtr = &newInterceptors
		}
		if atomic.CompareAndSwapPointer(&interceptorsPtr, addr, unsafe.Pointer(newInterceptorsPtr)) {

Question: Would it be worth a comment on why a mutex wouldn't suffice here?

@miretskiy miretskiy requested a review from stevendanna March 3, 2021 18:04
Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz, @miretskiy, and @stevendanna)


pkg/util/log/intercept.go, line 72 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

I will likely expose some ignorance of Go internals here, but if I'm reading the documentation right, comparing the uintptr's works today for heap-allocated objects but isn't guaranteed by the language spec (and I think wouldn't work for stack-allocated objects: https://play.golang.org/p/0_f3MAp96JL).

Not recommending a change, just talking through my understanding. I believe this will work on our current version of Go.

It's unlikely you're ignorant in Go -- I'd say I am... I was looking for weak_ptr in Go -- that kinda shows the level of ignorance I have :)

So, this particular code, I was going back and forth, at least 4 or 5 times...
There are few things at play here:

  1. (and that's the most important) Interceptors should have "id" associated with them so that
    I have tried using a pointer itself as an ID, but found it not very satisfying.
  2. To support add/remove as a pointer value, I have to pass an implementation of a (now deleted) LogMessageInterceptor interface. But interface comparisons are tricky. In particular, you could have different
    interceptors, but as long as their underlying implementation is the same (i.e. empty struct), then interface value comparison would not work.
  3. I felt (not strongly) that having an interface might be an overkill, where a function would work (note, you cannot compare functions).

So, taking these together, i'm back to my original desire: I need a unique ID for an interceptor.
One way: just maybe use time, or some such, and redo things in a loop when adding interceptor. And maybe, that's a good way to go.

The above works (and maybe it's too smart for its own good) because to add an interceptor, you pass in an InterceptMessageFn (note, not a pointer) by value to the AddInterceptor method. Thereafter, I take the address of that stack allocated parameter, and If I understand correctly, that parameter "escapes to heap" (because I will save it inside my interceptorList). As long as that object is in use, nothing should be stored at that memory address -- so I think taking uintptr address of that value is okay.

But then, maybe I'm entirely wrong...


pkg/util/log/intercept.go, line 129 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Question: Would it be worth a comment on why a mutex wouldn't suffice here?

Having mutex means you have to lock to check if you have interceptors. We don't want to lock every time you log.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish to review this more in depth but note that I am on PTO at this moment. So please give me some time to process this.

My initial remark though, in response to the ongoing conversation in the review thread, is that you made this overly complicated. IMHO you could have written type InterceptorID *interceptorSink and that would have made your API simpler. We'll discuss more when I have a chance to scrutinize this further.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @stevendanna)

@miretskiy
Copy link
Copy Markdown
Contributor Author

miretskiy commented Mar 3, 2021 via email

@miretskiy miretskiy requested a review from knz March 5, 2021 00:23
Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)


pkg/util/log/intercept.go, line 59 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Should this comment be on the line below?

obsolete.


pkg/util/log/intercept.go, line 72 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

It's unlikely you're ignorant in Go -- I'd say I am... I was looking for weak_ptr in Go -- that kinda shows the level of ignorance I have :)

So, this particular code, I was going back and forth, at least 4 or 5 times...
There are few things at play here:

  1. (and that's the most important) Interceptors should have "id" associated with them so that
    I have tried using a pointer itself as an ID, but found it not very satisfying.
  2. To support add/remove as a pointer value, I have to pass an implementation of a (now deleted) LogMessageInterceptor interface. But interface comparisons are tricky. In particular, you could have different
    interceptors, but as long as their underlying implementation is the same (i.e. empty struct), then interface value comparison would not work.
  3. I felt (not strongly) that having an interface might be an overkill, where a function would work (note, you cannot compare functions).

So, taking these together, i'm back to my original desire: I need a unique ID for an interceptor.
One way: just maybe use time, or some such, and redo things in a loop when adding interceptor. And maybe, that's a good way to go.

The above works (and maybe it's too smart for its own good) because to add an interceptor, you pass in an InterceptMessageFn (note, not a pointer) by value to the AddInterceptor method. Thereafter, I take the address of that stack allocated parameter, and If I understand correctly, that parameter "escapes to heap" (because I will save it inside my interceptorList). As long as that object is in use, nothing should be stored at that memory address -- so I think taking uintptr address of that value is okay.

But then, maybe I'm entirely wrong...

Alright -- I think it was trying to be too fancy, as pointed out by @stevendanna and @knz
Reworked to something which I think is A LOT simpler.


pkg/util/log/intercept.go, line 99 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Might be worth a comment on why we want this to be a hard panic vs a no-op (perhaps just returning l)?

Obsolete.


pkg/util/log/intercept.go, line 129 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Having mutex means you have to lock to check if you have interceptors. We don't want to lock every time you log.

This is now obsolete.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I have given this PR a deeper look.

The main contention I have with the current approach is that it incurs the cost of an extra sink on all channels even when there is no interceptor currently defined. This is surprising and arguably undesirable.

In my opinion, we should add a sink only when a test requires interception and remove it when the interception is not needed any more. We should also do 1 interception per sink. So if there are 10 interceptors registered, that would be 10 different sinks.

Meanwhile, I would also like to point out that we do not need to intercept all channels.
The tests that need an interceptor only need to intercept 1 channel.
The logspy feature only needs to intercept the DEV channel.

So the intercept API should take a channel argument and only install an interceptor sink for that channel.

For example I am thinking of the following API:

package log

// RegisterInterceptor registers the given interceptFn on the specificed channel.
// The log entries are formatted using the format specified in desiredFormat
// before they are passed to the intercept fn.
// The interceptor is set up as a sink with no severity filter, and with redaction markers enabled.
//
// The caller is responsible for calling the returned cleanup fn when the interceptor is
// not needed any more, to de-register the sink.
func RegisterInterceptor(ch channels.Channel,  desiredFormat string, interceptFn func([]byte))  (cleanup func())

Reviewed 2 of 4 files at r1, 1 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)

@miretskiy miretskiy requested a review from knz March 24, 2021 14:01
Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The main contention I have with the current approach is that it incurs the cost of an extra sink on all channels even when there is no interceptor currently defined. This is surprising and arguably undesirable."

I'm not sure the cost is larger in any meaningful way than it is today. We check for atomic variable when calling outputLogEntry; as opposed to this implementation, which, when iterating over configured sinks checks that interceptor is not active (by loading atomic variable) -- are you concerned over the cost of 1 more sink in the loop?

I like the suggestion re adding sink per channel (which makes above moot, btw). Before I do that, I wanted to ask your opinion on the implementation details... loggerT contains the list of sinkInfos -- and presumably, if the intercept is active, that's where I would need to add interceptor sink. However, the loggerT is initialized at the startup and never modified.
I don't love the idea of adding any locks there.

I could use the loggingT rmu lock, which protects channel->*loggerT map, but, the comment on that lock seems to
imply it's a mis-design and should be removed.

Finally, I could change loggerT itself to have an atomic ptr instead of []sinkInfos, and use atomic compare/swap when I want to add/remove interceptor sink(s).

Finally, while I can certainly use func([]byte) as interceptor, and I did so in the earlier incarnations, is it much better
vs using an interface? Interface could be extended -- for example to support redaction for example.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)

@knz
Copy link
Copy Markdown
Contributor

knz commented Mar 25, 2021

the loggerT is initialized at the startup and never modified. I don't love the idea of adding any locks there.

Very good concern and very good finding!

Finally, I could change loggerT itself to have an atomic ptr instead of []sinkInfos, and use atomic compare/swap when I want to add/remove interceptor sink(s).

More specifically, you mean using a proper linked list, with a next pointer, and use atomic pointer swaps to add and remove items to the list? Yes, that would be a good idea.

Note: I have added some slack in our 21.2 release planning to improve this area. So don't feel pressured by time to complete everything in-between other projects; we can make it a proper milestone achievement, for example in April or May.

Finally, while I can certainly use func([]byte) as interceptor, and I did so in the earlier incarnations, is it much better vs using an interface? Interface could be extended -- for example to support redaction for example.

Yes that seems reasonable - but what interface would be suitable? To make a good choice, we'd need to review existing usage of the interceptor in unit tests to see what is actually needed.

Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knz
Just pushed new version which uses atomics to manage sink infos w/out locks.
A bit more complex than before.

Take a look -- let me know what you think. I'm not sure about the Config() method I added
to the interceptor and whether it's a good idea.

I'm almost inclined to use default interceptor configuration that will redact by default
(and make LogMessageInterceptor not configurable). On the other hand, maybe there are uses
for non-redacted interceptors. WDYT?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)

Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)


pkg/util/log/intercept.go, line 72 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Alright -- I think it was trying to be too fancy, as pointed out by @stevendanna and @knz
Reworked to something which I think is A LOT simpler.

Done.

Copy link
Copy Markdown
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)


pkg/util/log/intercept.go, line 72 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Done.

Done.

@miretskiy miretskiy marked this pull request as ready for review April 22, 2021 16:10
@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
@knz knz removed the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
@knz
Copy link
Copy Markdown
Contributor

knz commented May 6, 2021

ok so time is coming to resuscitate this work. I'll be doing a new pass of reviews.

@miretskiy
Copy link
Copy Markdown
Contributor Author

Thanks, @knz. Just pushed freshly rebased version (and fixed lint warning).

@miretskiy
Copy link
Copy Markdown
Contributor Author

@knz any updates on this?

@knz knz force-pushed the log_intercept branch from 2c90d2e to 3ae50e0 Compare June 5, 2021 16:20
@knz knz changed the title logs: Introduce new interceptor API. log,debug: fix logspy, new HTTP API to change vmodule Jun 5, 2021
@knz
Copy link
Copy Markdown
Contributor

knz commented Jun 5, 2021

Hi Yevgueniy, I have rebased your PR on top of #66096, made a few changes and added another commit.

Feel free to have a look, and perhaps refresh your local copy of the branch from github.

Can you and/or Steven review these changes (specifically, in the light of the new and updated commit messages). Then we can merge this.

@knz knz force-pushed the log_intercept branch 2 times, most recently from 3327c17 to 45b3d7b Compare June 5, 2021 16:54
@tbg
Copy link
Copy Markdown
Member

tbg commented Jun 7, 2021

Thank you for working on this! I haven't reviewed this in depth, but there is a UX problem here that I don't think has been discussed, and it's the split between vmodule and logspy. It's generally great that we can now logspy without fully turning on the verbose firehose. However, the UX here will make it very hard to debug in practice. It's not great, but we often do end up on a screenshare and ask a customer to open a certain logspy page on their browser. Asking them to also open parallel tabs that set the right vmodule is going to degrade the debugging experience during customer issues. I would really like to avoid that. The separate vmodule endpoint is fine, but it would be a great help if vmodule could be specified within a logspy invocation as well. I think the version I would use the most is some variation of

/foo/logspy?flatten=1&vmodule=*=1&count=1000&duration=10s&grep=.

The logspy endpoint is also currently self-documenting (if you call it without parameters, you get one line showing you the parameters). This should remain true, or the link on the debug page in the ui should include all parameters (maybe it already does, as I mentioned, haven't reviewed the code yet).

The commit message could specify how overlapping vmodule requests are handled. (Overlapping logspy invocations, in the old world, would return an error, similar to what CPU profiling does when one is already active). Neither really matter, but it's good to document them.

craig bot pushed a commit that referenced this pull request Jun 7, 2021
65342:  multi-region syntax diagrams r=ericharmeling a=ericharmeling

- Updated ALTER/CREATE DATABASE diagrams for multi-region
- Updated CREATE TABLE diagram for multi-region
- Added opt_locality diagram

Release note: None

66096: util/log: extend `logpb.Entry` with more fine grained fields r=cameronnunez a=knz

Needed for #65633 and #61378.

cc @cockroachdb/server 

66142: cli: skip under testrace as this test might be timing out r=jlinder a=adityamaru

Release note: None

Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Aditya Maru <adityamaru@gmail.com>
@knz knz force-pushed the log_intercept branch 2 times, most recently from 1bcc587 to 4f7769d Compare June 8, 2021 16:56
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be a great help if vmodule could be specified within a logspy invocation as well

Good idea. Done. (The way you suggested.)

The logspy endpoint is also currently self-documenting (if you call it without parameters, you get one line showing you the parameters)

It shows you a representation of the Go struct, not the HTTP query parameters. I don't think that's a big deal, because...

the link on the debug page in the ui should include all parameters (maybe it already does)

it does.

The commit message could specify how overlapping vmodule requests are handled.

Done:

Note: at most one in-flight HTTP API request is allowed to modify the
vmodule parameter. This maintain the invariant that the
configuration restored at the end of each request is the same as when
the request started.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)

@knz knz force-pushed the log_intercept branch from 4f7769d to 0ca8cea Compare June 8, 2021 18:07
@andreimatei
Copy link
Copy Markdown
Contributor

it would be a great help if vmodule could be specified within a logspy invocation as well

Please make sure that the logspy link in the UI, or otherwise the logspy default, is to enable all the log messages. I've enjoyed the fact that logspy used to grep through everything; the less I need to tell anyone to type, and the less confusion about the parameters, the better.

@tbg
Copy link
Copy Markdown
Member

tbg commented Jun 9, 2021

Was just testing this out, the debug page prints:

/debug/logspy?count=[count]&duration=[duration]&grep=[regexp]&vmodule=[vmodule]

I think this should also list &flatten=[bool]

Also, the actual link is too vanilla: http://td:8080/debug/logspy?count=1&duration=10s&grep=.

Could all the parameters be populated in the link? That way, it's easier to click through and adjust rather than always having to remember the description. I think a sane default would be

http://td:8080/debug/logspy?flatten=1&count=100&duration=10s&vmodule=*=0&grep=.

Overall this is great though! Very happy about this.

knz and others added 2 commits June 10, 2021 19:56
Prior to this patch, the logspy mechanism was utterly broken: any time
it was running, it would cut off any and all log entries from going to
files, stderr, network sinks etc. This was a gross violation
of the guarantees we have constructed around structured logging,
as well as a security vulnerability (see release note below).

Additionally, it was impossible to launch multiple logspy sessions
simultaneously on the same node, for example using different grep
filters.

This commit rectifies the situation.

At a technical level, we have two changes: one in the logging Go API
and one in the `/debug/logspy` HTTP API.

**For the logging changes**, this patch replaces the "interceptor" singleton
callback function that takes over `(*loggerT).outputLogEntry()`, by a
new *interceptor sink* that lives alongside the other sinks on every
channel.

Because the interceptor logic is now a regular sink, log entries
continue to be delivered to other sinks while an interceptor is
active.

Reusing the sink abstraction, with its own sink configuration with no
severity filter, clarifies that the interceptor accepts all the log
entries regardless of which filters are configured on other sinks.

Additionally, the interceptor sink now supports multiple concurrent
interception sessions. Each log entry is delivered to all current
interceptors. The `logspy` logic is adapted to use this facility,
so that multiple `logspy` requests can run side-by-side.

**For the HTTP API change**, we are changing the `/debug/logspy`
semantics. This is explained in the release note below.

Release note (security update): All the logging output to files
or network sinks was previously disabled temporarily while an operator
was using the `/debug/logspy` HTTP API, resulting in lost entries
and a breach of auditability guarantees. This behavior has been corrected.

Release note (bug fix): Log entries are not lost any more while the
`/debug/logspy` HTTP API is being used. This bug had existed since
CockroachDB v1.1.

Release note (api change): The `/debug/logspy` HTTP API has changed.
The endpoint now returns JSON data by default.
This change is motivated as follows:

- the previous format, `crdb-v1`, cannot be parsed reliably.
- using JSON entries guarantees that the text of each entry
  all fits on a single line of output (newline characters
  inside the messages are escaped). This makes filtering
  easier and more reliable.
- using JSON enables the user to apply `jq` on the output, for
  example via `curl -s .../debug/logspy | jq ...`

If the previous format is desired, the user can pass the query
argument `&flatten=1` to the `logspy` URL to obtain the previous flat
text format (`crdb-v1`) instead.

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
Prior to this patch, any ongoing `/debug/logspy` query would
trigger maximum verbosity in the logging package - i.e.
cause all logging API calls under `log.V` to be activated.

This was problematic, as it would cause a large amount
of logging traffic to be pumped through the interceptor logic,
increasing the chance for entries to be dropped (especially
when `logspy?grep=...` is not used).

Additionally, since the previous change to make the interceptor logic
a regular sink, all the entries captured by the interceptors are now
also duplicated to the other sinks (this is a feature / bug fix,
as explained in the previous commit message).

However, this change also meant that any ongoing `logspy` request
would cause maximum verbosity to all the sinks with threshold
INFO (most logging calls under `log.V` have severity INFO). For
example, the DEV channel accepts all entries at level INFO or higher
in the default config. This in turn could incur unacceptable disk
space or IOPS consumption in certain deployments.

In orde to mitigate this new problem, this patch removes the special
conditional from the logging package. From this point forward,
the verbosity of the entries delivered via `/debug/logspy` are those
configured via `vmodule` -- no more and no less.

To make this configurable for a running server, including one where
the SQL endpoint may not be available yet, this patch also introduces
a new `/debug/vmodule` HTTP API and extends `/debug/logspy` with
the ability to temporarily change `vmodule` *upon explicit request*.

Release note (api change): The `/debug/logspy` API does not any more
enable maximum logging verbosity automatically. To change the
verbosity, use the new `/debug/vmodule` endpoint or pass the
`&vmodule=` query parameter to the `/debug/logspy` endpoint.

For example, suppose you wish to run a 20s logspy session:

- Before:

  ```
  curl 'https://.../debug/logspy?duration=20s&...'
  ```

- Now:

  ```
  curl 'https://.../debug/logspy?duration=20s&vmodule=...'
  ```

  OR

  ```
  curl 'https://.../debug/vmodule?duration=22s&vmodule=...'
  curl 'https://.../debug/logspy?duration=20s'
  ```

As for the regular `vmodule` command-line flag, the maximum verbosity
across all the source code can be selected with the pattern `*=4`.

Note: at most one in-flight HTTP API request is allowed to modify the
`vmodule` parameter. This maintain the invariant that the
configuration restored at the end of each request is the same as when
the request started.

Release note (api change): The new `/debug/vmodule` API makes it
possible for an operator to configure the logging verbosity in a
similar way as the SQL built-in function
`crdb_internal.set_vmodule()`, or to query the current configuration
as in `crdb_internal.get_vmodule()`. Additionally, any configuration
change performed via this API can be automatically reverted after a
configurable delay. The API forms are:

- `/debug/vmodule` - retrieve the current configuration.
- `/debug/vmodule?set=[vmodule config]&duration=[duration]` - change
  the configuration to `[vmodule config]` . The previous configuration
  at the time the `/debug/vmodule` request started is restored after
  `[duration]`. This duration, if not specified, defaults to twice the
  default duration of a `logspy` request (currently, the `logspy`
  default duration is 5s, so the `vmodule` default duration is 10s).
  If the duration is zero or negative, the previous configuration
  is never restored.
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just testing this out, the debug page prints:
/debug/logspy?count=[count]&duration=[duration]&grep=[regexp]&vmodule=[vmodule]
I think this should also list &flatten=[bool]

Done.

Also, the actual link is too vanilla: http://td:8080/debug/logspy?count=1&duration=10s&grep=.
Could all the parameters be populated in the link? That way, it's easier to click through and adjust rather than always having to remember the description.

Done.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @stevendanna)

@knz knz force-pushed the log_intercept branch from 0ca8cea to 1c937f5 Compare June 10, 2021 18:33
@knz
Copy link
Copy Markdown
Contributor

knz commented Jun 10, 2021

TFYRs!

bors r=stevendanna,tbg,andreimatei

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 10, 2021

Build succeeded:

@craig craig bot merged commit 7a68e6e into cockroachdb:master Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

log: the intercept mechanism prevents entries from being emitted to logging

6 participants