util/log: new output format 'crdb-v2'#59087
Conversation
|
Remaining work:
|
0d4757c to
19d82e0
Compare
|
this is now ready for review. |
133d6a2 to
c164d4a
Compare
itsbilal
left a comment
There was a problem hiding this comment.
Great work! Just one minor question, but this is looking ready to go.
Reviewed 3 of 3 files at r1, 22 of 22 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @knz and @thtruo)
docs/generated/logformats.md, line 246 at r2 (raw file):
associated, for example header entries in file sinks, the counter position in the common prefix appears as a sequence of two ASCII space characters, which can be used to reliably identify this situation.
Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.
pkg/util/log/logconfig/config.go, line 27 at r2 (raw file):
// DefaultFileFormat is the entry format for file sinks when not // specified in a configuration. const DefaultFileFormat = `crdb-v2`
🎉
This was not done previously as I had some doubt that maybe some log directive was using a format starting with `"%d ..."` i.e. a log message would start with a number and a space. However, `git grep` confirms this is not the case. So it is safe to extract the entry counter this way. Release note: None
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @thtruo)
docs/generated/logformats.md, line 246 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.
Clarified the phrasing: yes, it is reliable. There's always a space before and after, so if there are two spaces in a row that means there was no counters.
This new format intends to address all the known shortcomings with
`crdb-v1` while remaining compatible with entry parsers designed for
the previous version.
See the user-facing release note below for a summary of changes; and
the included reference documentation for details.
Example for a single-line unstructured entry.
Before:
```
I210116 22:17:03.736236 57 cli/start.go:681 ⋮ node startup completed:
```
After:
```
I210116 22:17:03.736236 57 cli/start.go:681 ⋮ [-] 12 node startup completed:
tag field now always included ^^^
entry counter now always included ^^^
```
Example for a multi-line unstructured entry.
Before:
```
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 gossip status (ok, 1 node‹›)
gossip client (0/3 cur/max conns)
gossip connectivity
n1 [sentinel];
```
(subsequent lines lack a log entry prefix; hard to determine where
entries start and end)
After:
```
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 gossip status (ok, 1 node‹›)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip client (0/3 cur/max conns)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip connectivity
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 + n1 [sentinel];
^^^ common prefix repeated for each msg line
same entry counter for every subsequent line ^^^
continuation marker "+" on susequent lines ^^
```
Example for a structured entry.
Before:
```
I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] Structured entry: {...}
```
After:
```
I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] 21 ={...}
entry counter always present ^^
equal sign "=" denotes structured entry ^^
```
Release note (cli change): The default output format for `file-group`
and `stderr` sinks has been changed to `crdb-v2`.
This new format is non-ambiguous and makes it possible to reliably
parse log files. Refer to the format's documentation for
details. Additionally, it prevents single log lines from exceeding a
large size; this problem is inherent to the `crdb-v1` format and can
prevent `cockroach debug zip` from retrieving v1 log files.
The new format has also been designed so that existinglog file
analyzers for the `crdb-v1` format can read entries written the new
format. However, this conversion may be imperfect. Again, refer to
the new format's documentation for details.
In case of incompatibility, users can force the previous format by
using `format: crdb-v1` in their logging configuration.
|
I also added a formatting test so we have side-by-side examples of entries with and without counters, to clarify. TFYR! bors r=itsbilal |
|
Build succeeded: |
Fixes #50166.
This new format intends to address all the known shortcomings with
crdb-v1while remaining compatible with entry parsers designed for the previous version.See the user-facing release note below for a summary of changes; and the included reference documentation for details.
Example TTY output with colors:

Example for a single-line unstructured entry.
Before:
After:
Example for a multi-line unstructured entry.
Before:
(subsequent lines lack a log entry prefix; hard to determine where
entries start and end)
After:
Example for a structured entry.
Before:
After:
Release note (cli change): The default output format for
file-groupand
stderrsinks has been changed tocrdb-v2.This new format is non-ambiguous and makes it possible to reliably
parse log files. Refer to the format's documentation for
details. Additionally, it prevents single log lines from exceeding a
large size; this problem is inherent to the
crdb-v1format and canprevent
cockroach debug zipfrom retrieving v1 log files.The new format has also been designed so that existinglog file
analyzers for the
crdb-v1format can read entries written the newformat. However, this conversion may be imperfect. Again, refer to
the new format's documentation for details.
In case of incompatibility, users can force the previous format by
using
format: crdb-v1in their logging configuration.