Skip to content

cli: identify log source more easily when running merge-logs#71254

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
jtsiros:55395-unique-merge-logs
Oct 20, 2021
Merged

cli: identify log source more easily when running merge-logs#71254
craig[bot] merged 1 commit intocockroachdb:masterfrom
jtsiros:55395-unique-merge-logs

Conversation

@jtsiros
Copy link
Copy Markdown

@jtsiros jtsiros commented Oct 6, 2021

resolves #55395

Previously, merge-logs output was prefixed by short machine name by default which made it difficult to identify the originating node when looking at the merged results.

This change adds support for ${fpath} in the template when using the --prefix arg to include unique file path contents as part of the prefix. The file path prefix excludes common tokens across all file paths and tokens that contain specific delimiters.

Release note (cli change): adds support for "${fpath}" in --prefix arg

@jtsiros jtsiros requested review from cameronnunez and knz October 6, 2021 23:45
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@cameronnunez cameronnunez added A-logging In and around the logging infrastructure. T-server-and-security DB Server & Security C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Oct 7, 2021
@cameronnunez cameronnunez requested review from ajwerner and removed request for cameronnunez October 7, 2021 18:07
@cameronnunez cameronnunez marked this pull request as ready for review October 7, 2021 18:07
@cameronnunez cameronnunez requested review from a team as code owners October 7, 2021 18:07
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 26 of 41 files at r1, 24 of 28 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jtsiros)


pkg/cli/prefixer.go, line 31 at r3 (raw file):

}

//lint:ignore U1001,U1000 used for overriding default values

why is this linter exemption needed?

@knz
Copy link
Copy Markdown
Contributor

knz commented Oct 11, 2021

@cameronnunez explains the purpose of the linter exception: the PR is creating an API but the API is not currently used.

The way forward in that case is not to create a linter exception. The way forward with an (yet-unused) API is to create unit tests for the new API. Once the tests use the API, the linter will stop complaining.

(It would be useful to take a step back and reflect on the fact that the linter is trying to be helpful. If the linter complains, there is something to learn from it; the reaction should not be to reach out for a way to silence the linter. It's very rare that the linter creates unwise/unactionable noise.)

@ajwerner
Copy link
Copy Markdown
Contributor

This seems a heck of a lot less flexible. Not everybody has their logs from a debug zip. This seems to hard code the assumption that you're only using this thing with a debug zip. How about instead you let users specify a regexp with capture groups over the path as opposed to just the filename and then change the log here for what goes in the prefix flag by default?

Copy link
Copy Markdown
Author

@jtsiros jtsiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I reverted the removal of the --prefix flag and added support for an optional ${fpath} arg in the template. @knz mentioned that he was OK with the default behavior, so I didn't update the default arg. Let me know if you think it should be included by default.

Reviewed 41 of 41 files at r1, 25 of 28 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jtsiros, and @knz)

@jtsiros
Copy link
Copy Markdown
Author

jtsiros commented Oct 18, 2021


pkg/cli/prefixer.go, line 31 at r3 (raw file):

Previously, knz (kena) wrote…

why is this linter exemption needed?

fixed by adding tests.

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Oct 18, 2021

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Oct 18, 2021
Copy link
Copy Markdown
Contributor

@rauchenstein rauchenstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 54 files at r5, 1 of 3 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jtsiros, and @knz)


pkg/cli/debug.go, line 1366 at r7 (raw file):

}

type prefixer interface {

Given that only one type implements it, the interface doesn't seem necessary.


pkg/cli/debug_merge_logs.go, line 329 at r7 (raw file):

type fileInfo struct {
	path    string
	prefix  []byte

I'd think this should be a string. (Just cast the return value from the regexp library.)


pkg/cli/prefixer.go, line 23 at r7 (raw file):

var defaultFilePrefixerOptions = filePrefixerOptions{
	Delimiters: []string{"/", ".", "-"},
	Splitter:   regexp.MustCompile(`[/\\]+`),

Do we need the flexibility of regexp Splitters? Is filepath.SplitList not suitable?


pkg/cli/prefixer.go, line 58 at r7 (raw file):

// delimiters and template.
//
func newFilePrefixer(opts ...filePrefixerOption) filePrefixer {

Outside of tests, this is only ever called with withTemplate. Unless more flexibility is actually needed at some later point, just newFilePrefixer(template string) seems sufficient.


pkg/cli/prefixer.go, line 70 at r7 (raw file):

}

// Prefix generates a prefix for each file path directory given a

Populating fields on the input slice is a non-obvious way to return results. Document it in the comment, and perhaps name the function something like PopulatePrefixes.


pkg/cli/prefixer.go, line 91 at r7 (raw file):

// See [debug_merge_logs_test.go, prefixer_test.go] for additional examples.
//
func (f filePrefixer) Prefix(logFiles []fileInfo) {

This algorithm doesn't handle tokens appearing multiple times in the same path. Given the two files

a/a/foo.log
b/c/b/bar.log

it would identify "a" and "b" as common tokens.

Related would be a case like

a/b/foo.log
b/a/bar.log

pkg/cli/prefixer.go, line 138 at r7 (raw file):

}

func hasDelimiters(token string, del []string) bool {

Maybe this should be a method of filePrefixer?


pkg/cli/prefixer_test.go, line 24 at r7 (raw file):

var (
	fileRE = regexp.MustCompile(logFilePattern)
	fInfo  = func(path string, fileRE *regexp.Regexp) fileInfo {

Why a package-scope variable lambda instead of a normal func or a lambda inside the test function?

@jtsiros
Copy link
Copy Markdown
Author

jtsiros commented Oct 19, 2021


pkg/cli/prefixer.go, line 91 at r7 (raw file):

Previously, rauchenstein wrote…

This algorithm doesn't handle tokens appearing multiple times in the same path. Given the two files

a/a/foo.log
b/c/b/bar.log

it would identify "a" and "b" as common tokens.

Related would be a case like

a/b/foo.log
b/a/bar.log

It handles tokens appearing multiple times in the same path by checking "seen" when iterating over the tokens (see line 25). Tokens aren't counted more than once.

Copy link
Copy Markdown
Author

@jtsiros jtsiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 54 of 54 files at r5, 3 of 3 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jtsiros, @knz, and @rauchenstein)


pkg/cli/debug.go, line 1366 at r7 (raw file):

Previously, rauchenstein wrote…

Given that only one type implements it, the interface doesn't seem necessary.

I'll change it - thanks!


pkg/cli/debug_merge_logs.go, line 329 at r7 (raw file):

Previously, rauchenstein wrote…

I'd think this should be a string. (Just cast the return value from the regexp library.)

I originally had this as a string, but was casting the value twice (return value from regexp, then back to []byte for log.FormatLegacyEntryPrefix). Let me know if you still prefer it to be a string.


pkg/cli/prefixer.go, line 23 at r7 (raw file):

Previously, rauchenstein wrote…

Do we need the flexibility of regexp Splitters? Is filepath.SplitList not suitable?

It probably is now: this was intended to handle multiple split tokens to pick up the file name before we discussed preserving existing functionality.


pkg/cli/prefixer.go, line 70 at r7 (raw file):

Previously, rauchenstein wrote…

Populating fields on the input slice is a non-obvious way to return results. Document it in the comment, and perhaps name the function something like PopulatePrefixes.

Will do.


pkg/cli/prefixer.go, line 138 at r7 (raw file):

Previously, rauchenstein wrote…

Maybe this should be a method of filePrefixer?

Will do.


pkg/cli/prefixer_test.go, line 24 at r7 (raw file):

Previously, rauchenstein wrote…

Why a package-scope variable lambda instead of a normal func or a lambda inside the test function?

I originally had multiple tests and consolidated them into a test table. I'll move it.

Previously, merge-logs output was prefixed by short machine name
by default which made it difficult to identify the originating
node when looking at the merged results.

This change adds support for ${fpath} in the template when using the
--prefix arg to include unique file path contents as part of the prefix.
The file path prefix excludes common tokens across all file paths
and tokens that contain specific delimiters.

Release note (cli change): adds support for "${fpath}" in --prefix
arg
Copy link
Copy Markdown
Author

@jtsiros jtsiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @knz, and @rauchenstein)


pkg/cli/prefixer.go, line 58 at r7 (raw file):

Previously, rauchenstein wrote…

Outside of tests, this is only ever called with withTemplate. Unless more flexibility is actually needed at some later point, just newFilePrefixer(template string) seems sufficient.

Generally, I agree. I removed the delimiter configuration for now. I do prefer this pattern as it makes it easily extendable (by adding additional withX methods as needed) and avoids conditional logic in newFilePrefixer for default values.

Copy link
Copy Markdown
Contributor

@rauchenstein rauchenstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jtsiros, @knz, and @rauchenstein)


pkg/cli/debug_merge_logs.go, line 329 at r7 (raw file):

Previously, jtsiros (Jon Tsiros) wrote…

I originally had this as a string, but was casting the value twice (return value from regexp, then back to []byte for log.FormatLegacyEntryPrefix). Let me know if you still prefer it to be a string.

Nope, that sounds like good justification.


pkg/cli/prefixer.go, line 91 at r7 (raw file):

Previously, jtsiros (Jon Tsiros) wrote…

It handles tokens appearing multiple times in the same path by checking "seen" when iterating over the tokens (see line 25). Tokens aren't counted more than once.

Well, I'm blind -- I was thinking that exactly what you already did would be a reasonable fix, yet I still missed it.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jtsiros, @knz, and @rauchenstein)

@jtsiros
Copy link
Copy Markdown
Author

jtsiros commented Oct 20, 2021

bors r=ajwerner,cameronnunez,rauchenstein

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 20, 2021

Build succeeded:

@jtsiros jtsiros deleted the 55395-unique-merge-logs branch October 20, 2021 17:26
@jtsiros jtsiros restored the 55395-unique-merge-logs branch October 20, 2021 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-logging In and around the logging infrastructure. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-community Originated from the community T-server-and-security DB Server & Security

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cli: debug merge-logs output should include node id to ensure uniqueness

6 participants