Skip to content

server: Allow bypass of tsdump's strict checks#77247

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rimadeodhar:tsdump_improve
Mar 9, 2022
Merged

server: Allow bypass of tsdump's strict checks#77247
craig[bot] merged 1 commit intocockroachdb:masterfrom
rimadeodhar:tsdump_improve

Conversation

@rimadeodhar
Copy link
Copy Markdown
Collaborator

@rimadeodhar rimadeodhar commented Mar 1, 2022

tsdump performs strict checks that verify that it
has data for exactly the stores it expects to see
according to the node->store mapping. It can happen
that there legitimately isn't data there, for example
when nodes are down during the tsdump window. We still
want the strict checks, but now tsdump will defer
returning an error and allow restarting with a "-"
in lieu of the tsimport file name to display the data
anyway.

Additionally, cluster settings are set via SET CLUSTER SETTING
as opposed to overriding settings directly.

Release note (cli change): debug tsdump command allows
viewing timeseries data even in case of node failures
by rerunning the command with the import filename set
to "-".

Addresses #75993

Release justification: low risk, high benefit changes to existing functionality

@rimadeodhar rimadeodhar requested review from a team and tbg March 1, 2022 21:38
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but it's code I wrote, so I'll remove myself from reviewers :-)

Copy link
Copy Markdown
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality :lgtm:, just a tiny nit and request for some additional comments

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @abarganier and @rimadeodhar)


pkg/server/import_ts.go, line 75 at r1 (raw file):

		"SET CLUSTER SETTING timeseries.storage.enabled = 'false';",
		"SET CLUSTER SETTING	timeseries.storage.resolution_10s.ttl = '99999h';",
		"SET CLUSTER SETTING	timeseries.storage.resolution_30m.ttl = '99999h';",

nit: swap tab for space after SET CLUSTER SETTING to stay consistent with the first two strings.

Code quote:

		"SET CLUSTER SETTING	timeseries.storage.resolution_10s.ttl = '99999h';",
		"SET CLUSTER SETTING	timeseries.storage.resolution_30m.ttl = '99999h';",

pkg/server/import_ts.go, line 88 at r1 (raw file):

	}
	if tsImport == "-" {
		// YOLO mode to look at timeseries after previous import error.

nit: Can we provide a bit more context in the comments about how this works? Being new to tsdump it took a bit for me to realize that this indicates that the tsdump data was already imported in the previous attempt to start the node, but with errors. Before realizing this, my thought was, "So if we return nil and return early, then how does the data get imported?". Elaborating that here might be helpful 🙂.

Code quote:

	if tsImport == "-" {
		// YOLO mode to look at timeseries after previous import error.
		return nil
	}

@rimadeodhar rimadeodhar force-pushed the tsdump_improve branch 2 times, most recently from 0cd9c58 to 8d81ae0 Compare March 8, 2022 17:29
@rimadeodhar rimadeodhar requested a review from a team as a code owner March 8, 2022 17:29
tsdump performs strict checks that verify that it
has data for exactly the stores it expects to see
according to the node->store mapping. It can happen
that there legitimately isn't data there, for example
when nodes are down during the tsdump window. We still
want the strict checks, but now tsdump will defer
returning an error and allow restarting with a "-"
in lieu of the tsimport file name  to display the data
anyway.

Additionally, cluster settings are set via SET CLUSTER SETTING
as opposed to overriding settings directly.

Release note (cli change): debug tsdump command allows
viewing timeseries data even in case of node failures
by rerunning the command with the import filename set
to "-".

Release justification: low risk, high benefit changes to existing functionality
@rimadeodhar
Copy link
Copy Markdown
Collaborator Author

TFTR!

bors r=abarganier

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 9, 2022

Build succeeded:

@craig craig bot merged commit f15acd1 into cockroachdb:master Mar 9, 2022
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Mar 9, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating backport branch refs/heads/blathers/backport-release-21.1-77247: POST https://api.github.com/repos/cockroachlabs/cockroach/git/refs: 403 Resource not accessible by integration []

Backport to branch 21.1.x failed. See errors above.


error creating backport branch refs/heads/blathers/backport-release-21.2-77247: POST https://api.github.com/repos/cockroachlabs/cockroach/git/refs: 403 Resource not accessible by integration []

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@irfansharif
Copy link
Copy Markdown
Contributor

I got this funky tsdump yaml file over at https://github.com/cockroachlabs/support/issues/2036#issuecomment-1409722641, and ran into these "consider updating the mapping file .." suggestion, but it didn't really help. I ended up just getting rid of all these strict checks period. Just reporting incase someone feels obliged to improve the status quo, or finds this PR like I did.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants