[DNM]: gc garbage generator and KVDebug service#18661

Closed
tbg wants to merge 1 commit into cockroachdb:master from tbg:kv-debug

Conversation

@tbg
Member

@tbg tbg commented Sep 21, 2017

This PR shows the tooling I used to stress test the GC queue (cockroachdb#9540). In short, I needed
a way to put large numbers of intents on a single range; I didn't particularly care to do this on a
multi-node cluster, but I needed to do it efficiently for quick turnaround (and also to prevent the
GC queue from cleaning up my garbage faster than I could insert it).

This was also a good opportunity to investigate "better" debugging tools and to revisit the
`ExternalServer` interface, which historically has been the KV store we once wanted to expose to
clients. It has since become internal and is technically slated for removal, but at the same time it
has seen continued use. The reasons for keeping it (or something like it) are:

  1. to debug running clusters that are potentially wedged due to invalid KV data, by reading
    transaction entries and raw KV data that the SQL layer does not expect;
  2. to create, in our testing, problematic conditions that are unattainable through the public
    interfaces (creating artificial GC pressure being one example).

I also think that there's a point to be made for adding functionality such as the ability to force a
Range to run garbage collection, though that's out of scope here.

In this PR, I've sketched out a `TxnCoordSender`-level entry point that is tied to a bidirectional
streaming connection. This has the advantage that there is a context whose lifetime is tied to the
connection, which means that `TxnCoordSender` can base its transaction heartbeats on it (this is
not to suggest that we should be running serious transactions through this interface, but it
establishes parity and, assuming that `client.NewSender` went through this endpoint instead,
`TxnCoordSender` could be simplified to always use the incoming context). There is more subtlety
in this topic since we want to merge `TxnCoordSender` and `client.{DB,Txn}` (cockroachdb#16000),
so don't take this as a concrete suggestion.

What's been more immediately useful is a pretty low-level endpoint that allows evaluating a
`BatchRequest` on any given `Replica` (bypassing the command queue, etc.) and seeing the results
(more controversial, and important for `gcpressurizer`, is the ability to *execute* these batches,
something that's quite dangerous in the wrong hands due to the potential for creating inconsistency,
and also because of its insufficient synchronization with splits, etc.). I think that's the part worth
exploring, since it's a universally useful last resort when things go wrong and visibility into
on-disk state is desired without shutting down the node.

Long story short, I have this code and it's definitely not something to check in, but to discuss.
It'd be nice to programmatically test the GC queue in this way, and perhaps randomly "pollute"
some of our test clusters in ever-escalating ways, to improve their resilience.

@tbg tbg requested a review from a team September 21, 2017 14:55
@cockroach-teamcity
Member

This change is Reviewable

[stress test the GC queue]: cockroachdb#9540
[merge]: cockroachdb#16000
tbg added a commit to tbg/cockroach that referenced this pull request Apr 16, 2018
A simple data generator that makes a single large range. The created dataset can
then be used for various tests, for example to exercise issues such as the ones
that ultimately led to cockroachdb#20589, or to make sure [large snapshots] work
(once implemented).

This is a work-in-progress because I haven't reached clarity on what the best
way to hook things up in these tests would be. Do we want to create the datasets
and upload them somewhere? That has been fragile in the past, as the upload
process is seldom exercised and thus rots. The alternative (which I'm leaning
towards) is to bundle this binary with the test code (either explicitly or via
use as a library) and create fresh test data every time (these tests would run
as nightlies, so dataset generation speed isn't the top concern).

In making these decisions, we should also take into account more involved datasets
that can't be generated as easily from a running cluster, such as [gcpressurizer].

For those, my current take is that we'll just generate an initialized data dir,
open the resulting RocksDB instance manually again, and write straight into it
(via some facility that updates stats correctly, i.e. presumably `MVCCPut` and
friends).

Release note: None

[large snapshots]: cockroachdb#16954
[gcpressurizer]: cockroachdb#18661
@tbg tbg closed this Aug 15, 2018
@tbg tbg deleted the kv-debug branch August 20, 2018 13:44
