[wip] storage: add closed timestamp subsystem #27595

Closed
tbg wants to merge 17 commits into cockroachdb:master from tbg:ct-updates

Conversation

@tbg
Member

@tbg tbg commented Jul 15, 2018

This commit introduces the basic closed timestamp subsystem. The system
is far from complete (as a starter, it's 100% untested, so it definitely
has lots of bugs and won't work properly at all), but the point of this
PR is to get buy-in on the overall architectural approach, so that
shortcomings can be addressed before they are cemented in more code.

Great care has been taken to make each individual component self-
contained and cleanly interfaced to the outside world, which should aid
greatly in unit testing them in isolation. The package layout relative
to ct is the following:

.: slim root package containing only the most basic CT interfaces
./ctpb: protobuf definitions and service
./ctconfig: fat package pulling in everything needed to actually run
./ctstorage: storage for per-node closed timestamp state
./minprop: minimum proposal timestamp tracker
./provider: the Provider.

Most of these packages depend on ct (more precisely, on the interfaces
they need to use), but in turn don't depend on each other. I'm open to
changing the name of individual packages (for example, prefixing them
all with ct).

As an entry point for review, the following diagram together with the
ctconfig and ct packages should be instructive in getting a bird's
eye view of how the subsystem fits into CockroachDB: All of the
components are grouped in a Container (intended as a pass-around
per-instance Singleton), which is passed down to the Stores via the
StoreConfig. Replicas proposing commands talk to the Tracker; replicas
trying to serve follower reads talk to the Provider, which receives
closed timestamp updates for the local node and its peers.

                            Node 1 | Node 2
                                   |
+---------+  Close  +-----------+  |  +-----------+
| Tracker |<--------|           |  |  |           |
+-----+---+         | +-------+ |  |  | +-------+ |  CanServe
      ^             | |Storage| |  |  | |Storage| |<---------+
      |             | --------+ |  |  | +-------+ |          |
      |Track        |           |  |  |           |     +----+----+
      |             | Provider  |  |  | Provider  |     | Follower|
      |             +-----------+  |  +-----------+     | Replica |
      |                 ^                  ^            +----+----+
      |                 |Subscribe         |Notify           |
      |                 |                  |                 |
+---------+             |      Request     |                 |
|Proposing| Refresh +---+----+ <----- +----+-----+  Request  |
| Replica |<--------| Server |        |  Clients |<----------+
+---------+         +--------+ -----> +----------+  EnsureClient
                                 CT

Release note: None
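The Container/StoreConfig wiring described above can be sketched roughly as follows. All type and field names here are illustrative stand-ins for the description, not the actual CockroachDB definitions:

```go
package main

import "fmt"

// Hypothetical stand-ins for the components named in the diagram.
type Tracker struct{}
type Storage struct{}
type Provider struct{ storage *Storage }

// Container bundles the per-node closed timestamp components into a
// single pass-around singleton, per the description above.
type Container struct {
	Tracker  *Tracker
	Storage  *Storage
	Provider *Provider
}

// StoreConfig is the vehicle that hands the Container down to each Store.
type StoreConfig struct {
	ClosedTimestamp *Container
}

func main() {
	s := &Storage{}
	c := &Container{Tracker: &Tracker{}, Storage: s, Provider: &Provider{storage: s}}
	cfg := StoreConfig{ClosedTimestamp: c}
	fmt.Println(cfg.ClosedTimestamp.Provider != nil) // prints true
}
```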

@tbg tbg requested a review from a team July 15, 2018 12:04
@cockroach-teamcity
Member

This change is Reviewable

@tbg tbg requested a review from nvb July 15, 2018 12:04
@tbg tbg force-pushed the ct-updates branch 2 times, most recently from 2b24931 to 5bd7b9b Compare July 16, 2018 04:51
Contributor

@nvb nvb left a comment


I'm happy to see this pulled out of the storage package. It should all be able to stand on its own, and this is a big step towards that.

Reviewed 16 of 20 files at r1, 3 of 4 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/ct/ct.go, line 72 at r2 (raw file):

// A Notifyee is a sink for closed timestamp updates.
type Notifyee interface {
	Notify(roachpb.NodeID) chan<- ctpb.Entry

Add a comment here. Also, describe the special handling of nodeID=0, which you exploit later on.


pkg/storage/ct/ct.go, line 85 at r2 (raw file):

// EpochT is a wrapper around int64 which helps avoid confusing it with other
// int64s used for different purposes.
type EpochT struct {

type EpochT int64? Or better type Epoch int64 with no getters or setters. Same with LAIT? Maybe I'm missing why these structs are important.


pkg/storage/ct/ct.go, line 134 at r2 (raw file):

type PeerRegistry interface {
	// Request asks the peer node to produce an update for the supplied RangeID.
	Request(roachpb.NodeID, roachpb.RangeID)

Are all of these methods asynchronous? It's a little strange to see none of them return errors.


pkg/storage/ct/ct.go, line 146 at r2 (raw file):

// extra constraints that the local node is live for the returned timestamp at
// the given epoch.
type LiveClockFn func() (liveNow hlc.Timestamp, liveEpoch int64, _ error)

Can we use EpochT here? Or is this matching an existing function?


pkg/storage/ct/ctpb/entry.proto, line 29 at r2 (raw file):

  int64 epoch = 1;
  util.hlc.Timestamp closed_timestamp = 2 [(gogoproto.nullable) = false];
  bool full = 3;

Add comments to these fields. Specifically, I'm not 100% sure what the semantics of full are.


pkg/storage/ct/ctstorage/storage.go, line 56 at r2 (raw file):

// an ill-timed crash, and so historic information is invaluable.
//
// TODO(tschottdorf): this shouldn't be an interface but a concrete impl, the

+1, this sounds like a good idea.


pkg/storage/ct/provider/provider.go, line 15 at r2 (raw file):

// permissions and limitations under the License.

package provider

It is a little strange that some of the ct subpackages are prefixed with ct while others aren't. I'd make that uniform.

I'm also wondering if ct is a little too concise. Based on other packages in the storage package (e.g. idalloc, rditer, tscache) it seems like closedts might be a better fit because it gives a bit more insight when using it.


pkg/storage/ct/provider/provider.go, line 137 at r2 (raw file):

// Notify implements ct.Notifyee.
func (p *Provider) Notify(nodeID roachpb.NodeID) chan<- ctpb.Entry {

Do we expect this to be called once per store or once per replica. I think the former, which is fine. If it's the latter though, I worry about the number of goroutines we'll have to create here.


pkg/storage/ct/provider/provider.go, line 156 at r2 (raw file):

	p.mu.Lock()
	for i = range p.mu.subscribers {
		if p.mu.subscribers == nil {

What's going on here? We're iterating over a slice and then looking at whether that slice is nil? Is this even reachable? Did you mean p.mu.subscribers[i] == nil? Is this to allow for re-use within the slice?


pkg/storage/ct/provider/provider.go, line 220 at r2 (raw file):

	var ok bool
	// TODO(tschottdorf): add and use VisitDescending.
	p.cfg.Storage.VisitAscending(nodeID, func(entry ctpb.Entry) bool {

It seems like this lookup could be more efficient. It's surprising to me that we'd have to iterate over all historical states. This is going to be on the hot path of reads, so I think we want the index structure here to be optimized around this lookup. I don't have any great suggestions at the moment though.

Member Author

@tbg tbg left a comment


First round of comments addressed (with the exception of package renames, for which I'm not sure we've come up with something good yet). Is it time to fill in the missing code or are there any bigger reorgs that you're considering suggesting?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/ct/ct.go, line 72 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Add a comment here. Also, describe the special handling of nodeID=0, which you exploit later on.

Done.


pkg/storage/ct/ct.go, line 85 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

type EpochT int64? Or better type Epoch int64 with no getters or setters. Same with LAIT? Maybe I'm missing why these structs are important.

You're right, I forgot how this worked. I thought that without the struct you could still use an Epoch as an MLAI in too many places, but it's not true (ok, https://play.golang.com/p/ONjx2KoSyGn now runs, but I think that's fine as we wouldn't ever pass consts in production code). Done.
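For illustration, a minimal sketch of why plain defined types suffice here (hypothetical `Epoch`/`LAI` stand-ins, not the actual declarations): Go refuses to mix distinct defined types implicitly, while untyped constants still convert, which matches the playground observation above.

```go
package main

import "fmt"

// With `type Epoch int64` (no wrapper struct), an Epoch cannot be
// passed where an LAI is expected, because Go never implicitly
// converts between distinct defined types.
type Epoch int64
type LAI int64

func bumpEpoch(e Epoch) Epoch { return e + 1 }

func main() {
	var e Epoch = 5
	// var l LAI = e // compile error: cannot use e (type Epoch) as type LAI
	fmt.Println(bumpEpoch(e)) // untyped constants like bumpEpoch(7) still compile; prints 6
}
```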


pkg/storage/ct/ct.go, line 134 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Are all of these methods asynchronous? It's a little strange to see none of them return errors.

I tried to clarify the comments but yes, basically everything is asynchronous here. Calling Request is just saying "hey, next time you talk to that node, please send it these rangeIDs so that it'll emit updates for them".
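That fire-and-forget contract can be sketched like this (a toy illustration with assumed names, not the actual transport code): Request merely buffers the RangeID for the next conversation with that node, so there is nothing synchronous to fail and no error to return.

```go
package main

import (
	"fmt"
	"sync"
)

type NodeID int32
type RangeID int64

// registry buffers requested range IDs per node.
type registry struct {
	mu      sync.Mutex
	pending map[NodeID][]RangeID
}

func newRegistry() *registry {
	return &registry{pending: map[NodeID][]RangeID{}}
}

// Request records that rid should be relayed to node n the next time
// we talk to it; it never blocks and never fails.
func (r *registry) Request(n NodeID, rid RangeID) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.pending[n] = append(r.pending[n], rid)
}

// flush drains the buffered IDs when a connection to n is next used.
func (r *registry) flush(n NodeID) []RangeID {
	r.mu.Lock()
	defer r.mu.Unlock()
	ids := r.pending[n]
	delete(r.pending, n)
	return ids
}

func main() {
	r := newRegistry()
	r.Request(2, 17)
	r.Request(2, 42)
	fmt.Println(r.flush(2)) // prints [17 42]
}
```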


pkg/storage/ct/ct.go, line 146 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can we use EpochT here? Or is this matching an existing function?

This line predated the introduction of EpochT, so I forgot to add it. Changed.


pkg/storage/ct/ctpb/entry.proto, line 29 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Add comments to these fields. Specifically, I'm not 100% sure what the semantics of full are.

Done.


pkg/storage/ct/ctstorage/storage.go, line 56 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

+1, this sounds like a good idea.

Now that I'm looking at it again, I'd like to wait until I've written tests. For introducing persistent storage we wouldn't write another impl of this interface (that's what the comment said), but I may still want to unit test the MultiStorage and use the interface in that context.


pkg/storage/ct/provider/provider.go, line 15 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It is a little strange that some of the ct subpackages are prefixed with ct while others aren't. I'd make that uniform.

I'm also wondering if ct is a little too concise. Based on other packages in the storage package (e.g. idalloc, rditer, tscache) it seems like closedts might be a better fit because it gives a bit more insight when using it.

./storage/closedts
./storage/closedts/closedtstransport

Like that? I like closedts for the root package, but then using it as a prefix for the others seems too much. OTOH, it seems awkward to keep using ct there. WDYT?


pkg/storage/ct/provider/provider.go, line 137 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we expect this to be called once per store or once per replica. I think the former, which is fine. If it's the latter though, I worry about the number of goroutines we'll have to create here.

It's once per node, so the number of goroutines is at most the number of nodes in the cluster.


pkg/storage/ct/provider/provider.go, line 156 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What's going on here? We're iterating over a slice and then looking at whether that slice is nil? Is this even reachable? Did you mean p.mu.subscribers[i] == nil? Is this to allow for re-use within the slice?

Yeah, this is definitely the kind of bug that people write unit tests for. Fixed.


pkg/storage/ct/provider/provider.go, line 220 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It seems like this lookup could be more efficient. It's surprising to me that we'd have to iterate over all historical states. This is going to be on the hot path of reads, so I think we want the index structure here to be optimized around this lookup. I don't have any great suggestions at the moment though.

I think that VisitDescending (see the TODO above which I just took care of) addresses this sufficiently as the first bucket you visit will typically be the one that allows serving the request already. Besides, I don't think we'll have more than two buckets now and potentially a handful more (below 10) later, at which point any type of fancy data structure is not likely to do better.
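The descending-visit idea can be sketched as follows (illustrative names and a plain int64 in place of hlc.Timestamp, not the actual API): buckets are kept in ascending closed-timestamp order, and the lookup walks newest-to-oldest, returning at the first bucket that covers the read.

```go
package main

import "fmt"

// entry is a stand-in for a closed timestamp bucket.
type entry struct {
	closedTS int64 // stand-in for hlc.Timestamp
	mlai     int64
}

// visitDescending walks newest-to-oldest and stops when f returns false.
func visitDescending(entries []entry, f func(entry) bool) {
	for i := len(entries) - 1; i >= 0; i-- {
		if !f(entries[i]) {
			return
		}
	}
}

// canServe reports whether a follower read at readTS can be served;
// typically the very first (newest) bucket already suffices.
func canServe(entries []entry, readTS int64) (entry, bool) {
	var found entry
	var ok bool
	visitDescending(entries, func(e entry) bool {
		if e.closedTS >= readTS {
			found, ok = e, true
			return false // early exit on the first covering bucket
		}
		return true
	})
	return found, ok
}

func main() {
	buckets := []entry{{closedTS: 10, mlai: 3}, {closedTS: 20, mlai: 7}}
	e, ok := canServe(buckets, 15)
	fmt.Println(e.closedTS, ok) // prints 20 true
}
```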

Contributor

@nvb nvb left a comment


Is it time to fill in the missing code or are there any bigger reorgs that you're considering suggesting?

You can begin filling in the missing code. I don't have any large reorg suggestions. Thanks for structuring this well!

Reviewed 2 of 11 files at r4, 2 of 8 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/ct/ctpb/entry.proto, line 29 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Done.

The description still doesn't give a lot of insight into what "full" means. Does it mean that the mlai map includes all ranges on the store that's sending the entry?


pkg/storage/ct/ctstorage/storage.go, line 56 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Now that I'm looking at it again, I'd like to wait once I've written tests. For introducing persistent storage we'd not write another impl of this interface (that's what the comment said) but I may still want to unit test the MultiStorage and use the interface in that context.

Ack.


pkg/storage/ct/provider/provider.go, line 15 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…
./storage/closedts
./storage/closedts/closedtstransport

Like that? I like closedts for the root package, but then using it as a prefix for the others seems too much. OTOH, it seems awkward to keep using ct there. WDYT?

./storage/closedts
./storage/closedts/transport

Is what I would suggest. You can always import ./storage/closedts/transport as cttransport.

@tbg tbg requested a review from a team as a code owner July 20, 2018 10:52
@tbg tbg requested review from a team July 20, 2018 10:52
Member Author

@tbg tbg left a comment


No functional changes in latest push, just a bunch of renaming.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/ct/ctpb/entry.proto, line 29 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The description still doesn't give a lot of insight into what "full" means. Does it mean that the mlai map includes all ranges on the store that's sending the entry?

Yep, basically. But most importantly, it allows the receiver to assume that "no new MLAI is equal to previously seen MLAI". I expanded the comment.
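A sketch of those semantics as I read them from the thread (assumed names, not the real implementation): a "full" entry carries the sender's complete MLAI map, so the receiver may discard its previous state, whereas a partial entry only overlays the ranges it mentions.

```go
package main

import "fmt"

type RangeID int64

// entry mirrors the proto discussed above: full signals a complete snapshot.
type entry struct {
	full bool
	mlai map[RangeID]int64
}

// apply merges an incoming entry into the receiver's state.
func apply(state map[RangeID]int64, e entry) map[RangeID]int64 {
	if e.full {
		state = map[RangeID]int64{} // complete snapshot: drop stale ranges
	}
	for r, m := range e.mlai {
		state[r] = m
	}
	return state
}

func main() {
	state := map[RangeID]int64{1: 5, 2: 9}
	state = apply(state, entry{full: true, mlai: map[RangeID]int64{1: 6}})
	fmt.Println(len(state), state[1]) // range 2 is dropped: prints 1 6
}
```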


pkg/storage/ct/provider/provider.go, line 15 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
./storage/closedts
./storage/closedts/transport

Is what I would suggest. You can always import ./storage/closedts/transport as cttransport.

Done. I renamed config -> container in the process. ctpb remains (since we always use that naming scheme for the protobuf packages).

Contributor

@nvb nvb left a comment


Reviewed 8 of 19 files at r10, 5 of 22 files at r11, 4 of 4 files at r13.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/base/config.go, line 284 at r13 (raw file):

// GetClientTLSConfig returns the client TLS config, initializing it if needed.
// If Insecure is true, return a nil config, otherwise ask the certificate
// manager for a TLS config using certs for the container.User.

There were a number of s/config/container replacements in this commit that I don't think you meant to make.


pkg/storage/closedts/closedts.go, line 19 at r13 (raw file):

//
// The following diagram illustrates how these components fit together. In
// running operation, the components are grouped in a ctconfig.Container

s/ctconfig/container


pkg/storage/closedts/container/container.go, line 63 at r13 (raw file):

func NewContainer(cfg Config) *Container {
	storage := storage.NewMultiStorage(func() storage.SingleStorage {
		return storage.NewMemStorage(5*closedts.TargetDuration.Get(&cfg.Settings.SV), 2)

Pull these numbers out into constants.


pkg/storage/closedts/container/container.go, line 123 at r13 (raw file):

func DialerAdapter(dialer *nodedialer.Dialer) closedts.Dialer {
	return (*dialerAdapter)(dialer)

Extra line.


pkg/storage/closedts/container/container_test.go, line 48 at r13 (raw file):

}

func TestContainer(t *testing.T) {

Are you planning on adding to this test?


pkg/storage/closedts/provider/provider.go, line 88 at r13 (raw file):

	defer close(ch)

	t := timeutil.NewTimer()

You can just put this on the stack. Also, don't you need to defer t.Stop()?


pkg/storage/closedts/provider/provider.go, line 129 at r13 (raw file):

func (p *Provider) processNotification(nodeID roachpb.NodeID, entry ctpb.Entry) {
	p.cfg.Storage.Add(nodeID, entry)
	if nodeID == 0 {

I'd pull 0 into a constant. const localNode = 0 or something.


pkg/storage/closedts/provider/provider.go, line 141 at r13 (raw file):

func (p *Provider) Notify(nodeID roachpb.NodeID) chan<- ctpb.Entry {
	ch := make(chan ctpb.Entry)
	ctx := context.TODO()

Should this be Background or should we modify the interface to pass a context?


pkg/storage/closedts/provider/provider.go, line 154 at r13 (raw file):

// Subscribe implements closedts.Producer.
func (p *Provider) Subscribe(ctx context.Context, ch chan<- ctpb.Entry) {
	var i int

nit: consider pulling everything that uses i into a tighter scope.


pkg/storage/closedts/provider/provider.go, line 189 at r13 (raw file):

	for {
		p.cond.L.Lock()
		p.cond.Wait()

What happens if the provider's ct-closer task exits before we get here? Do we deadlock? Do we need to drain all subscriptions before allowing ct-closer to finish?


pkg/storage/closedts/storage/storage.go, line 80 at r13 (raw file):

// MultiStorage implements the closedts.Storage interface.
type MultiStorage struct {
	f func() SingleStorage

What is f? A constructor? I'd at least leave comment mentioning that, or give this a better name.


pkg/storage/closedts/storage/storage_test.go, line 26 at r13 (raw file):

)

func ExampleSingleStorage() {

These examples are really nice!


pkg/storage/ct/ctpb/entry.proto, line 29 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Yep, basically. But most importantly, it allows the receiver to assume that "no new MLAI is equal to previously seen MLAI". I expanded the comment.

Thanks.

Member Author

@tbg tbg left a comment


I added rudimentary testing for the transport package. Hopefully doing the same for provider and container tomorrow and then I'll hook things up.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/base/config.go, line 284 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

There were a number of s/config/container replacements in this commit that I don't think you meant to make.

Done.


pkg/storage/closedts/closedts.go, line 19 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/ctconfig/container

Done.


pkg/storage/closedts/container/container.go, line 63 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Pull these numbers out into constants.

Done. I think they'll require some tweaking, but for now it'll do.


pkg/storage/closedts/container/container.go, line 123 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Extra line.

Done.


pkg/storage/closedts/container/container_test.go, line 48 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Are you planning on adding to this test?

Yes, this is just a stub.


pkg/storage/closedts/provider/provider.go, line 88 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

You can just put this on the stack. Also, don't you need to defer t.Stop()?

Done.


pkg/storage/closedts/provider/provider.go, line 129 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'd pull 0 into a constant. const localNode = 0 or something.

I actually backed out this special casing. Was more trouble than it's worth.


pkg/storage/closedts/provider/provider.go, line 141 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should this be Background or should we modify the interface to pass a context?

I'll use context.Background for now.


pkg/storage/closedts/provider/provider.go, line 154 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: consider pulling everything that uses i into a tighter scope.

It's used all the way through, and I don't want to take a reference into something that is protected by the mutex.


pkg/storage/closedts/provider/provider.go, line 189 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What happens if the provider's ct-closer task exits before we get here? Do we deadlock? Do we need to drain all subscriptions before allowing ct-closer to finish?

Good point, done (though I have little confidence in the correctness until it's tested).
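The shape of the fix for this kind of deadlock can be sketched as follows (a generic pattern with assumed names, not the PR's actual code): the closer sets a `closed` flag under the lock and broadcasts before returning, so a subscriber blocked in `Wait` always wakes up and observes shutdown.

```go
package main

import (
	"fmt"
	"sync"
)

type provider struct {
	mu     sync.Mutex
	cond   *sync.Cond
	closed bool
	seq    int
}

func newProvider() *provider {
	p := &provider{}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// close marks the provider as shut down and wakes all waiters, so no
// subscriber stays parked in Wait after the closer goroutine exits.
func (p *provider) close() {
	p.mu.Lock()
	p.closed = true
	p.mu.Unlock()
	p.cond.Broadcast()
}

// subscribe waits for updates until the provider shuts down. The loop
// re-checks closed after every wakeup, which also handles the case
// where close ran before we first acquired the lock.
func (p *provider) subscribe() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	for !p.closed {
		p.cond.Wait()
	}
	return p.seq
}

func main() {
	p := newProvider()
	done := make(chan int)
	go func() { done <- p.subscribe() }()
	p.close()
	fmt.Println(<-done) // prints 0; no deadlock even if close wins the race
}
```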


pkg/storage/closedts/storage/storage.go, line 80 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What is f? A constructor? I'd at least leave comment mentioning that, or give this a better name.

Done.


pkg/storage/closedts/storage/storage_test.go, line 26 at r13 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

These examples are really nice!

Thanks! I like this format too.

@tbg
Member Author

tbg commented Jul 24, 2018

Added end-to-end-ish tests (pkg/storage/closedts/container).

@tbg
Member Author

tbg commented Jul 24, 2018

@nvanbenschoten this PR is getting... pretty large. I'm almost done writing tests for closedts, and next up would be hooking this up into the actual system. I'd much prefer doing that latter part in its own PR, which means that what's here should be reviewed with that in mind.

tbg added 8 commits July 25, 2018 12:33
jeez, these shook out a bunch of bugs.
Contributor

@nvb nvb left a comment


:lgtm_strong:

Reviewed 5 of 20 files at r18, 2 of 7 files at r23, 4 of 14 files at r24, 1 of 4 files at r25, 2 of 3 files at r28, 6 of 8 files at r29, 5 of 6 files at r30.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/closedts/provider/provider_test.go, line 48 at r30 (raw file):

	st := cluster.MakeTestingClusterSettings()
	// We'll only unleash the closer loop only when the test is basically done, and

nit: two "only"s


pkg/storage/closedts/container/container_test.go, line 130 at r30 (raw file):

}

func TestTwoNodes(t *testing.T) {

Great test!


pkg/storage/closedts/transport/server.go, line 57 at r30 (raw file):

	ch := make(chan ctpb.Entry, 10)

	// TODO: X*closedts.CloseFraction*TargetInterval

Did you mean to address this now?

Member Author

@tbg tbg left a comment


TFTR! Closing for #27922 (which doesn't have the internal churn that confuses the stress tester).

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/closedts/container/container_test.go, line 130 at r30 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Great test!

Done.


pkg/storage/closedts/transport/server.go, line 57 at r30 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Did you mean to address this now?

No, but I made this a proper TODO.

@tbg tbg closed this Jul 25, 2018
craig bot pushed a commit that referenced this pull request Jul 26, 2018
27922: storage: add closed timestamp subsystem r=nvanbenschoten a=tschottdorf

NB: This is a reopen of #27595 which needs a fix to the CI stress tester
to merge before it can pass CI (and I don't want to wait for that). No
need to review.

----

This commit introduces the basic closed timestamp subsystem. It does
not hook it up into the CockroachDB node yet.

Great care has been taken to make each individual component self-
contained and cleanly interfaced to the outside world. The package
layout relative to `closedts` is the following:

.: slim root package containing only the most basic CT interfaces
./ctpb: protobuf definitions and service
./config: fat package pulling in everything needed to actually run
./storage: storage for per-node closed timestamp state
./minprop: minimum proposal timestamp tracker
./provider: the Provider, centralizing all closed timestamp information.

Most of these packages depend on `closedts` (more precisely, on the
interfaces they need to use), but in turn don't depend on each other.

The following diagram together with the `ctconfig` and `ct` packages
should be instructive in getting a bird's eye view of how the subsystem
fits into CockroachDB: All of the components are grouped in a Container
(intended as a pass-around per-instance Singleton), which is passed down to
the Stores via the StoreConfig. Replicas proposing commands talk to the
Tracker; replicas trying to serve follower reads talk to the Provider,
which receives closed timestamp updates for the local node and its
peers.

```
                            Node 1 | Node 2
                                   |
+---------+  Close  +-----------+  |  +-----------+
| Tracker |<--------|           |  |  |           |
+-----+---+         | +-------+ |  |  | +-------+ |  CanServe
      ^             | |Storage| |  |  | |Storage| |<---------+
      |             | --------+ |  |  | +-------+ |          |
      |Track        |           |  |  |           |     +----+----+
      |             | Provider  |  |  | Provider  |     | Follower|
      |             +-----------+  |  +-----------+     | Replica |
      |                 ^                  ^            +----+----+
      |                 |Subscribe         |Notify           |
      |                 |                  |                 |
+---------+             |      Request     |                 |
|Proposing| Refresh +---+----+ <----- +----+-----+  Request  |
| Replica |<--------| Server |        |  Clients |<----------+
+---------+         +--------+ -----> +----------+  EnsureClient
                                 CT
```

Release note: None

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@tbg tbg deleted the ct-updates branch July 26, 2018 09:48