ptreconcile,server: rework ptreconciler for multi-tenant #75688
craig[bot] merged 1 commit into cockroachdb:master
Conversation
(Suggestion: can the explanation in this commit message be a comment closer to the code?)
Force-pushed 7918494 to 80e6af1
miretskiy left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, and @arulajmani)
pkg/kv/kvserver/protectedts/ptreconcile/reconciler.go, line 34 at r1 (raw file):
```go
// When set to zero - disables the report generation.
var ReconcileInterval = settings.RegisterDurationSetting(
	settings.TenantWritable,
```
why would this be tenant writable? Are we okay w/ tenant disabling this? Or setting it to 1ns?
Hmm, good point. Every reconciliation will query the jobs/schedules table for a particular record, so setting this to 1ns might not be ideal. OTOH the transaction is very short-lived in that it's a point query, and then a release of the record, if at all. My hunch is that we don't really see this being a problem. Jobs are, or at least we hope they are, more robust than before, and they clean up properly after themselves. I was actually ignorant of these classes until you brought it up; going by that, my vote is we flip it to TenantReadOnly.
ajwerner left a comment
Reviewed 1 of 20 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miretskiy)
pkg/kv/kvserver/protectedts/ptreconcile/reconciler.go, line 34 at r1 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
why would this be tenant writable? Are we okay w/ tenant disabling this? Or setting it to 1ns?
I agree with tenant read-only
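For reference, the reclassified registration discussed above would look something like this. This is a sketch based on the snippet quoted at the top of the thread; the setting name and the exact argument list are assumptions. The key change is the class: `TenantReadOnly` makes the value visible to tenants but settable only by the system tenant, so a tenant can neither disable reconciliation nor crank it down to 1ns.

```go
// When set to zero - disables the report generation.
var ReconcileInterval = settings.RegisterDurationSetting(
	settings.TenantReadOnly, // was settings.TenantWritable
	"kv.protectedts.reconciliation.interval",
	"the frequency for reconciling jobs with protected timestamp records",
	5*time.Minute,
)
```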
pkg/sql/exec_util.go, line 1203 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
This is carried forward from PTS v1, so I'm unsure of what the reasoning was when that was written. @ajwerner might remember, but my guess would be some sort of dep cycle.
Just didn't feel moved to abstract it into an interface because it was just a thing getting kicked off by the server that had no methods anybody used; now there are methods. Honestly, what are your thoughts on constructing this in the job rather than plumbing it down here? I think that upholds the same principle. I feel like for the spanconfig stuff, the plumbing is only justified by the weight of the dependency injection it'd take to construct such a thing in the job.
Is there any part in particular you suggest? I thought the interesting part was the fact that we use the auto span config job as our singleton, for which I already have a short blurb here.
Good point, I think this is doable since execCtx has everything we need. The only piece I'm unsure about is how to add the corresponding reconciler metric struct, if we start init'ing the reconciler in the span config job Resume hook. We could plumb the metrics registry down, but is that any better than plumbing the one-time init'ed reconciler?
ajwerner left a comment
That's a sufficiently legit reason to keep constructing it in the server. Consider making an interface for the thing in protectedts?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miretskiy)
Force-pushed 80e6af1 to 163be93
```go
// Metrics encapsulates the metrics exported by the reconciler.
type Metrics struct {
	ReconcilationRuns    *metric.Counter
	RecordsProcessed     *metric.Counter
	RecordsRemoved       *metric.Counter
	ReconciliationErrors *metric.Counter
}

var _ metric.Struct = (*Metrics)(nil)

// MetricStruct makes Metrics a metric.Struct.
func (m *Metrics) MetricStruct() {}
```
This part doesn't need to move, the server still holds on to the concrete struct.
Yup, moved it back and exported the constructor instead.
```go
// configured StatusFunc.
StartReconciler(ctx context.Context, stopper *stop.Stopper) error
// Metrics returns the metrics exported by the reconciler.
Metrics() *Metrics
```
I don't think you need to export this.
Since we want to subsume the Reconciler interface in the Provider interface, the initialization of the reconciler needs to happen at https://github.com/cockroachdb/cockroach/pull/75688/files#diff-47ad9f8fa32a4e686099e2e89dad4beca96772073e042d746b771bc58ffb9f01R55. So we don't have a handle to the concrete type of the reconciler in server/tenant.go to grab its metrics struct. I removed this method and exported the constructor of the Metrics struct instead. This is then plumbed into ptprovider.New().
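The shape of that change can be sketched in isolation. All names below are illustrative stand-ins for the cockroach types, not the real API: the point is that the metrics constructor is exported so the server can build and register the struct itself, then hand it to the reconciler, rather than reaching through the reconciler to get at its metrics.

```go
package main

import "fmt"

// Counter stands in for metric.Counter.
type Counter struct{ n int64 }

func (c *Counter) Inc() { c.n++ }

// Metrics is the reconciler's metric struct. The constructor is exported
// so the server can build it, register it, and hand it to the provider.
type Metrics struct {
	ReconciliationRuns   *Counter
	ReconciliationErrors *Counter
}

// MakeMetrics is the exported constructor described above.
func MakeMetrics() *Metrics {
	return &Metrics{
		ReconciliationRuns:   &Counter{},
		ReconciliationErrors: &Counter{},
	}
}

// Reconciler receives the pre-built metrics instead of exposing them.
type Reconciler struct{ metrics *Metrics }

func NewReconciler(m *Metrics) *Reconciler { return &Reconciler{metrics: m} }

func (r *Reconciler) runOnce() { r.metrics.ReconciliationRuns.Inc() }

func main() {
	m := MakeMetrics() // the server builds and registers this
	r := NewReconciler(m)
	r.runOnce()
	fmt.Println(m.ReconciliationRuns.n) // prints 1
}
```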
Force-pushed 163be93 to 8549d99
ajwerner left a comment
One more round and I'll be happy
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @arulajmani, and @miretskiy)
pkg/kv/kvserver/protectedts/protectedts.go, line 165 at r2 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
Since we want to subsume the Reconciler interface in the Provider interface, the initialization of the reconciler needs to happen https://github.com/cockroachdb/cockroach/pull/75688/files#diff-47ad9f8fa32a4e686099e2e89dad4beca96772073e042d746b771bc58ffb9f01R55. So we don't have a handle to the concrete type of the reconciler in server/tenant.go to grab its metrics struct. I removed this method and exported the constructor of the Metrics struct instead. This is then plumbed into ptprovider.New().
This is getting nit-picky, but I'd rather you just gave the Provider a Metrics() method which returned metric.MetricStruct as opposed to plumbing the metrics into the provider. That gives you a seamless way to add more metrics later in the provider without touching server code by potentially nesting the reconciler metric struct in some other metric struct later.
Force-pushed 8549d99 to 158f05c
I like it, thanks for the suggestion, done.
ajwerner left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @arulajmani, and @miretskiy)
pkg/kv/kvserver/protectedts/ptprovider/provider.go, line 47 at r4 (raw file):
```go
protectedts.Cache
protectedts.Reconciler
metric.Struct
```
Heh, one more: instead of this, can we have a method Metrics() that returns the metric.Struct and delegates to ptreconcile.Metrics()?
pkg/kv/kvserver/protectedts/ptreconcile/metrics.go, line 27 at r4 (raw file):
```go
func makeMetrics() *Metrics {
	return &Metrics{
```
I think you can now revert this
pkg/kv/kvserver/protectedts/ptreconcile/reconciler.go, line 86 at r4 (raw file):
```go
// Metrics returns the reconciler's metrics.
func (r *Reconciler) Metrics() *Metrics {
	return r.metrics
```
just take the address like it had before
Hmm, do you mean something like: … That would need us to expose …
In my head the …
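The delegation ajwerner is asking for can be sketched as follows. The names here are stand-ins (not the real cockroach packages): the Provider exposes only the generic metric-struct interface and delegates to the reconciler's concrete Metrics(), which means the Provider can later return a wrapper that nests the reconciler metrics alongside new provider-level metrics without any change to server code.

```go
package main

import "fmt"

// Struct stands in for metric.Struct: anything the metrics registry can walk.
type Struct interface{ MetricStruct() }

// ReconcilerMetrics stands in for ptreconcile.Metrics.
type ReconcilerMetrics struct{ Runs int }

func (*ReconcilerMetrics) MetricStruct() {}

// Reconciler owns its concrete metric struct.
type Reconciler struct{ metrics ReconcilerMetrics }

// Metrics returns the reconciler's concrete metric struct.
func (r *Reconciler) Metrics() *ReconcilerMetrics { return &r.metrics }

// Provider exposes only the generic Struct, delegating to the reconciler.
// More metrics can be nested behind this method later without touching
// the server code that registers the result.
type Provider struct{ reconciler Reconciler }

func (p *Provider) Metrics() Struct { return p.reconciler.Metrics() }

func main() {
	p := &Provider{}
	fmt.Printf("%T\n", p.Metrics()) // prints *main.ReconcilerMetrics
}
```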
Force-pushed 158f05c to a4bf9c0
The ptreconciler is responsible for periodically scanning the `system.pts_records` table and checking if there are any stale records to be released, based on callbacks registered on server startup. Previously, the ptreconciler used the meta1 leaseholder to ensure that there was only one instance of it running in the cluster. Additionally, it was reliant on the ptcache to iterate over the records when checking whether they were stale.

In the multi-tenant version of the protected timestamp subsystem, the SQL pod running the reconciler cannot use the meta1 leaseholder to determine whether or not it should run the reconciliation loop. To get around this, we move the `Start` of the ptreconciler to the Resume hook of the auto span config job. We are guaranteed, via the spanconfig manager, that there will always be at most one instance of this job in a cluster. Further, this is a forever-running job, so we can tie the execution of the ptreconciler to the lifetime of the spanconfig job resumer. Additionally, since we will be doing away with the ptcache, we switch over to doing a full table scan every time the reconciliation loop is run. While not ideal, this is not alarming since we have a conservative limit on the total size of all records that can be stored in the table, and reconciliation only runs once every 5 minutes by default. Additionally, we do not expect many concurrent BACKUP/CDC jobs to exist in the cluster at a given point in time.

This change also refactors some of the server and tenant code to plumb a ptreconciler to the ExecutorConfig, for use by the auto span config job. We move the relevant job+schedule tests into a ccl package to allow testing from within a secondary tenant.

Informs: cockroachdb#73727

Release note: None
Force-pushed a4bf9c0 to 500d190
I've seen this bazel timeout on other PRs, pretty sure it is unrelated. Started an internal thread to discuss, but going ahead with a merge here. TFTR!

Build failed (retrying...):

bors r+

Already running a review

bors r+

Already running a review

Build failed (retrying...):

Build succeeded: