kvserver,backupccl: make export, gc subject to admission control#71109
kvserver,backupccl: make export, gc subject to admission control#71109craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
Nice! Any chance you're interested in splitting this up, to put GC and Export changes into separate commits, or maybe even separate PRs? Also I'm not quite sure about this: "The priority is set to admission.LowPri since these are background activities that should not be preferred over interactive user traffic". First, sometime But also, even when used by background jobs (backups) on row data, the priority of of individual ExportRequests is currently explicitly managed: they can start at lower priority but with a lock policy of error to cause them to be put on the back of the queue for retry, but then on retry, if sufficient (and configurable) time has elapsed since the read timestamp, they actually invoke maximum priority, since based on our user feedback, complying with retention/rpo policies is often important. |
|
Are you sure about wanting to apply admission control to GC requests, especially on low (I could see why you'd want to put them through admission control, I'm specifically commenting on ranking it lower than user traffic)? The GC queue is on a time budget and failure to perform significant amounts of work within the budget can lead to ranges beginning to backpressure as well as foreground traffic performance degradation. |
5c3be6f to
993d9e3
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
TFTRs!
Any chance you're interested in splitting this up, to put GC and Export changes into separate commits, or maybe even separate PRs?
I was hoping not to. I am possibly an outlier, but find it harder cognitively to handle a sequence of small commits in a PR (this would be 3 commits (a) the refactor in node.go and store.go, (b) the change for export (c) the change for GC).
If you have a strong opinion, I will do so.
Also I'm not quite sure about this: "The priority is set to admission.LowPri since these are background activities that should not be preferred over interactive user traffic". First, sometimeExportRequests are sent in the foreground, namely when fetching schema revisions where it is used a revision-capable, albeit harder to use, alternative to ScanRequest, and in these cases it is sometimes in a foreground, user transaction (indeed, when used for this purpose during backup planning, that txn actually tends to lock the jobs table for all new adoption and introspection, so starvation would be bad there).
I didn't realize there were cases that used a foreground user txn or that it locks the jobs table.
I've changed this to NormalPri and added a TODO(bulkio) with a summary of some of the things you said here.
Are you sure about wanting to apply admission control to GC requests, especially on low (I could see why you'd want to put them through admission control, I'm specifically commenting on ranking it lower than user traffic)? The GC queue is on a time budget and failure to perform significant amounts of work within the budget can lead to ranges beginning to backpressure as well as foreground traffic performance degradation.
@tbg I've changed this to NormalPri, and added a TODO(kv) with a summary of your comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @RaduBerinde)
|
Now that I read the diff and not just the summary: The ExportRequests being sent there, in backup_processor.go, are indeed all background; I'd be fine with either priority on them as long as we have something like the TODO you added about varying it based on retry status to be just like we do with the existing The ExportRequests that are sent in foreground user transactions are sent by storageccl/revision_reader.go's |
tbg
left a comment
There was a problem hiding this comment.
Reviewed 2 of 5 files at r1, 2 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @RaduBerinde, and @sumeerbhola)
pkg/server/node.go, line 347 at r3 (raw file):
spanConfigAccessor: spanConfigAccessor, } n.storeCfg.KVAdmissionController = n
Now store references node references store etc. Could we pass something in here that is scoped more tightly? It doesn't seem appealing to me that all of the methods sit on Node directly. Can't we carve out kvAdmissionQ and storeGrantCoords into a type that we can then pass in here?
They are now marked with AdmissionHeader_ROOT_KV, which stops them from bypassing admission control. The priority is set to admission.NormalPri since there are cases where they do need to complete in a very timely manner. There are TODOs in the integration code to make more sophisticated priority assignments and use LowPri when possible. Informs cockroachdb#65957 Release note: None
993d9e3 to
a814577
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
I'd be fine with either priority on them as long as we have something like the TODO you added about varying it based on retry status to be just like we do with the existing UserPriority header field (I wonder if we should unify these?).
Thanks. Updated the comment, including mentioning that this could be a function of the UserPriority.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @RaduBerinde, and @tbg)
pkg/server/node.go, line 347 at r3 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Now store references node references store etc. Could we pass something in here that is scoped more tightly? It doesn't seem appealing to me that all of the methods sit on
Nodedirectly. Can't we carve outkvAdmissionQandstoreGrantCoordsinto a type that we can then pass in here?
Done, by moving the implementation to kvserver in store.go.
tbg
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r4, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @RaduBerinde)
pkg/server/node.go, line 347 at r3 (raw file):
Previously, sumeerbhola wrote…
Done, by moving the implementation to
kvserverinstore.go.
Thank you! Looks good. I did not review the method impls on the assumption that they were pure code movement with the exception of accounting for the receiver name etc.
sumeerbhola
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @RaduBerinde)
pkg/server/node.go, line 347 at r3 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Thank you! Looks good. I did not review the method impls on the assumption that they were pure code movement with the exception of accounting for the receiver name etc.
Yes, they were purely code movement.
|
bors r+ |
|
Build succeeded: |
They are now marked with AdmissionHeader_ROOT_KV, which stops
them from bypassing admission control. The priority is set to
admission.NormalPri since there are cases where they do need
to complete in a very timely manner. There are TODOs in the
integration code to make more sophisticated priority
assignments and use LowPri when possible.
Informs #65957
Release note: None