*: use bootstrap.BootstrappedSystemIDChecker in tests#75464
*: use bootstrap.BootstrappedSystemIDChecker in tests#75464craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
7695066 to
108cd22
Compare
ajwerner
left a comment
There was a problem hiding this comment.
This is fine. I sort of think it might be better to plumb in an object as we might eventually make it so that we bootstrap testclusters with a store dir from an old version or from a different upgrade path, but for now this is good.
1919854 to
1ecb905
Compare
|
Sorry for the CI noise. I'm getting there. I followed up on this PR by removing the |
ajwerner
left a comment
There was a problem hiding this comment.
other than the restore questions
Reviewed 20 of 52 files at r1, 7 of 9 files at r2, 28 of 28 files at r3, 37 of 41 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @srosenberg)
pkg/ccl/backupccl/restore_job.go, line 121 at r4 (raw file):
ctx, span := tracing.ChildSpan(ctx, "WriteDescriptors") defer span.Finish() defer func() {
did anything change here other than de-denting and decorating the error? I didn't spot anything
pkg/ccl/backupccl/restore_job.go, line 756 at r4 (raw file):
for id := range details.DescriptorRewrites { if id > tempSystemDBID { tempSystemDBID = id
I wish that I understood this logic. It's going to set the tempSystemDBID to the max value of the rewrite sources? What is that?
It seems to me like this maybe just works because we assume the database is empty and so we never end up in this loop? I'm stumped by this code. @dt do you have any ideas? This was added in #55685.
I'm not blocking this PR on retaining this logic, but I do want to understand what's going on here.
pkg/ccl/backupccl/restore_job.go, line 881 at r4 (raw file):
var tempSystemDBID descpb.ID if details.DescriptorCoverage == tree.AllDescriptors { for id := range details.DescriptorRewrites {
does this have a gnarly edge case if there are no rewrites? In particular, imagine restoring an empty full cluster backup.
pkg/ccl/backupccl/restore_job.go, line 2490 at r4 (raw file):
if table.GetParentID() != tempSystemDBID { continue }
Does this definitely do nothing? Seems like this could very easily be meaningful.
Code quote:
if table.GetParentID() != tempSystemDBID {
continue
}pkg/keys/system.go, line 44 at r3 (raw file):
// MinUserDescriptorID returns the smallest possible non-system descriptor ID // after a cluster is bootstrapped. func MinUserDescriptorID(idChecker SystemIDChecker) uint32 {
Note to self: come back and caveat and/or remove this function
Code quote:
// MinUserDescriptorID returns the smallest possible non-system descriptor ID
// after a cluster is bootstrapped.
func MinUserDescriptorID(idChecker SystemIDChecker) uint32 {pkg/sql/catalog/bootstrap/metadata.go, line 409 at r3 (raw file):
// from the minimum value allowed in a simple unit test setting. func TestingUserDescID(offset uint32) uint32 { return keys.MinUserDescriptorID(BootstrappedSystemIDChecker()) + offset
in practice doesn't this end up being dumb because of the defaultdb and postgres databases? We should just bootstrap those too and get rid of the forever migration. I hate that thing. Leave it to me for a subsequent PR. Just mentioning that this isn't really right and might work by happenstance in the few use cases.
postamar
left a comment
There was a problem hiding this comment.
Thanks for the review. I provided some answers and added some comments in the code that should satisfy you.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @srosenberg)
pkg/ccl/backupccl/restore_job.go, line 121 at r4 (raw file):
Previously, ajwerner wrote…
did anything change here other than de-denting and decorating the error? I didn't spot anything
Nothing of substance changed here. The error decoration was, in fact, already there. I removed the closure because it was useless and annoying. I meant to put that in a separate commit and forgot, sorry.
pkg/ccl/backupccl/restore_job.go, line 756 at r4 (raw file):
Previously, ajwerner wrote…
I wish that I understood this logic. It's going to set the tempSystemDBID to the max value of the rewrite sources? What is that?
It seems to me like this maybe just works because we assume the database is empty and so we never end up in this loop? I'm stumped by this code. @dt do you have any ideas? This was added in #55685.
I'm not blocking this PR on retaining this logic, but I do want to understand what's going on here.
Note: for some reason reviewable is placing this comment nowhere near the code it's quoting, which pertains to the computation of tempSystemDBID when executing the restore.
It took me a while to figure out too. The restore planner pre-allocates enough descriptor IDs for each of the descriptors in the backup, plus one extra for the temporary system database in the case where it's planning full cluster restore. Thus, the temporary system database descriptor ID is the last in this pre-allocated range, which is encoded in the keys of DescriptorRewrites.
This also explains how we "know" that DescriptorRewrites isn't empty: it never is in the case of a full cluster restore, because we always at least restore the system database.
I mean, really, this ID should be stored in a separate field in the restore job details, instead of inferring it like this. I'll add a comment explaining this.
pkg/ccl/backupccl/restore_job.go, line 881 at r4 (raw file):
Previously, ajwerner wrote…
does this have a gnarly edge case if there are no rewrites? In particular, imagine restoring an empty full cluster backup.
As explained above, the if above ensures that there is at least one rewrite, that of the temp system db desc.
pkg/ccl/backupccl/restore_job.go, line 2490 at r4 (raw file):
Previously, ajwerner wrote…
Does this definitely do nothing? Seems like this could very easily be meaningful.
I'm not finding this in my current revision, and the comment seems out of place in reviewable. Is this comment stale?
pkg/keys/system.go, line 44 at r3 (raw file):
Previously, ajwerner wrote…
Note to self: come back and caveat and/or remove this function
This has since been removed.
pkg/sql/catalog/bootstrap/metadata.go, line 409 at r3 (raw file):
Previously, ajwerner wrote…
in practice doesn't this end up being dumb because of the
defaultdbandpostgresdatabases? We should just bootstrap those too and get rid of the forever migration. I hate that thing. Leave it to me for a subsequent PR. Just mentioning that this isn't really right and might work by happenstance in the few use cases.
I certainly have no objection!
1ecb905 to
08d49ca
Compare
Instead of relying on hard-coded constants, tests now rely on the bootstrapped system schema to determine which IDs belong to system tables or not. Release note: None
08d49ca to
424d476
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @srosenberg)
pkg/ccl/backupccl/restore_job.go, line 2490 at r4 (raw file):
Previously, postamar (Marius Posta) wrote…
I'm not finding this in my current revision, and the comment seems out of place in reviewable. Is this comment stale?
Well that's exactly it, you removed this logic, but it's not clear that this logic was doing nothing. It was checking whether the table had its parentID set to the tempSystemDBID. It seem possible that there was a code path that called this with tables in the slice that had some other parent set and thus should be skipped.
pkg/keys/system.go, line 44 at r3 (raw file):
Previously, postamar (Marius Posta) wrote…
This has since been removed.
💯
postamar
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @srosenberg)
pkg/ccl/backupccl/restore_job.go, line 2490 at r4 (raw file):
Previously, ajwerner wrote…
Well that's exactly it, you removed this logic, but it's not clear that this logic was doing nothing. It was checking whether the table had its parentID set to the tempSystemDBID. It seem possible that there was a code path that called this with tables in the slice that had some other parent set and thus should be skipped.
I misunderstood, sorry. I'll put it back. Chesterton's fence and all that.
Code quote:
if table.GetParentID() != tempSystemDBID {
continue
}Now that the non-test use cases of the keys.SystemIDChecker oracle have been clearly identified, these have been replaced with something equivalent which doesn't require such an oracle. In so doing, we reintroduce the keys.MaxReservedDescID = 49 constant, as this magic value still has a special significance when it comes to splitting ranges. It also serves as a lower bound on the IDs of non-system descriptors. Release note: None
424d476 to
9246c20
Compare
|
bors r+ |
|
Build succeeded: |
Instead of relying on hard-coded constants, tests now rely on the
bootstrapped system schema to determine which IDs belong to system
tables or not.
Release note: None