Change dataset table to only hold Crucible datasets#7386
Conversation
|
|
||
| /// Database representation of a Crucible dataset's live information. | ||
| /// | ||
| /// This includes the socket address of the Crucible downstairs that owns this |
| #[derive(Debug, Default, Serialize, Deserialize)] | ||
| pub struct CrucibleResourcesV2 { | ||
| pub datasets_and_regions: Vec<(Dataset, Region)>, | ||
| pub datasets_and_regions: Vec<(CrucibleDataset, Region)>, |
There was a problem hiding this comment.
We'll need to be able to deserialize these existing blobs: will the JSON format change with the type change here? I don't think so but I've been wrong before!
There was a problem hiding this comment.
It does change but only in ways that should be compatible, I think:
- The old JSON blobs will have
zone_name,quota,reservation, andcompressionfields that we'll ignore when deserializing. - The old JSON blobs could have had
nullfor theip,port, orsize_usedfields. If they did, we will now fail to deserialize them. But those should have always been set for Crucible datasets, so should not havenull?
I was planning to take this through a "deploy R12 on a racklette, upgrade to this branch" test before merging. (Although I'd like to wait to do that until I have the next, smaller PR up, to test both at the same time.) What should I do to the racklette before upgrading to ensure I have some of these JSON blobs present?
There was a problem hiding this comment.
Ooh, maybe a better check: where is this JSON in CRDB? I can check dogfood and see if we have any that would cause problems.
There was a problem hiding this comment.
Hm well that didn't work. Assuming this is volume.resources_to_clean_up, dogfood only has empty resources for V1 and V2, and V3 looks like it only has IDs in it?
root@[fd00:1122:3344:105::3]:32221/omicron> select resources_to_clean_up from volume where resources_to_clean_up is not null and resources_to_clean_up like '{"V2%';
resources_to_clean_up
---------------------------------------------------------------
{"V2":{"datasets_and_regions":[],"snapshots_to_delete":[]}}
{"V2":{"datasets_and_regions":[],"snapshots_to_delete":[]}}
(2 rows)
Time: 12ms total (execution 12ms / network 0ms)
root@[fd00:1122:3344:105::3]:32221/omicron> select resources_to_clean_up from volume where resources_to_clean_up is not null and resources_to_clean_up like '{"V1%';
resources_to_clean_up
------------------------------------------------------------------
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
{"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
(13 rows)
There was a problem hiding this comment.
Yeah, unfortunately in this case volume records are eventually hard deleted.
What should I do to the racklette before upgrading to ensure I have some of these JSON blobs present?
Actually, I think test_deserialize_old_crucible_resources covers this case!
| let dataset = dataset?; | ||
| let id = dataset.id(); | ||
|
|
||
| // If this dataset already exists, do nothing. |
There was a problem hiding this comment.
This looks like a behaviour change from what was here before?
There was a problem hiding this comment.
I'll make the claim it is not, even though it looks like it is. 😅
Before, we had this:
// If this dataset already exists, only update it if it appears different from what exists
// in the database already.
let action = if let Some(db_dataset) = existing_datasets.remove(&id) {
let db_config: DatasetConfig = db_dataset.try_into()?;
let bp_config: DatasetConfig = bp_dataset.clone().try_into()?;
if db_config == bp_config {
num_unchanged += 1;
continue;
}When flattening a db_dataset down to a DatasetConfig to see if there are any relevant changes, we were throwing away ip, port, and size_used, only leaving us with information derived from pool_id, zone_name, quota, reservation, and compression:
impl TryFrom<Dataset> for omicron_common::disk::DatasetConfig {
type Error = Error;
fn try_from(dataset: Dataset) -> Result<Self, Self::Error> {
let compression = if let Some(c) = dataset.compression {
c.parse().map_err(|e: anyhow::Error| {
Error::internal_error(&e.to_string())
})?
} else {
omicron_common::disk::CompressionAlgorithm::Off
};
Ok(Self {
id: dataset.identity.id.into(),
name: omicron_common::disk::DatasetName::new(
omicron_common::zpool_name::ZpoolName::new_external(
ZpoolUuid::from_untyped_uuid(dataset.pool_id),
),
dataset.kind.try_into_api(dataset.zone_name)?,
),
inner: omicron_common::disk::SharedDatasetConfig {
quota: dataset.quota.map(|q| q.into()),
reservation: dataset.reservation.map(|r| r.into()),
compression,
},
})
}
}We dropped all the those fields we were comparing except pool_id. So the equivalent code here would be something like:
if existing_dataset.pool_id == dataset.pool_id {
num_unchanged += 1;
continue;
}but I don't believe it's possible for a dataset's pool ID to change, right? So I dropped what would look like a weird / spurious check and replaced it with just the "have we inserted this already" check that's here now.
If it is possible for a dataset's pool ID to change, the new code is a behavior change and is wrong.
A separate question is: is it possible for a dataset's IP or port to change? If so, this code is wrong, but so was the old code.
There was a problem hiding this comment.
We dropped all the those fields we were comparing except pool_id.
I don't follow this part.
I don't believe it's possible for a dataset's pool ID to change, right?
I believe the same thing haha
There was a problem hiding this comment.
We dropped all the those fields we were comparing except pool_id.
I don't follow this part.
The old code for deciding whether to try to upsert was this:
// If this dataset already exists, only update it if it appears different from what exists
// in the database already.
let action = if let Some(db_dataset) = existing_datasets.remove(&id) {
let db_config: DatasetConfig = db_dataset.try_into()?;
let bp_config: DatasetConfig = bp_dataset.clone().try_into()?;
if db_config == bp_config {
num_unchanged += 1;
continue;
}Those DatasetConfig structs are only comparing a subset of fields: pool_id, zone_name, quota, reservation, and compression. Of all those fields, the new CrucibleDataset only has pool_id. So the behaviorally-equivalent code would be to just compare "does the blueprint dataset pool_id match the database dataset pool_id", but that's a silly check to make when we know it can't change.
| if !matches!(zone.zone_type, BlueprintZoneType::Crucible(_)) { | ||
| return None; | ||
| } | ||
| let dataset = zone |
There was a problem hiding this comment.
is the type here CrucibleDataset?
There was a problem hiding this comment.
No, it's DurableDataset, which contains a reference to an OmicronZoneDataset:
omicron/nexus/types/src/deployment/zone_type.rs
Lines 237 to 241 in db656d7
| // datasets. | ||
| // | ||
| // It should only insert these new zones. | ||
| let new_zones = [ |
There was a problem hiding this comment.
This seems wrong, though I haven't looked deeply into this test: it's inserting two of the same zone that are different types than before this diff.
There was a problem hiding this comment.
I think this is okay, but please double check me. Before, the execution code would insert dataset records for any DatasetKind since dataset held all dataset kinds; now, it only inserts records for Crucible datasets. The test is trying to ensure that new datasets added after the last execution are added, which is now restricted to "new Crucible datasets added after the last execution are added", so the new zones we use to test have to be Crucible zones.
There was a problem hiding this comment.
Before, the execution code would insert dataset records for any DatasetKind since dataset held all dataset kinds; now, it only inserts records for Crucible datasets.
I think this is why I was suspicious: based on that understanding this vec should have had four members, mixing together the crucible and non-crucible datasets.
There should be two versions of this test: one for "durable" datasets and one for crucible datasets.
There was a problem hiding this comment.
I might be missing something - does it help if all_datasets already included Crucible datasets? (Guaranteed by https://github.com/oxidecomputer/omicron/pull/7386/files/26ccf5f2e0f22721cd4d1d26bac4d9c4dc659e6b#diff-d4165f273b361528feca0c934574df6365cd86f5688ab81665f35975620183a1R312-R316, and then we pass in all_datasets.iter().chain(new_zones); that used to provide a mix of crucible and non-crucible, and now just passes in "more crucible".)
There was a problem hiding this comment.
We chatted about what made this test confusing; I took a stab at addressing it in 1201f9a
| kind omicron.public.dataset_kind NOT NULL, | ||
| /* | ||
| * Contact information for the dataset: socket address of the Crucible | ||
| * downstairs service that owns this dataset |
| set local disallow_full_table_scans = off; | ||
|
|
||
| DELETE FROM omicron.public.crucible_dataset WHERE kind != 'crucible'; |
There was a problem hiding this comment.
I was going to ask if we wanted to add a constraint that the kind must be 'crucible' before dropping, to ensure we've cleared other dataset types, but I hesitated to recommend that because "blocking the update" also seems bad.
To confirm, if there happen to be non-crucible entries for any reason, we're okay dropping them quietly?
There was a problem hiding this comment.
| (kind != 'zone') OR | ||
| (kind = 'zone' AND zone_name IS NOT NULL) | ||
| ) | ||
| size_used INT NOT NULL |
There was a problem hiding this comment.
Not your fault that this isn't labelled, but could we add: "this field is controlled by Nexus, and is updated as region allocations are performed from this dataset"
|
|
||
| /* | ||
| * A dataset of allocated space within a zpool. | ||
| * Table tracking the contact information and size used by datasets associated |
There was a problem hiding this comment.
Nit: Is it worth documenting that this is a rendezvous table?
| ip -> Inet, | ||
| port -> Int4, |
There was a problem hiding this comment.
nice, thanks for making these non-nullable, makes this a lot easier to use
| } | ||
|
|
||
| pub async fn dataset_upsert_if_blueprint_is_current_target( | ||
| pub async fn crucible_dataset_upsert_if_blueprint_is_current_target( |
There was a problem hiding this comment.
(no action needed) since this is invoked by the blueprint executor to ensure dataset records exist, I assume this can also go away when we move more fully to the rendezvous table model.
(all to say, thanks for this PR, this is a great direction)
There was a problem hiding this comment.
Yep, that's the plan for the next PR 👍
| } | ||
|
|
||
| datastore | ||
| .dataset_delete_if_blueprint_is_current_target( |
There was a problem hiding this comment.
Rad to see this unified, it felt gross as it was written (deletion of a table row happening in multiple possible spots). Thanks again for this change.
|
Notes from testing this PR along with its successor #7397 on |
…r RPW to rendezvous RPW (#7397) This PR builds on #7386. I'll test both of them together on a racklette (will put results in a comment on this PR). This PR contributes to addressing (but doesn't fully address) a couple issues: * #7392: This PR adds useful output for the `crucible_dataset` rendezvous table, but we still need useful output for `debug_dataset` * #6999: This PR removes a step from execution that currently depends on successful completion of earlier step to post datasets to sled-agents, since we now confirm insertions are okay via inventory collection I'm inclined to say this PR fully addresses a couple other issues, but it's debatable: * #7210: Are we okay with IP and port still being here now that this table is just crucible datasets? (IMO it's still conceptually weird to say a dataset has an IP/port, but maybe it's not as bad now?) * #6998 `dataset` as a general table is now gone and replaced with more specific rendezvous tables, so I think this one can be closed?
The diff of this PR is large but very mechanical. The actual changes here are:
datasettable has been renamedcrucible_dataset(and correspondingly, theDatasetmodel struct is nowCrucibleDataset, andDatastore::dataset_*methods are nowDatastore::crucible_dataset_*)zone_namecolumn has been dropped (it was always NULL for Crucible datasets, enforced by aCHECKconstraint)quota,reservation, andcompressioncolumns have been dropped (they were always NULL for Crucible datasets, although this was not enforced by aCHECKconstraint; this information is kept in the blueprint and is not read from this table)ip,port, andsize_usedcolumns now haveNOT NULLconstraints (they already hadNOT NULLconstraints specifically for Crucible datasets via aCHECKconstraint; correspondingly, theCrucibleDatasetmodel type has non-optional fields for these columns)and everything else is followup to account for this. Most of that followup is just renaming things. A few nontrivial exceptions:
addressby failing if it wasn't set for Crucible datasets; that error handling is now gone sinceaddressis no longer optional.datasettable using a mix of Crucible and non-Crucible datasets. I tried to update these as reasonably as I could while keeping the spirit of the test in tact, but a closer look at these bits would be good.I think this gets us most of the way toward
crucible_datasetbeing a "rendezvous table" in the RFD 540 sense. As of this PR, the blueprint executor task is still responsible for adding entries to this table, but I'll move that to the rendezvous background task in a followup.Fixes #7299. Gets us halfway to #7301 by fixing the
datasetside of that issue (bp_omicron_datasetstill needs work).