support NVMe Deallocate #1105
Merged
testing: installed the ZFS changes and this Propolis on mb-1. In the Linux guest VM, created a zpool on a local disk and used
iximeow
reviewed
Apr 8, 2026
iximeow
left a comment
Member
nice, this will be nice to have, thanks for working on this. hopefully none of the comments are too surprising!
e370d49 to
eed15c6
Compare
iximeow
approved these changes
Apr 14, 2026
iximeow
left a comment
Member
FWIW I'm a solid +1 on this now, thanks! if you're going to add a test about DatasetManagementCmd with bogus prp1/prp2 then I'll be happy to take a look at that here too. if not (or you have surprise issues), this is a nice improvement as-is.
05eafbf to
0d67440
Compare
rmustacc
reviewed
Apr 15, 2026
This was referenced May 1, 2026
This makes propolis advertise support for the "Dataset Management" NVMe command. It uses ioctl(DKIOCFREE) to pass these requests through to local disks (FileBackend). The requests are ignored on distributed disks (Crucible).
jmpesp added a commit to oxidecomputer/omicron that referenced this pull request May 26, 2026
Update Crucible from `7103cd3a` to `bd9a0e2a`, picking up the following PRs:

- Use an explicit rev for oxidecomputer git deps (oxidecomputer/crucible#1936)
- Add Clone and Deserialize to VolumeInfo et al (oxidecomputer/crucible#1935)
- Update omicron/oximeter (oxidecomputer/crucible#1933)
- [meta] update to drift 0.1.4 (oxidecomputer/crucible#1932)
- Don't log if there is nothing to log (oxidecomputer/crucible#1930)
- Add VolumeInfo (oxidecomputer/crucible#1928)
- Remove bonus Volume layer (oxidecomputer/crucible#1927)
- Add session and client id to panic messages (oxidecomputer/crucible#1926)
- [crucible-agent-types] migrate to RFD 619 pattern (oxidecomputer/crucible#1899)
- Background read-only region creation (oxidecomputer/crucible#1919)
- [crucible-downstairs-repair] switch to RFD 619 pattern (oxidecomputer/crucible#1901)
- [crucible-pantry] switch to RFD 619 pattern (oxidecomputer/crucible#1900)
- Use separate in-memory types (oxidecomputer/crucible#1913)
- Remove old field from dtrace action script (oxidecomputer/crucible#1917)
- Retry data writes that return an IO error (oxidecomputer/crucible#1915)
- Bump dropshot to 0.17.0 (oxidecomputer/crucible#1909)
- Reject snapshot requests when read-only (oxidecomputer/crucible#1914)
- update ringbuf method, fix clippy lint (oxidecomputer/crucible#1904)
- bump vergen-v9 version too (oxidecomputer/crucible#1903)
- update dropshot to 0.16.7, dropshot-api-manager to 0.5.2 (oxidecomputer/crucible#1851)
- perf-vol.d updates (oxidecomputer/crucible#1898)
- upgrade progenitor to 0.13, reqwest to 0.13 (oxidecomputer/crucible#1854)
- Remove cargo nextest from github workflow, out of space (oxidecomputer/crucible#1846)
- Add a test for VCR serialize/deserialize (oxidecomputer/crucible#1843)

Update Propolis from `bc489ddf` to `58ab73bd`, picking up the following PRs:

- Bump crucible to latest, update Omicron, use explicit revs (oxidecomputer/propolis#1141)
- Add project and silo ids to VM attestation (oxidecomputer/propolis#1114)
- Update escargot (oxidecomputer/propolis#1139)
- Prefix shebang and mark D scripts as executable (oxidecomputer/propolis#1140)
- Fix error in propolis-server README (oxidecomputer/propolis#1138)
- [meta] update to drift 0.1.4 (oxidecomputer/propolis#1137)
- Fix Intel CPUID leaf 4 cache topology for SMT (oxidecomputer/propolis#1002)
- support NVMe Deallocate (oxidecomputer/propolis#1105)
- viona: do not lose used/avail indices (oxidecomputer/propolis#1135)
- viona: multiqueue device should stay multiqueue across migration (oxidecomputer/propolis#1121)
- Bump crucible rev to latest (oxidecomputer/propolis#1132)
- expand zerocopy IntoBytes/FromByes use in guest memory accesses (oxidecomputer/propolis#1130)
- dropshot-api-manager 0.7.1 (oxidecomputer/propolis#1129)
- improve slog component setting (oxidecomputer/propolis#1124)
- wait for viona Poller to run before declaring device running (oxidecomputer/propolis#1118)
- virtio: tolerate importing queues with adjusted size (oxidecomputer/propolis#1117)
- Run viona unit tests in CI (oxidecomputer/propolis#1120)
- feature gate Crucible-specific boot digest code (oxidecomputer/propolis#1119)

Also:

- ran `cargo update -p vergen`
- removed the `reqwest012` dependency
- removed `reqwest012_client` from Nexus
- ran `cargo hakari generate` and `cargo hakari manage-deps`
- replace use of `ProgenitorOperationRetry` with `retry_operation_while_indefinitely`
- during the region replacement drive saga, consume the new `VolumeInfo` from Propolis and use that to determine when to consider a replacement done
This implements #990. It makes propolis advertise support for the "Dataset Management" NVMe command. It uses `ioctl(DKIOCFREE)` to pass these requests through to local disks (FileBackend). The requests are ignored on distributed disks (Crucible).

Note that our devices are NVMe 1.0e, which specifies that the disk "may" deallocate all provided ranges, and "shall return all zeros, all ones, or the last data written to the associated LBA". The 1.0e spec has no mechanism for telling the guest which of these semantics is actually happening. Future work may include migrating to NVMe 1.1 or later, which can use the DRB (Deallocation Read Behavior) field to tell the guest whether the blocks are actually zeroed or not.
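For readers unfamiliar with the command: a Dataset Management request points at a guest-provided list of 16-byte range definitions, each holding context attributes (4 bytes), a length in logical blocks (4 bytes), and a starting LBA (8 bytes), all little-endian. The following is a minimal sketch of decoding such a list into byte ranges. `ByteRange` and `parse_dsm_ranges` are hypothetical names for illustration and are not taken from the Propolis source.

```rust
use std::convert::TryInto;

/// One decoded Dataset Management range, expressed in bytes.
/// (Hypothetical type for illustration, not a Propolis type.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct ByteRange {
    pub offset: u64,
    pub len: u64,
}

/// Size of one range definition in the guest-provided list.
const RANGE_ENTRY_SIZE: usize = 16;

/// Decode a Dataset Management range list.
///
/// Each 16-byte entry is little-endian: context attributes (bytes 0..4),
/// length in logical blocks (bytes 4..8), starting LBA (bytes 8..16).
/// Returns `None` if the buffer is not a whole number of entries or if a
/// range would overflow a `u64` once converted to bytes.
pub fn parse_dsm_ranges(buf: &[u8], block_size: u64) -> Option<Vec<ByteRange>> {
    if buf.len() % RANGE_ENTRY_SIZE != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(buf.len() / RANGE_ENTRY_SIZE);
    for entry in buf.chunks_exact(RANGE_ENTRY_SIZE) {
        // Context attributes (bytes 0..4) are ignored in this sketch.
        let nlb = u32::from_le_bytes(entry[4..8].try_into().ok()?) as u64;
        let slba = u64::from_le_bytes(entry[8..16].try_into().ok()?);
        let offset = slba.checked_mul(block_size)?;
        let len = nlb.checked_mul(block_size)?;
        // Reject ranges whose end does not fit in a u64.
        offset.checked_add(len)?;
        out.push(ByteRange { offset, len });
    }
    Some(out)
}
```

A backend could then walk the returned ranges and issue one free request per range, or skip them entirely, which the 1.0e wording quoted above permits.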
This requires the changes for https://github.com/oxidecomputer/stlouis/issues/940 which are under review here.
A few details to be aware of or provide input on:
- `oncs` field). I think this is OK since there is no live migration currently (even if the VM has no local disks). So each time a VM boots, it will be on a specific version of Propolis which either advertises Dataset Management, or not.
- `block::Operation::Discard` now means to discard multiple ranges from the client-provided list, not just one range.
- `probes::block_begin_discard` used to take an offset and length, but I have changed it to take the number of ranges instead. Is this OK? Are there consumers that need to change? We could fire a probe for each range, but then there would be multiple “begin” probes for one devqid, which could be confusing because begin/complete probes would not match up. Similar for `probes::nvme_discard_enqueue`.
- `VirtualDiskStats`, should we add stats for Discard? How do we monitor these? Would we need to add support in consumers?
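On the `oncs` point: in the NVMe Identify Controller data structure, ONCS (Optional NVM Command Support) is a 16-bit little-endian field at byte offset 520, and bit 2 signals Dataset Management support. A rough sketch of setting and checking that bit on a raw 4096-byte identify buffer follows. The function names are hypothetical, and Propolis may well represent the identify data as a typed struct instead.

```rust
/// Byte offset of the 16-bit ONCS field in the Identify Controller data.
pub const ONCS_OFFSET: usize = 520;

/// ONCS bit 2: the controller supports the Dataset Management command.
pub const ONCS_DSM: u16 = 1 << 2;

/// Read the ONCS field from a raw identify buffer.
fn read_oncs(identify: &[u8; 4096]) -> u16 {
    u16::from_le_bytes([identify[ONCS_OFFSET], identify[ONCS_OFFSET + 1]])
}

/// Set the Dataset Management bit, leaving other ONCS bits untouched.
/// (Hypothetical helper for illustration.)
pub fn set_dsm_supported(identify: &mut [u8; 4096]) {
    let oncs = read_oncs(identify) | ONCS_DSM;
    identify[ONCS_OFFSET..ONCS_OFFSET + 2].copy_from_slice(&oncs.to_le_bytes());
}

/// Report whether the identify data advertises Dataset Management.
pub fn dsm_supported(identify: &[u8; 4096]) -> bool {
    read_oncs(identify) & ONCS_DSM != 0
}
```

A guest reads this bit when it attaches the controller, which is why the advertised value only needs to be stable for the lifetime of a single boot.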