Skip to content

improve slog component setting#1124

Merged
leftwo merged 1 commit into
masterfrom
alan/each-log-to-its-own
Apr 24, 2026
Merged

improve slog component setting#1124
leftwo merged 1 commit into
masterfrom
alan/each-log-to-its-own

Conversation

@leftwo

@leftwo leftwo commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

In slog, you can set multiple values for the same key, and slog will just append them on to the end of the log record.
However, when the log is printed, we only show the first key and value and ignore the rest.

In propolis we had pretty early set:

log.new(slog::o!("component" => "vm_state_driver"));

We pass that log down to a bunch of other places, some of which we try to set component again:

slog::o!("component" => format!("crucible-{cru_id}")),

These later settings get dropped, so we end up with most of the logs having the component vm_state_driver.

Improving things a bit with this change, we can hold on to the base log and pass that down to children whom
will set their own component while sending the vm_state_driver log along for the places that are children
of it. At least that was my best guess for where to split things out. There could be more here to do, but this
at least got the problem I was trying to solve (crucible logs) and a few others as well

new component types that I did not see before:

21:59:33.211Z INFO propolis-server (vcpu_tasks): Starting vCPU thread vcpu = 0
21:59:33.205Z INFO propolis-server (attestation-server): attestation server created (sled-agent addr [fd9a:daae:107d:104::1]:12345 
21:59:33.325Z INFO propolis-server (oximeter-producer): listening
21:59:33.326Z INFO propolis-server (request_queue): enqueued external request
21:59:33.325Z INFO propolis-server (serial task): Entered serial task

These were all hidden behind the vm_state_driver

We only get the first slog "component =" setting.
Pass the base log a bit further down the call stack so
things that want to set their own component will get a chance to.
@leftwo leftwo requested a review from iximeow April 23, 2026 23:20
@leftwo

leftwo commented Apr 23, 2026

Copy link
Copy Markdown
Contributor Author

I've attached a sample instance boot logs from before (where it's all vm_state_driver) and after

before.log
after.log

@iximeow iximeow left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well that is intensely awkward, thanks for chasing this down!

@leftwo leftwo merged commit 96e772a into master Apr 24, 2026
14 checks passed
@leftwo leftwo deleted the alan/each-log-to-its-own branch April 24, 2026 00:12
jmpesp added a commit to oxidecomputer/omicron that referenced this pull request May 26, 2026
Update Crucible from `7103cd3a` to `bd9a0e2a`, picking up the following
PRs:

- Use an explicit rev for oxidecomputer git deps
(oxidecomputer/crucible#1936)
- Add Clone and Deserialize to VolumeInfo et al
(oxidecomputer/crucible#1935)
- Update omicron/oximeter (oxidecomputer/crucible#1933)
- [meta] update to drift 0.1.4 (oxidecomputer/crucible#1932)
- Don't log if there is nothing to log (oxidecomputer/crucible#1930)
- Add VolumeInfo (oxidecomputer/crucible#1928)
- Remove bonus Volume layer (oxidecomputer/crucible#1927)
- Add session and client id to panic messages
(oxidecomputer/crucible#1926)
- [crucible-agent-types] migrate to RFD 619 pattern
(oxidecomputer/crucible#1899)
- Background read-only region creation (oxidecomputer/crucible#1919)
- [crucible-downstairs-repair] switch to RFD 619 pattern
(oxidecomputer/crucible#1901)
- [crucible-pantry] switch to RFD 619 pattern
(oxidecomputer/crucible#1900)
- Use separate in-memory types (oxidecomputer/crucible#1913)
- Remove old field from dtrace action script
(oxidecomputer/crucible#1917)
- Retry data writes that return an IO error
(oxidecomputer/crucible#1915)
- Bump dropshot to 0.17.0 (oxidecomputer/crucible#1909)
- Reject snapshot requests when read-only (oxidecomputer/crucible#1914)
- update ringbuf method, fix clippy lint (oxidecomputer/crucible#1904)
- bump vergen-v9 version too (oxidecomputer/crucible#1903)
- update dropshot to 0.16.7, dropshot-api-manager to 0.5.2
(oxidecomputer/crucible#1851)
- perf-vol.d updates (oxidecomputer/crucible#1898)
- upgrade progenitor to 0.13, reqwest to 0.13
(oxidecomputer/crucible#1854)
- Remove cargo nextest from github workflow, out of space
(oxidecomputer/crucible#1846)
- Add a test for VCR serialize/deserialize (oxidecomputer/crucible#1843)

Update Propolis from `bc489ddf` to `58ab73bd`, picking up the following
PRs:

- Bump crucible to latest, update Omicron, use explicit revs
(oxidecomputer/propolis#1141)
- Add project and silo ids to VM attestation
(oxidecomputer/propolis#1114)
- Update escargot (oxidecomputer/propolis#1139)
- Prefix shebang and mark D scripts as executable
(oxidecomputer/propolis#1140)
- Fix error in propolis-server README (oxidecomputer/propolis#1138)
- [meta] update to drift 0.1.4 (oxidecomputer/propolis#1137)
- Fix Intel CPUID leaf 4 cache topology for SMT
(oxidecomputer/propolis#1002)
- support NVMe Deallocate (oxidecomputer/propolis#1105)
- viona: do not lose used/avail indices (oxidecomputer/propolis#1135)
- viona: multiqueue device should stay multiqueue across migration
(oxidecomputer/propolis#1121)
- Bump crucible rev to latest (oxidecomputer/propolis#1132)
- expand zerocopy IntoBytes/FromByes use in guest memory accesses
(oxidecomputer/propolis#1130)
- dropshot-api-manager 0.7.1 (oxidecomputer/propolis#1129)
- improve slog component setting (oxidecomputer/propolis#1124)
- wait for viona Poller to run before declaring device running
(oxidecomputer/propolis#1118)
- virtio: tolerate importing queues with adjusted size
(oxidecomputer/propolis#1117)
- Run viona unit tests in CI (oxidecomputer/propolis#1120)
- feature gate Crucible-specific boot digest code
(oxidecomputer/propolis#1119)

Also:

- ran `cargo update -p vergen`

- removed the `reqwest012` dependency

- removed `reqwest012_client` from Nexus

- ran `cargo hakari generate` and `cargo hakari manage-deps`

- replace use of `ProgenitorOperationRetry` with
`retry_operation_while_indefinitely`

- during the region replacement drive saga, consume the new `VolumeInfo`
from Propolis and use that to determine when to consider a replacement
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants