Conversation
b614b37 to
2271fbd
Compare
|
Should we also surface this in changelog? |
done |
| ) | ||
|
|
||
| var imageActions metrics.LabeledTimer | ||
| var ImageActions metrics.LabeledTimer |
There was a problem hiding this comment.
This led me down a bit of dive. I was curious if this variable must be shared, or if it was ok to create a new instance in each package. I noticed this was set through init() (did I mention I hate init()?), and was even considering; if this must be shared (and now exported) would it make sense to use a sync.Once and a ImageActions() function for this instead of an exported var.
But this is when The funny thing I noticed when looking how it was used elsewhere is that in some places we use init()
Lines 10 to 11 in 825635a
But in some we just define the package level variable;
Lines 18 to 23 in 825635a
Both of the above initialise a namespace with the same options (metrics.NewNamespace("engine", "daemon", nil)), which makes me consider that the variable may not have to be shared (so doesn't have to be a singleton)?
Do we know if that's correct? (in that case we may not have to export this, and just instance a namespace in each package / implementation, as long as they share the same name and subsystem? If that's not the case, then I'm wondering if some of the existing uses are .. wrong?
There was a problem hiding this comment.
I think the issue is because we have the same labeled timer name, if you try and register a second one it fails with panic: duplicate metrics collector registration attempted. See here: https://github.com/moby/moby/actions/runs/8250723611/job/22566071811?pr=47555#step:6:1853
There was a problem hiding this comment.
Ah, thanks! Yeah, I guess that makes sense, because in that case we need the same?
I think exporting the var is fine for now; we could add a comment describing why it's exported (to share between the implementations, as they share some code), and perhaps a disclaimer in it "not intended for external consumption for other purposes"
| "github.com/distribution/reference" | ||
| "github.com/docker/docker/api/types/events" | ||
| "github.com/docker/docker/api/types/registry" | ||
| dimages "github.com/docker/docker/daemon/images" |
There was a problem hiding this comment.
A bit related to that, and I didn't really consider that's already the case, but the containerd and images packages are the two implementations we have for handling the image-store. I see there's multiple imports of the images package into the containerd implementation. Not "critical", but it feels a bit odd. Perhaps we should look at what parts are used "in common" and if that code is in the right place (or should be in a "common" package that's shared between the two if duplicating those parts is undesirable)
thaJeztah
left a comment
There was a problem hiding this comment.
LGTM
but perhaps (see my comment above) adding a comment to the exported var wouldn't hurt.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Add Prometheus counters to track image deletion success and failures. Complements the timer metrics added in moby#47555. Introduces two new metrics: - engine_daemon_image_deletes_total - engine_daemon_image_deletes_failed_total{reason} Failure reasons are categorized as: not_found, conflict, permission_denied, invalid_argument, or unknown. Signed-off-by: hiroto.toyoda <hiroto.toyoda@dena.com>
Add Prometheus counters to track image deletion attempts and failures,
following the "total + failed" pattern recommended by Prometheus best
practices and matching moby's existing HealthChecks metrics pattern.
Changes:
- Add engine_daemon_image_deletes_total (counts all deletion attempts)
- Add engine_daemon_image_deletes_failed_total{reason} (counts failures)
- Introduce CategorizeErrorReason() in internal/metrics package
- Supports both containerd and moby errdefs
- Shared across containerd and graphdriver backends
- Categorizes errors as: not_found, conflict, permission_denied,
invalid_argument, or unknown
Complements the timer metrics added in moby#47555.
Signed-off-by: hiroto.toyoda <hiroto.toyoda@dena.com>
- What I did
Sprinkled some prometheus metrics in the containerd backed image service, this implements the same events we have in the graph driver case.
Fixes #45268
Fixes #43855
- How I did it
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)