Add inject_gomaxprocs config (dynamic calculation approach)#9844
Conversation
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds configurable GOMAXPROCS injection: new runtime config and CLI flag, per-pod opt-out annotation, workload-partition detection, CPU-share-based calculation and conditional injection during container creation, plus unit and integration tests and updated shell completions and docs. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Server as ContainerCreate
participant Config as RuntimeConfig
participant Sandbox as Sandbox
participant Spec as OCI_Spec
Client->>Server: Create container request
Server->>Config: Read InjectGOMAXPROCS
alt Inject disabled (<=0)
Server->>Spec: Proceed without injection
else Inject enabled (>0)
Server->>Sandbox: Fetch annotations & cgroup parent
Sandbox-->>Server: Return annotations, cgroup info
Server->>Server: Check skip-gomaxprocs annotation / workload partition
alt Skip present or workload-partitioned
Server->>Spec: Skip injection
else
Server->>Server: Determine burstable vs besteffort from cgroup parent
Server->>Server: calculateGOMAXPROCS(cpuShares, fallback)
Server->>Server: Check existing GOMAXPROCS in default/merged env
alt GOMAXPROCS already set
Server->>Spec: Preserve existing value
else
Server->>Spec: injectGOMAXPROCS(value)
end
end
end
Spec-->>Client: Container ready
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: harche The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9844 +/- ##
===========================================
- Coverage 67.61% 56.16% -11.45%
===========================================
Files 212 212
Lines 29366 29428 +62
===========================================
- Hits 19856 16529 -3327
- Misses 7814 11291 +3477
+ Partials 1696 1608 -88 🚀 New features to boost your workflow:
|
f046fbc to
905f1bf
Compare
|
I believe we will want an annotation to skip injecting GOMAXPROCS for a specific pod, unless it already exists via the pod yaml. |
| } | ||
| } | ||
|
|
||
| log.Infof(ctx, "Injecting GOMAXPROCS=%d", maxProcs) |
There was a problem hiding this comment.
This is probably in a hot path. We are not logging the name of the pod either. Perhaps delete this line?
6e475f7 to
af1fa10
Compare
af1fa10 to
8a56d93
Compare
8a56d93 to
c3d598e
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/config/config.go`:
- Around line 401-410: Add a runtime validation that rejects negative values for
InjectGOMAXPROCS: in the central config validation function (where other config
fields are validated) check that InjectGOMAXPROCS >= 0 and return a clear error
(e.g., "inject_gomaxprocs must be >= 0") if it is negative so misconfiguration
is surfaced at startup rather than silently treated as disabled.
In `@test/inject_gomaxprocs.bats`:
- Around line 17-22: The fixtures in test/inject_gomaxprocs.bats use capitalized
QoS names which don't match the lowercase checks in
setupContainerEnvironmentAndWorkdir and sb.CgroupParent(); update the jq
assignments to set .linux.cgroup_parent to lowercase "burstable-pod_test.slice"
and "besteffort-pod_test.slice" (apply same change to the other
burstable/best-effort fixtures referenced) so the auto-injection and
preservation branches are exercised, and also fix the cpu_shares comment/value
mismatch by making the comment and the jq-ed value consistent (either change the
comment to 8192 or set .linux.resources.cpu_shares to 2048) so the fixture
intent matches its data.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8f69f363-de58-4d39-9e6e-109c1cf9cc09
📒 Files selected for processing (12)
completions/bash/criocompletions/fish/crio.fishcompletions/zsh/_criodocs/crio.8.mdinternal/criocli/criocli.gopkg/annotations/v2/annotations.gopkg/config/config.gopkg/config/template.gopkg/config/workloads.goserver/container_create.goserver/container_create_gomaxprocs_test.gotest/inject_gomaxprocs.bats
c3d598e to
bf4a504
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
server/container_create.go (1)
1031-1057: Consider removing or reducing verbosity of skip logs in the container creation hot path.These
log.Infofcalls (lines 1041, 1043) execute during every container creation. For skip conditions that may fire frequently (e.g., all guaranteed pods, all workload-partitioned pods), this could add noticeable logging overhead. Additionally, the log messages don't include the pod/container name for debugging context.Consider either removing these logs, changing them to
log.Debugf, or adding container/pod identifiers.♻️ Suggested change: Use Debugf with container context
if skipAnnotation == "true" { - log.Infof(ctx, "Skipping GOMAXPROCS injection: %s annotation is set", v2.SkipGoMaxProcs) + log.Debugf(ctx, "Skipping GOMAXPROCS injection for container %s: %s annotation is set", containerConfig.GetMetadata().GetName(), v2.SkipGoMaxProcs) } else if s.config.Workloads.IsWorkloadPartitioned(annotations) { - log.Infof(ctx, "Skipping GOMAXPROCS injection: pod is workload-partitioned") + log.Debugf(ctx, "Skipping GOMAXPROCS injection for container %s: pod is workload-partitioned", containerConfig.GetMetadata().GetName()) } else if cgroupParent := sb.CgroupParent(); strings.Contains(cgroupParent, "burstable") || strings.Contains(cgroupParent, "besteffort") {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/container_create.go` around lines 1031 - 1057, The two log.Infof calls that run on every container create (the GOMAXPROCS skip paths) should be made lower-verbosity and include container context: change them to log.Debugf (or remove) and add identifying fields such as the sandbox/container name and ID from the sandbox object used here (e.g., use sb.Name() and sb.ID() or equivalent) so the messages are only emitted at debug level and contain pod/container context; update the messages in the branch checks that reference v2.SkipGoMaxProcs and s.config.Workloads.IsWorkloadPartitioned accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@server/container_create.go`:
- Around line 1031-1057: The two log.Infof calls that run on every container
create (the GOMAXPROCS skip paths) should be made lower-verbosity and include
container context: change them to log.Debugf (or remove) and add identifying
fields such as the sandbox/container name and ID from the sandbox object used
here (e.g., use sb.Name() and sb.ID() or equivalent) so the messages are only
emitted at debug level and contain pod/container context; update the messages in
the branch checks that reference v2.SkipGoMaxProcs and
s.config.Workloads.IsWorkloadPartitioned accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8a74502d-c594-48ab-a211-955c58dcc904
📒 Files selected for processing (13)
completions/bash/criocompletions/fish/crio.fishcompletions/zsh/_criodocs/crio.8.mddocs/crio.conf.5.mdinternal/criocli/criocli.gopkg/annotations/v2/annotations.gopkg/config/config.gopkg/config/template.gopkg/config/workloads.goserver/container_create.goserver/container_create_gomaxprocs_test.gotest/inject_gomaxprocs.bats
✅ Files skipped from review due to trivial changes (5)
- completions/bash/crio
- docs/crio.8.md
- docs/crio.conf.5.md
- test/inject_gomaxprocs.bats
- server/container_create_gomaxprocs_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/config/workloads.go
- completions/fish/crio.fish
- completions/zsh/_crio
- pkg/config/config.go
- pkg/annotations/v2/annotations.go
- pkg/config/template.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
server/container_create.go (1)
1044-1055: Avoid keying pod QoS detection offcgroupParenttext.
sb.CgroupParent()is just the raw cgroup-manager output, so thestrings.Contains(..., "burstable")/"besteffort"gate bakes path formatting into the feature. This will be much easier to maintain if the QoS decision comes from stable pod data, or at least from a dedicated helper that owns the cgroup-path parsing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/container_create.go` around lines 1044 - 1055, The QoS detection should not rely on raw cgroup path string checks on sb.CgroupParent(); replace the strings.Contains(...) gate with a stable source or a dedicated parser: obtain the pod QoS from the pod metadata (or add a small helper like getPodQoSFromCgroup(cgroupPath string) that centralizes and tests path parsing) and base the branch on that QoS value before calling calculateGOMAXPROCS and injectGOMAXPROCS; update the conditional around sb.CgroupParent(), use the helper or pod-level QoS field, and keep calculateGOMAXPROCS(specgen, ...) and injectGOMAXPROCS(specgen, envs, s.ContainerServer.Config().DefaultEnv, maxProcs) calls unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/criocli/criocli.go`:
- Around line 1338-1343: The flag definition for the cli.Int64Flag named
"inject-gomaxprocs" currently only documents the best-effort fallback, but the
value is also used as a minimum floor for burstable pods; update the Usage
string on the "inject-gomaxprocs" flag (cli.Int64Flag) to explicitly state that
the value is used as both the fallback for best-effort pods and as the minimum
floor for burstable pods, and note that setting 0 disables injection so users
don't misconfigure it.
---
Nitpick comments:
In `@server/container_create.go`:
- Around line 1044-1055: The QoS detection should not rely on raw cgroup path
string checks on sb.CgroupParent(); replace the strings.Contains(...) gate with
a stable source or a dedicated parser: obtain the pod QoS from the pod metadata
(or add a small helper like getPodQoSFromCgroup(cgroupPath string) that
centralizes and tests path parsing) and base the branch on that QoS value before
calling calculateGOMAXPROCS and injectGOMAXPROCS; update the conditional around
sb.CgroupParent(), use the helper or pod-level QoS field, and keep
calculateGOMAXPROCS(specgen, ...) and injectGOMAXPROCS(specgen, envs,
s.ContainerServer.Config().DefaultEnv, maxProcs) calls unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9c3d2e37-8872-45a1-aa96-db34532a4cfd
📒 Files selected for processing (13)
completions/bash/criocompletions/fish/crio.fishcompletions/zsh/_criodocs/crio.8.mddocs/crio.conf.5.mdinternal/criocli/criocli.gopkg/annotations/v2/annotations.gopkg/config/config.gopkg/config/template.gopkg/config/workloads.goserver/container_create.goserver/container_create_gomaxprocs_test.gotest/inject_gomaxprocs.bats
✅ Files skipped from review due to trivial changes (7)
- completions/fish/crio.fish
- completions/zsh/_crio
- completions/bash/crio
- docs/crio.8.md
- docs/crio.conf.5.md
- server/container_create_gomaxprocs_test.go
- pkg/config/config.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/config/workloads.go
- pkg/annotations/v2/annotations.go
- pkg/config/template.go
- test/inject_gomaxprocs.bats
| log.Infof(ctx, "Skipping GOMAXPROCS injection: %s annotation is set", v2.SkipGoMaxProcs) | ||
| } else if s.config.Workloads.IsWorkloadPartitioned(annotations) { | ||
| log.Infof(ctx, "Skipping GOMAXPROCS injection: pod is workload-partitioned") | ||
| } else if cgroupParent := sb.CgroupParent(); strings.Contains(cgroupParent, "burstable") || strings.Contains(cgroupParent, "besteffort") { |
There was a problem hiding this comment.
One more edge case is when cpu limit == cpu request but memory limit != memory request. In that case the pod won't get exclusive cpus via cpu manager static policy and it will go to burstable slice. For Go > 1.25, it looks at cpu limit to adjust GOMAXPROCS. We should check for cpu limit being set and skip in that scenario for burstable pods.
4efa766 to
474093e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/criocli/criocli.go (1)
1338-1343:⚠️ Potential issue | 🟡 MinorMention the burstable floor in the flag help.
The runtime also uses this value as the minimum floor for burstable pods, so the current help text under-describes the behavior.
✏️ Suggested wording
&cli.Int64Flag{ Name: "inject-gomaxprocs", Value: defConf.InjectGOMAXPROCS, - Usage: "Enable GOMAXPROCS injection. Burstable pods auto-calculate from CPU request. This value is the fallback for best-effort pods. 0 to disable.", + Usage: "Enable GOMAXPROCS injection. Burstable pods auto-calculate from CPU request and use this value as a minimum floor; best-effort pods use it directly. 0 to disable.", EnvVars: []string{"CONTAINER_INJECT_GOMAXPROCS"}, },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/criocli/criocli.go` around lines 1338 - 1343, Update the help/usage string for the cli.Int64Flag with Name "inject-gomaxprocs" (currently using defConf.InjectGOMAXPROCS) to mention that this value is also used as the minimum floor for burstable pods (in addition to being the fallback for best-effort pods), so users understand it affects burstable pod calculation; edit the Usage text in the cli.Int64Flag block to clearly state both behaviors and how to disable (0).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/container_create.go`:
- Around line 1044-1053: Reshape the nested ifs in the cgroup check to be more
lint-friendly: combine the cgroupParent retrieval and contains checks into one
if (if cgroupParent := sb.CgroupParent(); strings.Contains(...) ||
strings.Contains(...)) and then inline the linux/resources checks into that
block, declaring shares and cpuQuota right where you set them from
linux.GetResources() (use a single if resources := linux.GetResources();
resources != nil { shares = resources.GetCpuShares(); cpuQuota =
resources.GetCpuQuota() }). This removes unnecessary blank lines and nesting
around sb.CgroupParent(), linux, and resources and should satisfy
golangci-lint/verify while keeping the logic for sb.CgroupParent(), linux,
resources.GetCpuShares, and resources.GetCpuQuota intact.
---
Duplicate comments:
In `@internal/criocli/criocli.go`:
- Around line 1338-1343: Update the help/usage string for the cli.Int64Flag with
Name "inject-gomaxprocs" (currently using defConf.InjectGOMAXPROCS) to mention
that this value is also used as the minimum floor for burstable pods (in
addition to being the fallback for best-effort pods), so users understand it
affects burstable pod calculation; edit the Usage text in the cli.Int64Flag
block to clearly state both behaviors and how to disable (0).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ca2a3a52-8152-437b-9e60-4e2ec17e88e7
📒 Files selected for processing (13)
completions/bash/criocompletions/fish/crio.fishcompletions/zsh/_criodocs/crio.8.mddocs/crio.conf.5.mdinternal/criocli/criocli.gopkg/annotations/v2/annotations.gopkg/config/config.gopkg/config/template.gopkg/config/workloads.goserver/container_create.goserver/container_create_gomaxprocs_test.gotest/inject_gomaxprocs.bats
✅ Files skipped from review due to trivial changes (6)
- completions/bash/crio
- completions/fish/crio.fish
- completions/zsh/_crio
- docs/crio.conf.5.md
- docs/crio.8.md
- server/container_create_gomaxprocs_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/config/workloads.go
- pkg/config/config.go
- test/inject_gomaxprocs.bats
- pkg/config/template.go
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
internal/criocli/criocli.go (1)
1338-1341:⚠️ Potential issue | 🟡 MinorDocument the burstable floor behavior here too.
The runtime uses this value as the minimum floor for burstable pods as well, not only as the best-effort fallback. Leaving the old wording here also keeps the generated man page text wrong.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/criocli/criocli.go` around lines 1338 - 1341, The flag definition for cli.Int64Flag named "inject-gomaxprocs" (using defConf.InjectGOMAXPROCS) has outdated usage text; update the Usage string to state that this value is both the fallback for best-effort pods and the minimum floor applied to burstable pods (and that 0 disables injection). Edit the Usage in the cli.Int64Flag for "inject-gomaxprocs" to mention "minimum floor for burstable pods" alongside the best-effort fallback and keep the "0 to disable" note.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/config/workloads.go`:
- Around line 171-172: The IsWorkloadPartitioned method currently treats any
activation matched by workloadGivenActivationAnnotation as "partitioned", which
incorrectly flags non-partitioning overrides (e.g., cpuquota/cpulimit or
experimental annotations in WorkloadConfig) and suppresses GOMAXPROCS injection;
update IsWorkloadPartitioned to check the real partitioning signal instead of
the broad helper—either inspect the returned WorkloadConfig for an explicit
partitioning field/property (e.g., a Partition or IsPartition boolean) or check
for the specific partition annotation key used in your annotations map; if the
broader behavior was intended, rename workloadGivenActivationAnnotation to
reflect that it matches any activation (e.g., workloadGivenActivationOrOverride)
and add a new helper (e.g., workloadIndicatesPartitioning) that only detects the
partitioning annotation so GOMAXPROCS logic can call the narrow helper.
In `@server/container_create.go`:
- Around line 1044-1059: The current GOMAXPROCS injection decision uses CPU
resources read before NRI can mutate the spec (it reads linux.GetResources()
then calls calculateGOMAXPROCS and injectGOMAXPROCS), so change the logic to
evaluate quota/shares after the final resource mutators: either move the
skip/calculation/inject block to after s.nri.createContainer(...) or re-read the
final OCI resources from specgen (e.g., inspect specgen.Linux.Resources or call
linux.GetResources() again) immediately before calling calculateGOMAXPROCS and
injectGOMAXPROCS so the decision uses the final CPU quota/cpuset values (use
resources.GetCpuQuota()/GetCpuShares()).
---
Duplicate comments:
In `@internal/criocli/criocli.go`:
- Around line 1338-1341: The flag definition for cli.Int64Flag named
"inject-gomaxprocs" (using defConf.InjectGOMAXPROCS) has outdated usage text;
update the Usage string to state that this value is both the fallback for
best-effort pods and the minimum floor applied to burstable pods (and that 0
disables injection). Edit the Usage in the cli.Int64Flag for "inject-gomaxprocs"
to mention "minimum floor for burstable pods" alongside the best-effort fallback
and keep the "0 to disable" note.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9f6135f2-272c-481b-96bb-4d685aba6476
📒 Files selected for processing (13)
completions/bash/criocompletions/fish/crio.fishcompletions/zsh/_criodocs/crio.8.mddocs/crio.conf.5.mdinternal/criocli/criocli.gopkg/annotations/v2/annotations.gopkg/config/config.gopkg/config/template.gopkg/config/workloads.goserver/container_create.goserver/container_create_gomaxprocs_test.gotest/inject_gomaxprocs.bats
✅ Files skipped from review due to trivial changes (5)
- completions/zsh/_crio
- completions/fish/crio.fish
- docs/crio.conf.5.md
- completions/bash/crio
- test/inject_gomaxprocs.bats
🚧 Files skipped from review as they are similar to previous changes (3)
- pkg/annotations/v2/annotations.go
- pkg/config/template.go
- pkg/config/config.go
| func (w Workloads) IsWorkloadPartitioned(sboxAnnotations map[string]string) bool { | ||
| return w.workloadGivenActivationAnnotation(sboxAnnotations) != nil |
There was a problem hiding this comment.
Don't treat every activated workload as partitioned.
workloadGivenActivationAnnotation(...) matches any activated workload, but WorkloadConfig here also represents non-partitioning overrides like cpuquota, cpulimit, and allowed experimental annotations. Using that as a synonym for “workload-partitioned” will suppress GOMAXPROCS injection for unrelated workloads too. Narrow this to the actual partitioning signal, or rename the helper if this broader behavior is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/config/workloads.go` around lines 171 - 172, The IsWorkloadPartitioned
method currently treats any activation matched by
workloadGivenActivationAnnotation as "partitioned", which incorrectly flags
non-partitioning overrides (e.g., cpuquota/cpulimit or experimental annotations
in WorkloadConfig) and suppresses GOMAXPROCS injection; update
IsWorkloadPartitioned to check the real partitioning signal instead of the broad
helper—either inspect the returned WorkloadConfig for an explicit partitioning
field/property (e.g., a Partition or IsPartition boolean) or check for the
specific partition annotation key used in your annotations map; if the broader
behavior was intended, rename workloadGivenActivationAnnotation to reflect that
it matches any activation (e.g., workloadGivenActivationOrOverride) and add a
new helper (e.g., workloadIndicatesPartitioning) that only detects the
partitioning annotation so GOMAXPROCS logic can call the narrow helper.
There was a problem hiding this comment.
The CRI-O Workloads feature is specifically designed for workload partitioning — the activation annotation is target.workload.openshift.io/management, which is exclusively used for cpuset-based workload partitioning in OpenShift. There is no real-world use case where a workload config is activated without partitioning. The naming and behavior are correct for how this feature is actually used.
There was a problem hiding this comment.
@harche, thanks for the clarification — that context makes sense. Since target.workload.openshift.io/management is the sole activation annotation used in practice and it is exclusively tied to cpuset-based workload partitioning in OpenShift, the current IsWorkloadPartitioned implementation is correct and my concern doesn't apply. I'll record this for future reviews.
✏️ Learnings added
Learnt from: harche
Repo: cri-o/cri-o PR: 9844
File: pkg/config/workloads.go:171-172
Timestamp: 2026-03-27T13:03:11.501Z
Learning: In the cri-o/cri-o repository, the CRI-O `Workloads` feature (`pkg/config/workloads.go`) is exclusively used for cpuset-based workload partitioning in OpenShift. The activation annotation is `target.workload.openshift.io/management`. There is no real-world use case where a WorkloadConfig is activated without workload partitioning, so treating any activation matched by `workloadGivenActivationAnnotation` as "partitioned" (as done in `IsWorkloadPartitioned`) is correct and intentional. Do not flag this as incorrect or suggest narrowing the check.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: bitoku
Repo: cri-o/cri-o PR: 9667
File: server/container_create.go:1233-1236
Timestamp: 2025-12-17T13:38:34.646Z
Learning: In the cri-o/cri-o repository, protobuf-generated Get* methods for k8s.io/cri-api types are nil-safe: if the receiver is nil, GetX() returns the zero value instead of panicking. Do not add explicit nil checks before chaining calls on such getters. Apply this guidance to all Go code that uses these generated getters across the codebase.
Learnt from: bitoku
Repo: cri-o/cri-o PR: 9676
File: internal/lib/stats/cgroup_stats_unsupported.go:1-7
Timestamp: 2025-12-18T13:28:24.244Z
Learning: In the cri-o/cri-o repository, for platform-specific types guarded by Go build tags (for example //go:build !linux), implement empty structs for unsupported platforms to permit compilation and clearly indicate the feature is not available rather than mirroring the Linux struct with unpopulated fields. Apply this pattern to all relevant platform-specific files across the codebase (i.e., any file under build-taged sections that should compile on all targets but lacks full implementation for some platforms).
| } else if cgroupParent := sb.CgroupParent(); strings.Contains(cgroupParent, "burstable") || strings.Contains(cgroupParent, "besteffort") { | ||
| var shares, cpuQuota int64 | ||
|
|
||
| if linux != nil { | ||
| if resources := linux.GetResources(); resources != nil { | ||
| shares = resources.GetCpuShares() | ||
| cpuQuota = resources.GetCpuQuota() | ||
| } | ||
| } | ||
|
|
||
| // Skip if the container has a CPU limit (quota > 0). Go 1.25+ auto-detects | ||
| // GOMAXPROCS from the cgroup quota, so injecting would override that. | ||
| if cpuQuota <= 0 { | ||
| maxProcs := calculateGOMAXPROCS(shares, fallbackMaxProcs) | ||
|
|
||
| injectGOMAXPROCS(specgen, envs, s.ContainerServer.Config().DefaultEnv, maxProcs) |
There was a problem hiding this comment.
Make this decision from the final CPU resources.
This runs before s.nri.createContainer(...), but NRI can still rewrite quota/cpuset on specgen later in the same create flow. In that case the container keeps an injected GOMAXPROCS even though Go would have derived the right value from the final cgroup limits. Please evaluate the skip/calculate logic after the last resource mutator, or re-read the final OCI resources before injecting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/container_create.go` around lines 1044 - 1059, The current GOMAXPROCS
injection decision uses CPU resources read before NRI can mutate the spec (it
reads linux.GetResources() then calls calculateGOMAXPROCS and injectGOMAXPROCS),
so change the logic to evaluate quota/shares after the final resource mutators:
either move the skip/calculation/inject block to after
s.nri.createContainer(...) or re-read the final OCI resources from specgen
(e.g., inspect specgen.Linux.Resources or call linux.GetResources() again)
immediately before calling calculateGOMAXPROCS and injectGOMAXPROCS so the
decision uses the final CPU quota/cpuset values (use
resources.GetCpuQuota()/GetCpuShares()).
There was a problem hiding this comment.
This is not a concern. The GOMAXPROCS code reads CPU shares/quota from the linux parameter (*types.LinuxContainerConfig — the CRI request from kubelet), not from specgen (the OCI spec):
cri-o/server/container_create.go
Lines 1047 to 1051 in 474093e
NRI's createContainer operates exclusively on the OCI specgen:
Lines 105 to 114 in 474093e
Since NRI mutates specgen and the GOMAXPROCS calculation reads from the CRI linux config, the ordering is irrelevant — the input data is stable regardless of when NRI runs.
There was a problem hiding this comment.
@harche, you're absolutely right — thank you for the clear explanation. The GOMAXPROCS decision reads shares/cpuQuota from the linux *types.LinuxContainerConfig parameter (the kubelet CRI request), which is immutable and cannot be touched by NRI. NRI only mutates specgen (the OCI spec). The two data sources are independent, so the ordering relative to s.nri.createContainer(...) is irrelevant. My concern was mistaken.
✏️ Learnings added
Learnt from: harche
Repo: cri-o/cri-o PR: 9844
File: server/container_create.go:1044-1059
Timestamp: 2026-03-27T13:05:07.245Z
Learning: In cri-o/cri-o server/container_create.go, the GOMAXPROCS injection logic in `setupContainerEnvironmentAndWorkdir` reads CPU shares/quota from the `linux *types.LinuxContainerConfig` parameter (the immutable CRI request from kubelet), NOT from `specgen`. NRI's `createContainer` only mutates `specgen` (the OCI spec), so NRI cannot affect the CPU resource values used for the GOMAXPROCS calculation. Do not raise concerns about NRI ordering invalidating the GOMAXPROCS decision.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: bitoku
Repo: cri-o/cri-o PR: 9667
File: server/container_create.go:1233-1236
Timestamp: 2025-12-17T13:38:34.646Z
Learning: In the cri-o/cri-o repository, protobuf-generated Get* methods for k8s.io/cri-api types are nil-safe: if the receiver is nil, GetX() returns the zero value instead of panicking. Do not add explicit nil checks before chaining calls on such getters. Apply this guidance to all Go code that uses these generated getters across the codebase.
Learnt from: bitoku
Repo: cri-o/cri-o PR: 9676
File: internal/lib/stats/cgroup_stats_unsupported.go:1-7
Timestamp: 2025-12-18T13:28:24.244Z
Learning: In the cri-o/cri-o repository, for platform-specific types guarded by Go build tags (for example //go:build !linux), implement empty structs for unsupported platforms to permit compilation and clearly indicate the feature is not available rather than mirroring the Linux struct with unpopulated fields. Apply this pattern to all relevant platform-specific files across the codebase (i.e., any file under build-taged sections that should compile on all targets but lacks full implementation for some platforms).
474093e to
d6f20f6
Compare
d6f20f6 to
b1109f2
Compare
Signed-off-by: Harshal Patil <12152047+harche@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
b1109f2 to
49a75db
Compare
|
@harche: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
failed runs are not due to the changes in this PR. |
|
/hold until we test this using machine config to ensure end to end flow is covered. |
|
Closing in favor of #9860 |
Adds a new CRI-O config option
inject_gomaxprocsthat injects theGOMAXPROCS environment variable into burstable and best-effort pod
containers. For burstable pods, the value is auto-calculated from
the CPU request (via cpu_shares). The configured value acts as a
minimum floor. Guaranteed pods, workload-partitioned pods, and pods
with GOMAXPROCS already set are skipped.
Also adds a
skip-gomaxprocs.crio.iopod annotation for per-pod opt-out.Jira: OCPNODE-4221
Fixes: OCPBUGS-64621 — Containerized OpenShift components showing significant CPU spikes on systems with high core count
Summary by CodeRabbit
New Features
Documentation
Tests