fix: error checking and prune bool check upon HasPruned and removed some dead code by Skarlso · Pull Request #1897 · open-component-model/open-component-model

Skarlso · 2026-03-04T16:34:16Z

What this PR does / why we need it

When all objects were deleted at the same time, there was an error check to see if the specific error was a deletion error or not.

This was checked with errors.Is. errors.Is however, is also checking for the message being there and it was comparing with an empty error. Another options would be to implement Is on the error, but that's literally just reimplementing As.

Here, we want to make sure that whatever is coming back is the Type of the error and not Is that error.

Further, the HasPruned's check was actually inverted. We wanted to do the thing when it actually HasPruned. And not when it hasn't.

Lastely, removed some dead code and added

	if component.Status.Component.RepositorySpec == nil {

which is actually a fix for a masked NIL panic that happened further below.

Which issue(s) this PR fixes

Testing

How to test the changes

Verification

I have tested the changes locally by running ocm

Summary by CodeRabbit

Bug Fixes
- Improved detection and messaging for unavailable resources.
- Added nil-safety checks to prevent failures when resource specifications are missing.
Improvements
- Better reconciliation progress signaling to reduce unnecessary requeues during cleanup.
- More precise error handling to distinguish transient vs. final unavailability, improving stability and observability.

coderabbitai · 2026-03-04T16:34:39Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Reconciler error handling switched from errors.Is to errors.As for NotReadyError/DeletionError; Deployer controller APIs Untrack and pruneWithApplySet now return (bool, error) to signal async completion; resource controller adds a nil-check for RepositorySpec and returns not-ready when absent.

Changes

Cohort / File(s)	Summary
Component & Resource Error Handling `kubernetes/controller/internal/controller/component/component_controller.go`, `kubernetes/controller/internal/controller/resource/resource_controller.go`	Replace `errors.Is` checks with `errors.As` for `util.NotReadyError` / `util.DeletionError`; update log messages to generic "repository is not available" / "resource is not available" and return `ctrl.Result{}, nil` on those errors.
Resource RepositorySpec Guard `kubernetes/controller/internal/controller/resource/resource_controller.go`	Add nil-check for `component.Status.Component.RepositorySpec`; mark resource NotReady with a descriptive message and return to avoid using a nil repository spec.
Deployer Reconciliation API & Flow `kubernetes/controller/internal/controller/deployer/deployer_controller.go`	Change `Untrack` and `pruneWithApplySet` signatures to return `(bool, error)` indicating completion; adjust reconcile/deletion flow to act on the boolean (requeue when not done), use 1s requeue for in-progress work, and update logging and error handling accordingly.
Minor Imports / Logging `kubernetes/controller/internal/controller/deployer/deployer_controller.go`	Add `time` import; update log messages to reflect boolean-based completion semantics and progress signaling.

Sequence Diagram(s)

sequenceDiagram
    participant Reconciler as Reconciler
    participant Deployer as DeployerController
    participant WatchMgr as WatchManager
    participant K8sAPI as KubernetesAPI

    Reconciler->>Deployer: reconcile deletion (call pruneWithApplySet)
    Deployer->>WatchMgr: stop watches / send stop events
    WatchMgr->>WatchMgr: attempt async teardown
    WatchMgr-->>Deployer: return in-progress (done=false) or done=true
    Deployer-->>Reconciler: return (done, error)
    alt done=false
        Reconciler->>K8sAPI: requeue after 1s to verify completion
    else done=true
        Reconciler->>K8sAPI: remove finalizers / continue cleanup
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I nibble at bugs with a twitch and a cheer,

I swap checks for As so wrapped errors appear,
Booleans now signal when pruning is done,
Finalizers finish their slow little run,
A rabbit-approved patch—clean and sincere.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is only partially related to the main changes; it references 'prune bool check upon HasPruned' but the primary changes involve error-type detection refactoring and completion-tracking logic in multiple files.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@kubernetes/controller/internal/controller/resource/resource_controller.go`:
- Around line 266-270: Fix the typo in the status message passed to
status.MarkNotReady: change "repository spec in component status most not be
nil" to "repository spec in component status must not be nil" where
component.Status.Component.RepositorySpec is checked and
status.MarkNotReady(resource, v1alpha1.ResourceIsNotAvailable, ...) is called
(in the same block that returns fmt.Errorf about repository spec being nil for
component.Name).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b0bac26-a2ca-4d2b-aeab-e1af395211d1

📥 Commits

Reviewing files that changed from the base of the PR and between 17ed3e2 and dd8ed44.

📒 Files selected for processing (3)

kubernetes/controller/internal/controller/component/component_controller.go
kubernetes/controller/internal/controller/deployer/deployer_controller.go
kubernetes/controller/internal/controller/resource/resource_controller.go

…ome dead code On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

kubernetes/controller/internal/controller/deployer/deployer_controller_test.go (2)

353-362: Same concerns as the earlier cleanup block.

This cleanup has the same issues: cluster-wide deployer listing and silently ignored update errors. Apply the same namespace scoping and error logging fix as recommended for lines 64-73.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_test.go`
around lines 353 - 362, The cleanup is listing all Deployer resources
cluster-wide and ignoring update errors; change the List call to scope to the
test namespace (use client.ListOptions{Namespace: <test namespace>} or
client.InNamespace(ctxNamespace) equivalent) so v1alpha1.DeployerList named
deployers only returns namespace resources, and check/handle the error returned
by k8sClient.Update(ctx, d) instead of discarding it—log or fail the test on
error; keep using controllerutil.RemoveFinalizer for applySetPruneFinalizer and
resourceWatchFinalizer and continue calling test.DeleteObject(ctx, k8sClient, d)
after ensuring the update succeeded.

64-73: Scope cleanup to test namespace and handle update errors.

Two concerns with this cleanup logic:

k8sClient.List(ctx, deployers) lists all deployers cluster-wide, which could interfere with parallel test runs by cleaning up deployers from other tests.
_ = k8sClient.Update(ctx, d) silently ignores update errors. If the update fails (e.g., conflict), the finalizers remain and the subsequent delete may hang indefinitely.

♻️ Proposed fix to scope listing and log update errors

 By("cleaning up deployers")
 deployers := &v1alpha1.DeployerList{}
-Expect(k8sClient.List(ctx, deployers)).To(Succeed())
+Expect(k8sClient.List(ctx, deployers, client.InNamespace(namespace.GetName()))).To(Succeed())
 for i := range deployers.Items {
     d := &deployers.Items[i]
     controllerutil.RemoveFinalizer(d, applySetPruneFinalizer)
     controllerutil.RemoveFinalizer(d, resourceWatchFinalizer)
-    _ = k8sClient.Update(ctx, d)
+    if err := k8sClient.Update(ctx, d); err != nil {
+        GinkgoLogr.Error(err, "failed to remove finalizers", "deployer", d.Name)
+    }
     test.DeleteObject(ctx, k8sClient, d)
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_test.go`
around lines 64 - 73, Scope the Deployer cleanup to the test namespace and stop
ignoring update errors: when listing deployers, pass the namespace option (e.g.,
client.InNamespace(testNamespace) or the existing namespace variable) to
k8sClient.List(ctx, deployers, ...); when removing finalizers and calling
k8sClient.Update(ctx, d) capture the error and handle retries/conflicts instead
of silencing it — for example assert Expect(err).To(Succeed()) or on
apierrors.IsConflict fetch the latest object, remove the finalizers again and
retry the Update before calling test.DeleteObject; keep references to deployers,
controllerutil.RemoveFinalizer, applySetPruneFinalizer, resourceWatchFinalizer,
k8sClient.Update, and test.DeleteObject to locate the code.

kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go (1)

188-190: Scope teardown to test-owned objects only.

At Line 189 and Line 201, the test lists all Deployers/RGDs and then deletes everything found. This can cause cross-test interference in shared environments.

Suggested direction

- deployers := &v1alpha1.DeployerList{}
- Expect(k8sClient.List(ctx, deployers)).To(Succeed())
- for i := range deployers.Items { ... }
+ d := &v1alpha1.Deployer{}
+ err := k8sClient.Get(ctx, client.ObjectKey{Name: deployerObjName}, d)
+ if err == nil {
+   // remove finalizers + update + delete
+ } else {
+   Expect(errors.IsNotFound(err)).To(BeTrue())
+ }

- RGDs := &unstructured.UnstructuredList{}
- RGDs.SetGroupVersionKind(listGVK)
- Expect(k8sClient.List(ctx, RGDs)).To(Succeed())
- for i := range RGDs.Items {
-   test.DeleteObject(ctx, k8sClient, &RGDs.Items[i])
- }
+ rgdToDelete := &unstructured.Unstructured{}
+ rgdToDelete.SetGroupVersionKind(gvk)
+ rgdToDelete.SetName(rgdObj.GetName())
+ err = k8sClient.Get(ctx, client.ObjectKeyFromObject(rgdToDelete), rgdToDelete)
+ if err == nil {
+   test.DeleteObject(ctx, k8sClient, rgdToDelete)
+ } else {
+   Expect(errors.IsNotFound(err)).To(BeTrue())
+ }

Also applies to: 199-204

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go`
around lines 188 - 190, The test currently lists and deletes all Deployers/RGDs
via k8sClient.List into the deployers/rgds slices which can remove resources
from other tests; change the List calls to restrict results to test-owned
objects only by passing appropriate client.ListOptions (e.g.,
client.InNamespace(...) if tests use a dedicated namespace, or
client.MatchingLabels{"test-run": testID} if using a per-test label) and/or
filter the returned items by checking a specific label/ownerReference before
deleting; update the code around the deployers variable and the rgds list
handling (the k8sClient.List calls and subsequent deletion loop) to only operate
on items that have the test label or the expected ownerReference.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go`:
- Around line 190-195: The loop over deployers.Items currently removes
finalizers with controllerutil.RemoveFinalizer and swallows k8sClient.Update
errors; change it to detect whether RemoveFinalizer actually removed a finalizer
(e.g., by checking controllerutil.ContainsFinalizer before/after or comparing
returned bool), only call k8sClient.Update(ctx, d) when a finalizer was removed,
and replace the silent assignment `_ = k8sClient.Update(...)` with a Ginkgo
assertion (Expect(err).ToNot(HaveOccurred()) or similar) to ensure the update
succeeded before calling test.DeleteObject(ctx, k8sClient, d); keep references
to deployers.Items, controllerutil.RemoveFinalizer, k8sClient.Update, and
test.DeleteObject to locate the change.

---

Nitpick comments:
In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go`:
- Around line 188-190: The test currently lists and deletes all Deployers/RGDs
via k8sClient.List into the deployers/rgds slices which can remove resources
from other tests; change the List calls to restrict results to test-owned
objects only by passing appropriate client.ListOptions (e.g.,
client.InNamespace(...) if tests use a dedicated namespace, or
client.MatchingLabels{"test-run": testID} if using a per-test label) and/or
filter the returned items by checking a specific label/ownerReference before
deleting; update the code around the deployers variable and the rgds list
handling (the k8sClient.List calls and subsequent deletion loop) to only operate
on items that have the test label or the expected ownerReference.

In
`@kubernetes/controller/internal/controller/deployer/deployer_controller_test.go`:
- Around line 353-362: The cleanup is listing all Deployer resources
cluster-wide and ignoring update errors; change the List call to scope to the
test namespace (use client.ListOptions{Namespace: <test namespace>} or
client.InNamespace(ctxNamespace) equivalent) so v1alpha1.DeployerList named
deployers only returns namespace resources, and check/handle the error returned
by k8sClient.Update(ctx, d) instead of discarding it—log or fail the test on
error; keep using controllerutil.RemoveFinalizer for applySetPruneFinalizer and
resourceWatchFinalizer and continue calling test.DeleteObject(ctx, k8sClient, d)
after ensuring the update succeeded.
- Around line 64-73: Scope the Deployer cleanup to the test namespace and stop
ignoring update errors: when listing deployers, pass the namespace option (e.g.,
client.InNamespace(testNamespace) or the existing namespace variable) to
k8sClient.List(ctx, deployers, ...); when removing finalizers and calling
k8sClient.Update(ctx, d) capture the error and handle retries/conflicts instead
of silencing it — for example assert Expect(err).To(Succeed()) or on
apierrors.IsConflict fetch the latest object, remove the finalizers again and
retry the Update before calling test.DeleteObject; keep references to deployers,
controllerutil.RemoveFinalizer, applySetPruneFinalizer, resourceWatchFinalizer,
k8sClient.Update, and test.DeleteObject to locate the code.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5fb6f938-3e44-4980-998d-3eaba2ab0e06

📥 Commits

Reviewing files that changed from the base of the PR and between 03cc25e and 6bb9f10.

📒 Files selected for processing (2)

kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go
kubernetes/controller/internal/controller/deployer/deployer_controller_test.go

…ardown and we requeue if there are still finalizers to process On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@kubernetes/controller/internal/controller/deployer/deployer_controller.go`:
- Around line 370-379: The finalizer removal happens even when cleanup returns
an error; change the control flow in the reconciler so that
controllerutil.RemoveFinalizer is only called when the cleanup call succeeded
(no error) and, for pruneWithApplySet, when done==true; for the apply-set branch
(pruneWithApplySet) check err==nil and done==true before logging success and
removing applySetPruneFinalizer, and for the resource watch branch call Untrack
and only remove resourceWatchFinalizer when Untrack returns nil; if cleanup
returned an error append to errs and return/requeue instead of falling through
to RemoveFinalizer.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3cd7301f-471a-4974-953b-e2185dd185e2

📥 Commits

Reviewing files that changed from the base of the PR and between 45e523f and c393395.

📒 Files selected for processing (1)

kubernetes/controller/internal/controller/deployer/deployer_controller.go

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

jakobmoellerdev

how are we not adjusting any tests here?

Skarlso · 2026-03-06T07:39:54Z

how are we not adjusting any tests here?

This particular segment wasn't covered. I can add tests ofc.

why has this not been followed up on?

I don't understand the question, sorry. :D Followed up by what...?

jakobmoellerdev · 2026-03-06T07:42:40Z

This was an original comment on #1897 (comment)

Skarlso · 2026-03-09T07:22:38Z

@jakobmoellerdev Ah, okay... Sorry, so what's my action here? Should I improve upon this section some more here?

Skarlso · 2026-03-09T11:49:36Z

Talked about this. Going to see what's up with Kro and adjust accordingly.

Skarlso · 2026-03-09T14:54:42Z

Okay, so I read through the entire path section and our stuff. So, Jakob you know this better probably. :D But as far as I understand, Kro does a two step approach where it Loops on Delete until finalizers can be removed, then prunes orphaned resources which is only a single loop because it doesn't care about finalizers anymore.

As far as I understand OUR code, we do the same, but a bit differently, because we don't have topological ordering on delete flow. Adopting that would mean even more code. :)

Our prune with it's looptyloop is actually fine, because it mimics Kro's delete loop prune cycle just all in one I guess. Which works.

The only fix is really just that the logic was inverted. The rest, I believe, is okay. :)

jakobmoellerdev

discussed again that this needs a test

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

…el into fix-stuck-rgd

matthiasbruns

duration looks really odd with the ns display of 10 * 60_000_000_000

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

Skarlso · 2026-03-10T15:16:45Z

@jakobmoellerdev I tried every iteration of using https://pkg.go.dev/errors#AsType and it's ugly as HELL! :D

It's only useful if we use the actual value, otherwise it's several if statements because it returns the value and the ok I can't even use it in a switch statement.

The only way that was "slightly" better was something like this:

_, notReadyOk := errors.AsType[*util.NotReadyError](err)
_, deletionOk := errors.AsType[*util.DeletionError](err)
if notReadyOk || deletionOk {...

But even this is harder to read than the two errors.As's IMHO.

jakobmoellerdev · 2026-03-10T15:52:51Z

@jakobmoellerdev I tried every iteration of using https://pkg.go.dev/errors#AsType and it's ugly as HELL! :D

It's only useful if we use the actual value, otherwise it's several if statements because it returns the value and the ok I can't even use it in a switch statement.

The only way that was "slightly" better was something like this:
_, notReadyOk := errors.AsType[*util.NotReadyError](err)
_, deletionOk := errors.AsType[*util.DeletionError](err)
if notReadyOk || deletionOk {...
But even this is harder to read than the two errors.As's IMHO.

I see, sad to hear. but at least you bumped go for ocm, thanks <3 :D

…ome dead code (open-component-model#1897)  #### What this PR does / why we need it When all objects were deleted at the same time, there was an error check to see if the specific error was a deletion error or not. This was checked with errors.Is. `errors.Is` however, is also checking for the message being there and it was comparing with an empty error. Another options would be to implement `Is` on the error, but that's literally just reimplementing As. Here, we want to make sure that whatever is coming back is the Type of the error and not Is that error. Further, the HasPruned's check was actually inverted. We wanted to do the thing when it actually HasPruned. And not when it hasn't. Lastely, removed some dead code and added ```go if component.Status.Component.RepositorySpec == nil { ``` which is actually a fix for a masked NIL panic that happened further below. #### Which issue(s) this PR fixes  #### Testing ##### How to test the changes  ##### Verification - [ ] I have tested the changes locally by running `ocm`  ## Summary by CodeRabbit * **Bug Fixes** * Improved detection and messaging for unavailable resources. * Added nil-safety checks to prevent failures when resource specifications are missing. * **Improvements** * Better reconciliation progress signaling to reduce unnecessary requeues during cleanup. * More precise error handling to distinguish transient vs. final unavailability, improving stability and observability.  --------- Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com> Co-authored-by: Matthias Bruns <github@matthiasbruns.com> Co-authored-by: Jakob Möller <jakob.moeller@sap.com>

…ome dead code (open-component-model#1897)  #### What this PR does / why we need it When all objects were deleted at the same time, there was an error check to see if the specific error was a deletion error or not. This was checked with errors.Is. `errors.Is` however, is also checking for the message being there and it was comparing with an empty error. Another options would be to implement `Is` on the error, but that's literally just reimplementing As. Here, we want to make sure that whatever is coming back is the Type of the error and not Is that error. Further, the HasPruned's check was actually inverted. We wanted to do the thing when it actually HasPruned. And not when it hasn't. Lastely, removed some dead code and added ```go if component.Status.Component.RepositorySpec == nil { ``` which is actually a fix for a masked NIL panic that happened further below. #### Which issue(s) this PR fixes  #### Testing ##### How to test the changes  ##### Verification - [ ] I have tested the changes locally by running `ocm`  ## Summary by CodeRabbit * **Bug Fixes** * Improved detection and messaging for unavailable resources. * Added nil-safety checks to prevent failures when resource specifications are missing. * **Improvements** * Better reconciliation progress signaling to reduce unnecessary requeues during cleanup. * More precise error handling to distinguish transient vs. final unavailability, improving stability and observability.  --------- Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com> Co-authored-by: Matthias Bruns <github@matthiasbruns.com> Co-authored-by: Jakob Möller <jakob.moeller@sap.com> Signed-off-by: Gerald Morrison (SAP) <gerald.morrison@sap.com>

Skarlso requested a review from a team as a code owner March 4, 2026 16:34

github-actions Bot added the kind/bugfix Bug label Mar 4, 2026

github-actions Bot added the size/s Small label Mar 4, 2026

coderabbitai Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread kubernetes/controller/internal/controller/resource/resource_controller.go

fix: error checking and prune bool check upon HasPruned and removed s…

03cc25e

…ome dead code On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

Skarlso force-pushed the fix-stuck-rgd branch from dd8ed44 to 03cc25e Compare March 4, 2026 16:40

Skarlso added 2 commits March 4, 2026 18:00

fix the test

586966c

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

fix the unit test for the deployer

6bb9f10

On-behalf-of: Gergely Brautigam <gergely.brautigam@sap.com> Signed-off-by: Gergely Brautigam <182850+Skarlso@users.noreply.github.com>

Skarlso commented Mar 4, 2026

View reviewed changes

Comment thread kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go

coderabbitai Bot reviewed Mar 4, 2026

View reviewed changes

Comment thread kubernetes/controller/internal/controller/deployer/deployer_controller_rgd_test.go Outdated

Skarlso commented Mar 4, 2026

View reviewed changes