fix: send upgrade action to units before adding it to bkgActions #9634
Merged
pkoutsovasilis merged 5 commits into elastic:main on Sep 10, 2025
Contributor
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
cmacknz (Member) reviewed, Sep 9, 2025
The approach looks good to me as a way to fix this without totally rewriting everything. A couple of small comments.
Member
After your clarifications, my only remaining comments are suggestions for readability or documentation, which I think are necessary. If you address these and get approval from someone else, go ahead and merge; don't feel the need to wait for me to review again.
Contributor
💛 Build succeeded, but was flaky
Contributor
@Mergifyio backport 8.18 8.19 9.0 9.1
Contributor
✅ Backports have been created
mergify bot pushed a commit that referenced this pull request, Sep 10, 2025
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion
(cherry picked from commit bb98f2a)
# Conflicts:
# internal/pkg/agent/application/coordinator/coordinator.go
pkoutsovasilis added a commit that referenced this pull request, Sep 10, 2025
…ng it to bkgActions (#9859)
* fix: send upgrade action to units before adding it to bkgActions (#9634)
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion
(cherry picked from commit bb98f2a)
# Conflicts:
# internal/pkg/agent/application/coordinator/coordinator.go
* fix: resolve conflicts
---------
Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request, Sep 10, 2025
…g it to bkgActions (#9861)
* fix: send upgrade action to units before adding it to bkgActions (#9634)
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion
(cherry picked from commit bb98f2a)
# Conflicts:
# internal/pkg/agent/application/coordinator/coordinator.go
* fix: resolve conflicts
---------
Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request, Sep 10, 2025
…ng it to bkgActions (#9860)
* fix: send upgrade action to units before adding it to bkgActions (#9634)
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion
(cherry picked from commit bb98f2a)
# Conflicts:
# internal/pkg/agent/application/coordinator/coordinator.go
* fix: resolve conflicts
---------
Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
pkoutsovasilis added a commit that referenced this pull request, Sep 10, 2025
…g it to bkgActions (#9862)
* fix: send upgrade action to units before adding it to bkgActions (#9634)
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion
(cherry picked from commit bb98f2a)
# Conflicts:
# internal/pkg/agent/application/coordinator/coordinator.go
* fix: resolve conflicts
---------
Co-authored-by: Panos Koutsovasilis <panos.koutsovasilis@elastic.co>
intxgo pushed a commit to intxgo/elastic-agent that referenced this pull request, Sep 24, 2025
…stic#9634)
* fix: refactor coordinator Upgrade to use opts and introduce pre-upgrade callback to properly handle errs and upgrade details
* ci: add unit-tests
* doc: add changelog fragment
* fix: rename notifyUnitsOfProxiedAction to notifyUnitsOfProxiedActionFn
* doc: add comment for notifyUnitsOfProxiedActionFn unit-test assertion

What does this PR do?
This PR refactors the Coordinator.Upgrade API to use functional options (UpgradeOpt) instead of long argument lists, and introduces a preUpgradeCallback.
Specifically:
… Coordinator.Upgrade to make the flow more consistent (e.g. first verifying that the coordinator is not already upgrading before continuing).
All business logic changes are captured here: bcdba7b (+90 -29 lines changed)
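The functional-options pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual elastic-agent code: only the names UpgradeOpt, preUpgradeCallback, and Coordinator.Upgrade come from the description, while WithSkipVerify, WithPreUpgradeCallback, ErrAlreadyUpgrading, the struct fields, and the callback signature are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// upgradeOpts holds the optional settings for one upgrade request.
// The fields are hypothetical and chosen only for illustration.
type upgradeOpts struct {
	skipVerify         bool
	preUpgradeCallback func() error
}

// UpgradeOpt mutates upgradeOpts; callers pass zero or more of these
// instead of a long positional argument list.
type UpgradeOpt func(*upgradeOpts)

// WithSkipVerify is a hypothetical option used to show how a former
// positional argument could become an option.
func WithSkipVerify(skip bool) UpgradeOpt {
	return func(o *upgradeOpts) { o.skipVerify = skip }
}

// WithPreUpgradeCallback registers a function that runs before the upgrade
// starts. The func() error signature is an assumption.
func WithPreUpgradeCallback(cb func() error) UpgradeOpt {
	return func(o *upgradeOpts) { o.preUpgradeCallback = cb }
}

// ErrAlreadyUpgrading is a hypothetical sentinel error.
var ErrAlreadyUpgrading = errors.New("upgrade already in progress")

// Coordinator is a stripped-down stand-in; it is not safe for concurrent use.
type Coordinator struct {
	upgrading bool
}

// Upgrade first checks that no upgrade is running, then applies the options,
// then runs the pre-upgrade callback and returns its error, if any.
func (c *Coordinator) Upgrade(version string, opts ...UpgradeOpt) error {
	if c.upgrading {
		return ErrAlreadyUpgrading
	}
	var o upgradeOpts
	for _, opt := range opts {
		opt(&o)
	}
	if o.preUpgradeCallback != nil {
		if err := o.preUpgradeCallback(); err != nil {
			return fmt.Errorf("pre-upgrade callback failed: %w", err)
		}
	}
	c.upgrading = true
	defer func() { c.upgrading = false }()
	// The actual upgrade work is omitted from this sketch.
	return nil
}

func main() {
	c := &Coordinator{}
	err := c.Upgrade("1.2.3",
		WithSkipVerify(true),
		WithPreUpgradeCallback(func() error {
			return errors.New("proxying action to unit failed")
		}),
	)
	fmt.Println("upgrade error:", err)
}
```

With this shape, adding a new optional behaviour means adding one option constructor rather than changing the signature at every call site.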
Why is it important?
Currently, upgrade actions can remain stuck in bkgActions when tamper protection is enabled (see #9629). This occurs because errors returned from the Endpoint action proxying are not correctly handled, leaving stale entries behind.
By moving this logic into a preUpgradeCallback, any error from proxying is now surfaced through the coordinator's upgrade flow, which ensures that bkgActions are cleaned up correctly.
I am not fully satisfied with these changes; they are a pragmatic fix until we perform a larger rewrite to centralize upgrade action handling in a single place. Still, they are necessary to resolve the immediate issue of stuck upgrade actions.
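The cleanup behaviour described above can be illustrated with a small sketch. Everything here is hypothetical (the handler type, runUpgrade, errProxy, and the map-based bkgActions) and the real handler's bookkeeping and ordering may differ; the point is only that an error returned from the pre-upgrade step reaches the caller, which can then drop the tracked action instead of leaving it behind.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errProxy is a hypothetical error standing in for a failed proxy to Endpoint.
var errProxy = errors.New("endpoint rejected proxied action")

// handler is a hypothetical stand-in for an upgrade action handler that
// tracks in-flight actions by ID.
type handler struct {
	mu         sync.Mutex
	bkgActions map[string]struct{}
}

func newHandler() *handler {
	return &handler{bkgActions: make(map[string]struct{})}
}

// runUpgrade stands in for the coordinator's upgrade entry point: it runs the
// pre-upgrade step first and returns its error rather than swallowing it.
func runUpgrade(preUpgrade func() error) error {
	if preUpgrade != nil {
		if err := preUpgrade(); err != nil {
			return fmt.Errorf("notifying units: %w", err)
		}
	}
	return nil
}

// handle tracks the action, runs the upgrade, and removes the entry again
// when the upgrade flow reports an error. In this simplified sketch a
// successful action simply stays tracked.
func (h *handler) handle(actionID string, notifyUnits func() error) error {
	h.mu.Lock()
	h.bkgActions[actionID] = struct{}{}
	h.mu.Unlock()

	err := runUpgrade(notifyUnits)
	if err != nil {
		h.mu.Lock()
		delete(h.bkgActions, actionID)
		h.mu.Unlock()
	}
	return err
}

// pending reports how many actions are currently tracked.
func (h *handler) pending() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return len(h.bkgActions)
}

func main() {
	h := newHandler()
	err := h.handle("action-1", func() error { return errProxy })
	fmt.Println("error:", err, "pending:", h.pending())
}
```

If runUpgrade discarded the callback error instead of returning it, handle would have no signal to delete the entry, which is the kind of stale state the description attributes to the original bug.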
Checklist
… ./changelog/fragments using the changelog tool
Disruptive User Impact
None expected - this change only affects the internal proxying of upgrade actions to Endpoint and does not alter the upgrade request API.
How to test this PR locally
Related issues