This pull request does not have a backport label. Could you fix it @nkvoll? 🙏
From my testing, if this is the agent's state at startup, it doesn't seem to start any components, but if the configuration is edited while the agent is running, it keeps all existing components as-is. This makes me wonder whether what currently happens in the liveness endpoint should happen in the readiness endpoint instead. Worth discussing? /cc @cmacknz @blakerouse
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
I think this "Invalid component model" error would definitely make sense in the readiness endpoint. Agent isn't "ready to accept traffic" in this state. I don't really see a reason why the readiness and liveness endpoints can't share the same implementation; the liveness endpoint as it stands could simply be optionally extended to detect more things.
ycombinator left a comment
LGTM! Please add a CHANGELOG fragment by installing elastic-agent-changelog-tool, running elastic-agent-changelog-tool new from the root of the elastic-agent repo folder, and then editing the file that's generated. Thanks!
Since this is a bug fix, should we instead add the |
💚 Build Succeeded
cc @nkvoll
@Mergifyio backport 8.18 8.19 9.0 9.1
✅ Backports have been created
* fix(tests): update liveness/readiness test cases to assert status code and remove unused vars
* fix: correct order of fields in LivenessFailConfig for degraded state
* fix: remove unnecessary check for coordinator mode in liveness handler (already handled)
* fix: add unhealthy coordinator state handling in liveness handler
* add changelog fragment
(cherry picked from commit d3b9427)
* fix(tests): update liveness/readiness test cases to assert status code and remove unused vars
* fix: correct order of fields in LivenessFailConfig for degraded state
* fix: remove unnecessary check for coordinator mode in liveness handler (already handled)
* fix: add unhealthy coordinator state handling in liveness handler
* add changelog fragment
(cherry picked from commit d3b9427)
Co-authored-by: Njal Karevoll <njal@karevoll.no>
* upstream: (26 commits)
  fix: ensure EDOT subprocess shuts down gracefully on agent termination (#9886)
  [main][Automation] Update versions (#9976)
  Add Collector reference docs and automation (#9953)
  [beatreceivers] Integrate beatsauthextension (#9257)
  [main][Automation] Update versions (#9941)
  Update OTel components to v0.132.0/v1.38.0 (#9954)
  Enhancement/5235 wrap errors when marking upgrade (#9366)
  Mount Go build cache into crossbuild container (#9094)
  Liveness agent state (#9673)
  [main][Automation] Bump VM Image version to 1757725254 (#9942)
  Enhancement/5235 correctly wrap errors from copyActionDir and copyRunDirectory (#9349)
  [main][Automation] Update elastic/beats to afc53c0479ac (#9874)
  Add -coverpkg option when running unit test to calculate coverage across packages (#9913)
  Cache binaries downloaded for packaging locally (#9133)
  [main][Automation] Update versions (#9897)
  Disable flaky test TestBeatsReceiverLogs (#9891)
  Allow overriding AGENT_PACKAGE_VERSION and MANIFEST_URL when USE_PACKAGE_VERSION=true (#9864)
  add ingest-docs team as CODEOWNERS for release notes and docset.yml (#9865)
  fix: correct spelling of 'output' in various templates and monitoring code (#9827)
  k8s: Add comment around hostUsers for Universal Profiling deployments (#9847)
  ...
* fix(tests): update liveness/readiness test cases to assert status code and remove unused vars
* fix: correct order of fields in LivenessFailConfig for degraded state
* fix: remove unnecessary check for coordinator mode in liveness handler (already handled)
* fix: add unhealthy coordinator state handling in liveness handler
* add changelog fragment




What does this PR do?
This PR adds the aggregated status of the agent node to the liveness health check.
As a bonus, it also adds status code assertions to the tests, which were missing before. (All liveness/readiness tests were passing without asserting anything.)
Why is it important?
Checklist
./changelog/fragments using the changelog tool
Disruptive User Impact
Liveness probes will now fail if the configuration is invalid, likely causing the container to be restarted (see https://kubernetes.io/docs/concepts/configuration/liveness-readiness-startup-probes/#liveness-probe).
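To show how a failing liveness response can turn into a restart, here is a rough model of the probe logic described in the Kubernetes documentation: an HTTP probe succeeds for any status code from 200 up to but not including 400, and the container is restarted after `failureThreshold` consecutive failures. The function names are hypothetical, and the model ignores timing details such as `periodSeconds` and `initialDelaySeconds`.

```go
package main

import "fmt"

// probeSucceeded mirrors the documented Kubernetes rule for HTTP probes:
// any status code >= 200 and < 400 counts as success.
func probeSucceeded(code int) bool {
	return code >= 200 && code < 400
}

// restartAt returns the 1-based index of the probe result at which the
// container would be restarted, or 0 if the threshold is never reached.
// A success resets the count of consecutive failures.
func restartAt(codes []int, failureThreshold int) int {
	consecutive := 0
	for i, code := range codes {
		if probeSucceeded(code) {
			consecutive = 0
			continue
		}
		consecutive++
		if consecutive >= failureThreshold {
			return i + 1
		}
	}
	return 0
}

func main() {
	// One healthy response, then the configuration becomes invalid and
	// every probe fails (503 is an assumed failure status).
	codes := []int{200, 503, 503, 503}
	fmt.Println(restartAt(codes, 3)) // prints 4
}
```

Because an invalid configuration does not fix itself on restart, a persistently failing probe of this kind would be expected to restart the container repeatedly until the configuration is corrected.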
How to test this PR locally
elastic-agent.yml file with an invalid output, i.e. set use_output: nonexistent
elastic-agent status
Related issues