[otelmanager] emit starting state in the beginning #11234
VihasMakwana merged 10 commits into elastic:main
This pull request does not have a backport label. Could you fix it @VihasMakwana? 🙏
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
CI failing due to known issue. It will be fixed via #11238
swiatekm left a comment
I'm not sure I like this approach.
If we need to always emit a STARTING state, then we should do so explicitly, without waiting for the health check first. And if there's an error getting the status, we should set it as an error on the collector instead of doing this. Finally, if some errors from the collector need to be turned into a specific status for all the components, then that should happen separately somewhere in the otel manager.
Does that make sense?
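The ordering suggested in this comment could look roughly like the minimal sketch below. All names here (`Status`, `CollectorState`, `startAndReport`, `checkHealth`, `errHealthcheckUnreachable`) are hypothetical and are not taken from the otel manager. The sketch only illustrates two points: STARTING is reported before the first health check runs, and a health check error is attached to the collector state instead of being turned into a different status.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical collector status, not the real otel manager type.
type Status string

const (
	StatusStarting Status = "STARTING"
	StatusOK       Status = "OK"
)

// errHealthcheckUnreachable is a hypothetical error used for illustration.
var errHealthcheckUnreachable = errors.New("healthcheck port not reachable yet")

// CollectorState is a hypothetical update sent to whoever watches the collector.
type CollectorState struct {
	Status Status
	Err    error // set when the health check itself could not be reached
}

// startAndReport emits STARTING first, then one update per health check attempt.
// checkHealth stands in for a call to the collector's healthcheck endpoint.
func startAndReport(checkHealth func() (Status, error), attempts int) []CollectorState {
	// Emit STARTING explicitly, before any health check has run.
	updates := []CollectorState{{Status: StatusStarting}}
	for i := 0; i < attempts; i++ {
		status, err := checkHealth()
		if err != nil {
			// Keep the last known status and attach the error to the collector,
			// rather than mapping the failure to a stopped state.
			last := updates[len(updates)-1].Status
			updates = append(updates, CollectorState{Status: last, Err: err})
			continue
		}
		updates = append(updates, CollectorState{Status: status})
	}
	return updates
}

func main() {
	calls := 0
	check := func() (Status, error) {
		calls++
		if calls == 1 {
			return "", errHealthcheckUnreachable
		}
		return StatusOK, nil
	}
	for _, u := range startAndReport(check, 2) {
		fmt.Printf("status=%s err=%v\n", u.Status, u.Err)
	}
}
```

Turning specific collector errors into a status for every component would then be a separate step, which this sketch deliberately leaves out.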
5b4e133 to ad7b7fd
06d154b to 57d98a5
9592c47 to 6310b4c
The CI has been failing but the test passes on my Mac consistently. Weird.
We already emit a state here: elastic-agent/internal/pkg/otel/manager/execution_subprocess.go, lines 154 to 155 in 2b2dff7. We just need to handle that case during agent translation.
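Handling that case during translation might look like the sketch below. The types and the function `translateComponents` are hypothetical stand-ins, not the agent's actual translation code, and the behavior for a stopped collector (reporting no components) is an assumption based on the PR description. The sketch only shows components being reported as starting while the collector is still starting, rather than being omitted.

```go
package main

import "fmt"

// CollectorStatus is a hypothetical status reported for the whole collector.
type CollectorStatus string

const (
	CollectorStarting CollectorStatus = "STARTING"
	CollectorOK       CollectorStatus = "OK"
	CollectorStopped  CollectorStatus = "STOPPED"
)

// ComponentState is a hypothetical per-component state shown to the user.
type ComponentState string

const (
	ComponentStarting ComponentState = "Starting"
	ComponentHealthy  ComponentState = "Healthy"
)

// translateComponents maps one collector-level status onto every known component.
// For any status it does not recognize as running (e.g. STOPPED) it reports no
// components at all, which is the assumed behavior that hides them from status output.
func translateComponents(status CollectorStatus, componentIDs []string) map[string]ComponentState {
	states := make(map[string]ComponentState, len(componentIDs))
	var state ComponentState
	switch status {
	case CollectorStarting:
		// The collector is expected to come up: keep components visible as Starting.
		state = ComponentStarting
	case CollectorOK:
		state = ComponentHealthy
	default:
		return states
	}
	for _, id := range componentIDs {
		states[id] = state
	}
	return states
}

func main() {
	ids := []string{"monitoring-a", "monitoring-b"} // hypothetical component IDs
	for _, status := range []CollectorStatus{CollectorStopped, CollectorStarting, CollectorOK} {
		states := translateComponents(status, ids)
		fmt.Printf("collector=%s components=%d\n", status, len(states))
		for _, id := range ids {
			if s, ok := states[id]; ok {
				fmt.Printf("  %s: %s\n", id, s)
			}
		}
	}
}
```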
swiatekm left a comment
I'd like an additional unit test, other than that LGTM.
/test
* feat: emit starting state * test * rename method * emit starting state * fix test * comments * fix case * ut (cherry picked from commit 14c87c2)
* feat: emit starting state * test * rename method * emit starting state * fix test * comments * fix case * ut (cherry picked from commit 14c87c2) Conflicts: internal/pkg/otel/manager/execution_subprocess.go, internal/pkg/otel/manager/manager_test.go
* feat: emit starting state * test * rename method * emit starting state * fix test * comments * fix case * ut
* feat: emit starting state * test * rename method * emit starting state * fix test * comments * fix case * ut (cherry picked from commit 14c87c2) Co-authored-by: Vihas Makwana <121151420+VihasMakwana@users.noreply.github.com>
What does this PR do?
This PR introduces support for emitting a STARTING state when we expect the collector to start.
Why is it important?
Right now, we default to a STOPPED state whenever an error occurs while accessing the healthcheck port. As a result, the elastic-agent status output does not show any monitoring components. If this error occurs consistently every time the collector starts, those components will never appear in elastic-agent status.
Checklist
./changelog/fragments using the changelog tool
How to test this PR locally
For mac,
Starting state, then Failed state with message OTel manager failed ... process exited with status 1.
Related issues