`TestLongRunningAgentForLeaks` is failing for two reasons:
- Flaky on `linux-amd64`
  - It fails because the `monitoring` module stays in a degraded state for a while before eventually returning to a healthy state. It takes a few dozen seconds for `libbeat` to start the metrics endpoint, and we only wait for 10 seconds.
  - FIX: Waiting for 60s should solve this.
EDIT: The following is fixed:
- Agent remaining DEGRADED on Windows, even when running as an administrator.
  - This seems like a genuine issue on Windows that requires some debugging and troubleshooting. I'm currently working on it.
  - This only affects `system.process` and `system.process_summary`.
  - TEMPORARY FIX TO UNBLOCK CI: disable `system.process` and `system.process_summary` in `TestLongRunningAgentForLeaks`. (PS: Let me know if this sounds good to you.)
  - PERMANENT FIX: Investigate and resolve the underlying issue affecting these metrics.
Example of a failure:
https://buildkite.com/elastic/elastic-agent-extended-testing/builds/1850#01913472-043d-4264-b524-8c1dfa828833