[Security Solution][Cypress] Fix flaky related integrations test caused by Fleet race condition #261128
Conversation
…waiting package installation before creating agent policy Wait for each package to reach 'installed' status before calling the Fleet agent policies API, and increase the agent policy creation timeout from 30s to 60s. Previously the bulk install was fired and the agent policy request followed immediately; Fleet was still processing large packages (aws, system) in the background, causing the agent policy POST to time out under CI load. Closes elastic#259831 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
/ci |
💛 Build succeeded, but was flaky
cc @maximpn |
Flaky Test Runner Stats
🟠 Some tests failed. - kibana-flaky-test-suite-runner#11365
[❌] [Serverless] Security Solution Rule Management - Cypress: 72/100 tests passed. |
|
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
|
Pinging @elastic/security-solution (Team: SecuritySolution) |
|
Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management) |
|
Running flaky test runner again since the last one seemed to fail quite a bit |
Flaky Test Runner Stats
🎉 All tests passed! - kibana-flaky-test-suite-runner#11493
[✅] [Serverless] Security Solution Rule Management - Cypress: 100/100 tests passed. |
dplumlee
left a comment
Code makes sense to me, looks like it's passing flaky test runner too, let's see how it does in main
|
@elasticmachine merge upstream |
|
Starting backport for target branches: 8.19, 9.2, 9.3 |
…ed by Fleet race condition (elastic#261128)

**Resolves: elastic#259831**
**Resolves: elastic#239356**

## Summary

Mitigates chances of `related_integrations.cy.ts` failure by reducing pressure on Kibana's Fleet plugin via adding extra waiting for integrations installation before adding agent policies. Generally this mitigates the risk of failure.

## Details

Mitigates `related_integrations.cy.ts` flakiness reasons in the suite where `cy.request()` timed out waiting for `POST /api/fleet/agent_policies?sys_monitoring=true`.

**Root cause**: `installIntegrations()` fired the bulk package install request and immediately followed with the agent policy creation request. The Fleet bulk install endpoint returns a response once the request is accepted, but processes package assets asynchronously. Under CI load, Fleet was still indexing large packages (`aws`, `system`) when the agent policy POST arrived, causing the API to become unresponsive and exceed the 30s default timeout.

**Fix**:

- Chain the agent policy creation inside `.then()` after the bulk install, polling `waitForPackageInstalled` for each package before proceeding.
- Increase the agent policy creation timeout from 30s to 60s, as this endpoint is inherently slow with `?sys_monitoring=true`.

## Flaky test runner

TBD

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit 4aefe9c)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation |
**Resolves: #259831**
**Resolves: #239356**

## Summary

Reduces the chance of `related_integrations.cy.ts` failing by waiting for integration packages to finish installing before creating agent policies, which reduces pressure on Kibana's Fleet plugin.

## Details

Addresses flakiness in the `related_integrations.cy.ts` suite where `cy.request()` timed out waiting for `POST /api/fleet/agent_policies?sys_monitoring=true`.

**Root cause**: `installIntegrations()` fired the bulk package install request and immediately followed it with the agent policy creation request. The Fleet bulk install endpoint returns a response once the request is accepted, but processes package assets asynchronously. Under CI load, Fleet was still indexing large packages (`aws`, `system`) when the agent policy POST arrived, causing the API to become unresponsive and exceed the 30s default timeout.

**Fix**:

- Chain the agent policy creation inside `.then()` after the bulk install, polling `waitForPackageInstalled` for each package before proceeding.
- Increase the agent policy creation timeout from 30s to 60s, as this endpoint is inherently slow with `?sys_monitoring=true`.

## Flaky test runner

TBD
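
For readers who want to see the ordering described under **Fix**, here is a minimal sketch in plain TypeScript. It is not the code from this PR: the real change uses Cypress command chaining (`cy.request()` and `.then()`), while this sketch uses `async`/`await` with an injected client so it can run outside Cypress. The `FleetClient` interface, its method names, the `PollOptions` shape, and the polling defaults are all assumptions made for illustration. Only the function names `installIntegrations` and `waitForPackageInstalled`, the `'installed'` status, and the 60s timeout come from the description above.

```typescript
// Hypothetical client interface: stands in for the Fleet HTTP calls.
// None of these method names are taken from the Kibana codebase.
interface FleetClient {
  bulkInstall(packages: string[]): Promise<void>;
  getPackageStatus(name: string): Promise<string>;
  createAgentPolicy(timeoutMs: number): Promise<void>;
}

// Hypothetical polling options; the defaults below are illustrative only.
interface PollOptions {
  intervalMs?: number;
  maxAttempts?: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the package reports 'installed', or give up after maxAttempts.
async function waitForPackageInstalled(
  client: FleetClient,
  name: string,
  { intervalMs = 500, maxAttempts = 120 }: PollOptions = {}
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await client.getPackageStatus(name);
    if (status === "installed") {
      return;
    }
    await sleep(intervalMs);
  }
  throw new Error(`Package ${name} not installed after ${maxAttempts} attempts`);
}

// Fire the bulk install, wait for every package, then create the policy.
async function installIntegrations(
  client: FleetClient,
  packages: string[],
  pollOptions: PollOptions = {}
): Promise<void> {
  // The bulk endpoint is assumed to return before assets are fully processed.
  await client.bulkInstall(packages);

  // Wait for each package in turn before touching agent policies.
  for (const name of packages) {
    await waitForPackageInstalled(client, name, pollOptions);
  }

  // 60s instead of the 30s default, as described in the fix.
  await client.createAgentPolicy(60_000);
}
```

The key point is the ordering: the agent policy request is only sent once every package has reported `'installed'`, so Fleet is not indexing package assets while it handles the policy POST. The bounded `maxAttempts` is a design choice of this sketch, so that a package that never finishes installing fails with a clear error instead of hanging the suite.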