Today, suites operate statelessly and have a number of UX shortcomings, especially around error reporting (see elastic/beats#27924). This issue is the longer-term fix for #425.
The proposed fix here (originally @drewpost's recommendation) is to use suites to control the monitor management objects in Uptime.
Implementation issues
Goals
The goal here is to reduce the impedance mismatch between suites and inline monitors while still enabling a git-ops based workflow.
Personas
- Uptime user: Person using the Uptime app, usually an administrator responsible for the infrastructure that apps run on
- Test Creator: Person responsible for writing / maintaining tests (could be a sysadmin, programmer, QA, etc.)
User Stories
This is a partial list of stories covered by this change
- As an Uptime user I would like to mute individual journeys in a suite through the Uptime UI when they fail due to a temporary condition.
- As a test creator I would like to add/remove/delete a test when desired
- As an Uptime user, when a suite fails to execute I would still like to see its individual journeys listed, and see its errors in a clear place.
Implementation
At a logical level we will need to decouple suite journey discovery from execution, leading to the following flow:
- Suite refresh interval triggered
- Suite is downloaded
- Suite is unpacked and journeys discovered via `--dry-run`
- Existing monitors no longer defined in the suite are deleted (possibly as a soft delete) in Kibana management
- Newly created monitors are added, existing ones are updated
- Monitors created from this process cannot be deleted or updated, though they may be disabled.
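The reconciliation part of this flow (delete what is gone, add what is new, update the rest) can be sketched roughly as below. This is only an illustration under stated assumptions: the `DiscoveredJourney` and `ManagedMonitor` shapes, the `deleted` soft-delete flag, and the `reconcile` function are all hypothetical names, not existing Uptime or Kibana APIs. It also assumes a journey is identified by a stable id and that the only user-controlled field is `enabled`.

```typescript
// Hypothetical shapes for illustration; not taken from the Uptime codebase.
interface DiscoveredJourney {
  id: string; // assumed stable journey identifier reported by discovery
  name: string;
}

interface ManagedMonitor {
  journeyId: string;
  name: string;
  enabled: boolean; // the one field a user may still toggle
  deleted: boolean; // soft-delete flag (assumption)
}

interface ReconcileResult {
  toCreate: ManagedMonitor[];
  toUpdate: ManagedMonitor[];
  toSoftDelete: ManagedMonitor[];
}

// Compare what monitor management currently holds with what discovery found.
function reconcile(
  existing: ManagedMonitor[],
  discovered: DiscoveredJourney[]
): ReconcileResult {
  const discoveredById = new Map<string, DiscoveredJourney>(
    discovered.map((j): [string, DiscoveredJourney] => [j.id, j])
  );
  const existingIds = new Set<string>(existing.map((m) => m.journeyId));
  const result: ReconcileResult = { toCreate: [], toUpdate: [], toSoftDelete: [] };

  for (const monitor of existing) {
    const journey = discoveredById.get(monitor.journeyId);
    if (journey === undefined) {
      // No longer defined in the suite: mark as soft-deleted (once).
      if (!monitor.deleted) {
        result.toSoftDelete.push({ ...monitor, deleted: true });
      }
    } else if (monitor.deleted || monitor.name !== journey.name) {
      // Still defined: refresh suite-owned fields, keep the user's enabled flag.
      result.toUpdate.push({ ...monitor, name: journey.name, deleted: false });
    }
  }

  for (const journey of discovered) {
    if (!existingIds.has(journey.id)) {
      // Newly discovered journey: create a monitor, enabled by default.
      result.toCreate.push({
        journeyId: journey.id,
        name: journey.name,
        enabled: true,
        deleted: false,
      });
    }
  }

  return result;
}
```

Preserving `enabled` across updates is what would let a user mute a journey in the UI without the next suite refresh undoing it.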
In terms of code, consider the following straw-man approach:
- When a zip URL monitor is configured, it no longer executes journeys, but instead emits a `manifest` document to ES describing which journeys were discovered in the zip
- A Kibana background job checks for the latest `manifest` documents and uses them to update central monitor management
- These newly configured monitors act as zip monitors currently do, filtering only for the journey they are set to run.
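To make the straw-man a little more concrete, here is a minimal sketch of what a manifest document and the background job's selection step might look like. Everything in it is an assumption for illustration: the `ManifestDoc` fields, the `latestManifests` and `monitorsFromManifest` helpers, and the `JourneyMonitorConfig` shape are hypothetical and do not describe an existing Elasticsearch mapping or Kibana task.

```typescript
// Hypothetical manifest shape; field names are illustrative only.
interface ManifestDoc {
  suiteId: string; // assumed identifier of the zip URL monitor that ran discovery
  timestamp: string; // ISO 8601 time at which discovery ran
  journeys: { id: string; name: string }[];
}

// Hypothetical per-journey monitor config derived from a manifest.
interface JourneyMonitorConfig {
  suiteId: string;
  journeyId: string; // the single journey this monitor is set to run
  name: string;
}

// Keep only the newest manifest for each suite.
function latestManifests(docs: ManifestDoc[]): Map<string, ManifestDoc> {
  const latest = new Map<string, ManifestDoc>();
  for (const doc of docs) {
    const current = latest.get(doc.suiteId);
    if (
      current === undefined ||
      Date.parse(doc.timestamp) > Date.parse(current.timestamp)
    ) {
      latest.set(doc.suiteId, doc);
    }
  }
  return latest;
}

// Expand one manifest into one monitor config per discovered journey.
function monitorsFromManifest(manifest: ManifestDoc): JourneyMonitorConfig[] {
  return manifest.journeys.map((j) => ({
    suiteId: manifest.suiteId,
    journeyId: j.id,
    name: j.name,
  }));
}
```

Under these assumptions, each resulting config would still point at the same zip, with `journeyId` acting as the filter that restricts execution to a single journey.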
This would look something like:

The desirable properties of this approach are that:
- It doesn't require an excessive number of changes to our current stack
- It maintains the current security profile: since discovering monitors requires arbitrary code execution, it is simplest to keep doing that in Heartbeat
Needs design etc. CC @liciavale