[One workflow] Workflow execution error trigger (workflows.failed) by yngrdyn · Pull Request #257633 · elastic/kibana

yngrdyn · 2026-03-13T11:37:37Z

Closes https://github.com/elastic/security-team/issues/14421.

This PR implements the Workflow execution error Trigger: when a workflow run fails, the platform emits a workflows.failed event so that other workflows subscribed to it can run (notifications, cleanup, retries). It also serves as a reference implementation for solution teams adding their own event-driven triggers.

Summary

When a workflow execution reaches a failed terminal state, the execution engine builds an event payload (workflow id/name, execution id, error message, failed step id/name, optional stack trace) and calls workflowsExtensions.emitEvent() with trigger id workflows.failed (named workflows.executionFailed until the final rename commit). The existing trigger event handler (from #254964) resolves the workflows subscribed to that trigger in the same space, evaluates the optional KQL on.condition against the event, and runs only the matching workflows with the event as context.event. The payload sets workflow.isErrorHandler: true when the failed run was itself triggered by an error event, so subscribers can filter out error-handler failures and avoid infinite loops.

```mermaid
graph TB
    subgraph Engine["workflows_execution_engine"]
        Run["runWorkflow() / resumeWorkflow()"]
        Fail["Execution fails → failStep()"]
        Finally["finally: load execution, build payload"]
        Emit["emitEvent(workflows.failed, payload, spaceId, request)"]
    end

    subgraph Subscriber["Error-handling workflow"]
        Steps["Steps use {{ event.workflow.id }}, {{ event.error.message }}, etc."]
    end

    Run --> Fail
    Fail --> Finally
    Finally --> Emit
    Emit --> Steps

    style Engine fill:#e1f5ff
    style Subscriber fill:#e8f5e9
```

Event payload (and thus context.event in subscriber workflows):

workflow: id, name, spaceId, isErrorHandler
execution: id, startedAt, failedAt
error: message, stepId, stepName, optional stepExecutionId, optional stackTrace

Conditions and steps can use e.g. event.workflow.name, event.error.stepName, event.execution.id, and not event.workflow.isErrorHandler:true to avoid handling failures from error-handler workflows.
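
As a rough TypeScript sketch of the payload shape listed above: the field names come from this description, but the interface name `WorkflowFailedEvent`, the string types, and the helper `shouldHandle` are assumptions for illustration and are not the PR's Zod schema. The predicate shows what the `not event.workflow.isErrorHandler:true` condition amounts to.

```typescript
// Hypothetical type mirroring the payload fields listed in this description.
// The real definition is the Zod workflowExecutionFailedEventSchema in the PR.
interface WorkflowFailedEvent {
  workflow: { id: string; name: string; spaceId: string; isErrorHandler: boolean };
  execution: { id: string; startedAt: string; failedAt: string };
  error: {
    message: string;
    stepId: string;
    stepName: string;
    stepExecutionId?: string; // optional in the description
    stackTrace?: string; // optional in the description
  };
}

// Hypothetical helper: equivalent in spirit to the KQL condition
// `not event.workflow.isErrorHandler:true`.
function shouldHandle(event: WorkflowFailedEvent): boolean {
  return event.workflow.isErrorHandler !== true;
}

// Made-up sample values, for illustration only.
const sampleEvent: WorkflowFailedEvent = {
  workflow: { id: 'wf-1', name: 'Always fails', spaceId: 'default', isErrorHandler: false },
  execution: { id: 'exec-1', startedAt: '2026-03-13T11:00:00Z', failedAt: '2026-03-13T11:00:05Z' },
  error: { message: 'HTTP 500', stepId: 'http_always_500', stepName: 'http_always_500' },
};

// The same failure, but raised by a workflow that was itself handling an error event.
const handlerFailure: WorkflowFailedEvent = {
  ...sampleEvent,
  workflow: { ...sampleEvent.workflow, isErrorHandler: true },
};
```

With these samples, `shouldHandle(sampleEvent)` is true and `shouldHandle(handlerFailure)` is false, which is how a subscriber avoids reacting to its own failures.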

What's in this PR

Trigger registration (workflows_extensions)

Common: WORKFLOW_EXECUTION_FAILED_TRIGGER_ID, Zod workflowExecutionFailedEventSchema (workflow, execution, error with optional stepExecutionId and stackTrace), i18n for schema descriptions.
Server: Trigger definition registered in workflows_extensions plugin setup; used for validation when emitting and for internal trigger-definitions API.
Public: PublicTriggerDefinition with i18n title, description, documentation, and examples so the workflow authoring UI shows the trigger and its event shape.

Emit on failure (workflows_execution_engine)

Payload builder: buildWorkflowExecutionFailedPayload(execution, failedStepContext?) in server/lib/build_workflow_execution_failed_payload.ts. The step context (stepId, stepName, stepExecutionId, stackTrace) comes from the in-memory FailedStepContext set in failStep(), not from execution.error.details or the step executions stored in Elasticsearch, which avoids index refresh delays.
Failure context: In step_execution_runtime.ts, failStep() calls workflowExecutionState.setLastFailedStepContext({ stepId, stepName, stepExecutionId, stack }) so the payload builder can read it in the same run.
Emission: In run_workflow.ts and resume_workflow.ts, a finally block checks whether execution?.status === FAILED and the run is not a test run. If so, it builds the payload (with workflowExecutionState.getLastFailedStepContext()) and calls workflowsExtensions.emitEvent({ triggerId: WORKFLOW_EXECUTION_FAILED_TRIGGER_ID, spaceId, payload, request }). This ensures one emit per failed run and consistent metering.
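
The three pieces above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the `Execution` shape, `buildPayload`, `runWithFailureEvent`, the way `isErrorHandler` is derived, and the injected callbacks are all assumptions standing in for the real plugin types and services.

```typescript
// Illustrative stand-ins only; the real types live in the Kibana plugins.
interface Execution {
  id: string;
  workflowId: string;
  workflowName: string;
  spaceId: string;
  status: 'COMPLETED' | 'FAILED';
  isTestRun: boolean;
  triggeredBy?: string;
  startedAt: string;
  finishedAt: string;
  errorMessage?: string;
}

interface FailedStepContext {
  stepId: string;
  stepName: string;
  stepExecutionId?: string;
}

interface EmitArgs {
  triggerId: string;
  spaceId: string;
  payload: unknown;
}

// Assumed value of the trigger id, taken from the PR title.
const TRIGGER_ID = 'workflows.failed';

// Builds the event from the execution plus the in-memory failed-step context.
function buildPayload(execution: Execution, ctx?: FailedStepContext) {
  return {
    workflow: {
      id: execution.workflowId,
      name: execution.workflowName,
      spaceId: execution.spaceId,
      // Assumption: a run is an error handler if this trigger started it.
      isErrorHandler: execution.triggeredBy === TRIGGER_ID,
    },
    execution: { id: execution.id, startedAt: execution.startedAt, failedAt: execution.finishedAt },
    error: { message: execution.errorMessage ?? 'Unknown error', ...ctx },
  };
}

// Runs the workflow, then emits at most one event from the finally block.
async function runWithFailureEvent(
  run: () => Promise<void>,
  loadExecution: () => Promise<Execution | undefined>,
  getLastFailedStepContext: () => FailedStepContext | undefined,
  emitEvent: (args: EmitArgs) => Promise<void>
): Promise<void> {
  try {
    await run();
  } finally {
    const execution = await loadExecution();
    // Emit only for real (non-test) runs that ended in FAILED.
    if (execution?.status === 'FAILED' && !execution.isTestRun) {
      await emitEvent({
        triggerId: TRIGGER_ID,
        spaceId: execution.spaceId,
        payload: buildPayload(execution, getLastFailedStepContext()),
      });
    }
  }
}
```

Keeping the check in a single finally block is what gives one emit per failed run: the event is sent whether the failure surfaced as a thrown error or only as a FAILED status, and test runs never emit.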

How to verify

Start Kibana with the workflows extensions example:
```
yarn start
```
Create a workflow that always fails.

```yaml
name: Always fails
enabled: true
triggers:
  - type: manual
steps:
  - name: log_start
    type: console
    with:
      message: "Workflow started; next step will fail."
  - name: http_always_500
    type: http
    with:
      url: "https://httpstat.us/500"
      method: GET
```

Create a second workflow with the workflows.failed trigger (named workflows.executionFailed before the final rename commit) and a step that logs or notifies:

````yaml
name: Workflow failure monitor
description: Sends a Slack notification with full details when any workflow in the space fails.
enabled: true
triggers:
  - type: workflows.failed
    on:
      condition: not event.workflow.isErrorHandler:true
steps:
  - name: slack_alert
    type: slack
    connector-id: c57c5a7b-dc2b-4d64-b9bd-a02c92696e03
    with:
      message: |
        :alert: *Workflow execution failed*

        *Workflow:* {{ event.workflow.name }}
        *Workflow ID:* `{{ event.workflow.id }}`
        *Space:* {{ event.workflow.spaceId }}

        *Failed step:* {{ event.error.stepName }}
        *Error:* {{ event.error.message }}

        *Execution ID:* {{ event.execution.id }}
        *Started:* {{ event.execution.startedAt }}
        *Failed at:* {{ event.execution.failedAt }}
        {% if event.error.stackTrace %}
        *Stack trace:*
        ```
        {{ event.error.stackTrace }}
        ```
        {% endif %}

        *View execution in Kibana:*
        {{kibanaUrl}}{% if event.workflow.spaceId != 'default' %}/s/{{ event.workflow.spaceId }}{% endif %}/app/workflows/{{ event.workflow.id }}?executionId={{ event.execution.id }}&tab=executions&stepExecutionId={{ event.error.stepExecutionId }}
````

Run the first workflow; wait for it to fail.
Confirm the second workflow runs and receives the event (e.g. triggeredBy: 'workflows.failed', and the step sees event.workflow.name and event.error.message).
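
For checking the link produced by the "View execution in Kibana" template line in the monitor workflow, the same URL logic can be written out as a small TypeScript function. This is only an illustration of what the template renders: `buildExecutionUrl` and its input type are hypothetical names, and unlike the template this sketch omits `stepExecutionId` when it is missing.

```typescript
// Hypothetical input type; the fields mirror the template variables.
interface ExecutionLinkInput {
  kibanaUrl: string;
  spaceId: string;
  workflowId: string;
  executionId: string;
  stepExecutionId?: string;
}

// Mirrors the template: the /s/<space> prefix is added only outside the default space.
function buildExecutionUrl(input: ExecutionLinkInput): string {
  const base = input.kibanaUrl.replace(/\/+$/, ''); // drop trailing slashes
  const spacePrefix =
    input.spaceId !== 'default' ? `/s/${encodeURIComponent(input.spaceId)}` : '';
  const query = [
    `executionId=${encodeURIComponent(input.executionId)}`,
    'tab=executions',
  ];
  if (input.stepExecutionId) {
    query.push(`stepExecutionId=${encodeURIComponent(input.stepExecutionId)}`);
  }
  return `${base}${spacePrefix}/app/workflows/${encodeURIComponent(input.workflowId)}?${query.join('&')}`;
}

// Made-up values, for illustration only.
const defaultSpaceUrl = buildExecutionUrl({
  kibanaUrl: 'http://localhost:5601',
  spaceId: 'default',
  workflowId: 'wf-1',
  executionId: 'exec-1',
  stepExecutionId: 'step-1',
});
// defaultSpaceUrl is
// 'http://localhost:5601/app/workflows/wf-1?executionId=exec-1&tab=executions&stepExecutionId=step-1'

const customSpaceUrl = buildExecutionUrl({
  kibanaUrl: 'http://localhost:5601/',
  spaceId: 'security',
  workflowId: 'wf-1',
  executionId: 'exec-1',
});
// customSpaceUrl is
// 'http://localhost:5601/s/security/app/workflows/wf-1?executionId=exec-1&tab=executions'
```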

Release note

Added the workflows.failed trigger so you can run workflows when another workflow fails. Use it to send notifications (e.g. Slack), run cleanup, or trigger retries. Subscriber workflows receive an event with workflow and execution details, the error message, and the failed step. #257633

Copilot

Pull request overview

Implements a new event-driven trigger (workflows.executionFailed) emitted when a workflow run reaches a failed terminal state, enabling subscriber workflows (notifications/cleanup/retries) to react with the failure event context.

Changes:

Adds common/server/public trigger definitions + schema for workflows.executionFailed and registers them in workflows_extensions.
Emits workflows.executionFailed from the execution engine on failed runs (run + resume paths) with step failure context captured at failure time.
Adds unit + Scout API tests and updates trigger-definition approval fixtures.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/platform/plugins/shared/workflows_management/test/scout_workflows_ui/api/tests/workflow_execution/workflow_error_trigger.spec.ts | Adds Scout test validating a subscriber workflow runs on execution failure. |
| src/platform/plugins/shared/workflows_extensions/test/scout/api/tests/trigger_definitions_approval.spec.ts | Updates tags configuration for trigger definitions approval test coverage. |
| src/platform/plugins/shared/workflows_extensions/test/scout/api/fixtures/approved_trigger_definitions.ts | Approves the new trigger id + schema hash. |
| src/platform/plugins/shared/workflows_extensions/server/triggers/workflow_execution_failed.ts | Introduces server trigger definition wrapper/export for the new trigger. |
| src/platform/plugins/shared/workflows_extensions/server/triggers/index.ts | Registers internal trigger definitions in the server plugin. |
| src/platform/plugins/shared/workflows_extensions/server/plugin.ts | Calls internal trigger registration during setup. |
| src/platform/plugins/shared/workflows_extensions/server/index.ts | Exposes trigger id/type from the server package. |
| src/platform/plugins/shared/workflows_extensions/public/triggers/workflow_execution_failed.ts | Adds UI-facing trigger metadata, docs, examples, and snippet. |
| src/platform/plugins/shared/workflows_extensions/public/triggers/index.ts | Registers public trigger definition. |
| src/platform/plugins/shared/workflows_extensions/public/plugin.ts | Calls public trigger registration during setup. |
| src/platform/plugins/shared/workflows_extensions/common/triggers/workflow_execution_failed.ts | Defines trigger id + Zod schema + common trigger definition. |
| src/platform/plugins/shared/workflows_extensions/common/triggers/index.ts | Re-exports common trigger artifacts. |
| src/platform/plugins/shared/workflows_extensions/common/index.ts | Exposes trigger artifacts from common entrypoint. |
| src/platform/plugins/shared/workflows_execution_engine/server/workflow_context_manager/workflow_execution_state.ts | Adds in-memory failed-step context storage for event payload building. |
| src/platform/plugins/shared/workflows_execution_engine/server/workflow_context_manager/step_execution_runtime.ts | Captures failed step context in `failStep()` for later emission. |
| src/platform/plugins/shared/workflows_execution_engine/server/lib/build_workflow_execution_failed_payload.ts | Adds payload builder for the emitted event. |
| src/platform/plugins/shared/workflows_execution_engine/server/lib/build_workflow_execution_failed_payload.test.ts | Adds unit tests for the payload builder. |
| src/platform/plugins/shared/workflows_execution_engine/server/execution_functions/run_workflow.ts | Emits the event in a finally block when status is FAILED. |
| src/platform/plugins/shared/workflows_execution_engine/server/execution_functions/run_workflow.test.ts | Adds unit tests asserting emission behavior in run path. |
| src/platform/plugins/shared/workflows_execution_engine/server/execution_functions/resume_workflow.ts | Emits the event in a finally block when a resumed execution fails. |
| src/platform/plugins/shared/workflows_execution_engine/server/execution_functions/resume_workflow.test.ts | Adds unit tests asserting emission behavior in resume path. |

skynetigor

LGTM

jbudz

packages/kbn-optimizer/limits.yml LGTM

elasticmachine · 2026-04-06T15:22:17Z

💚 Build Succeeded

Buildkite Build
Commit: a74aa1a

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 239 | 244 | +5 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 25 | 33 | +8 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 52.1KB | 55.9KB | +3.9KB |
| `workflowsManagement` | 2.3MB | 2.3MB | +186.0B |
| total | | | +4.0KB |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 34.5KB | 37.6KB | +3.0KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 109 | 117 | +8 |

async chunk count

| id | before | after | diff |
| --- | --- | --- | --- |
| `workflowsExtensions` | 21 | 23 | +2 |

History

💚 Build #418578 succeeded feb459d
💛 Build #417525 was flaky 24e1451
💔 Build #417111 failed ad392a3
💔 Build #416422 failed a067a68
💔 Build #416282 failed 4db091b
💔 Build #410528 failed cf647f6

cc @yngrdyn

yngrdyn added 3 commits March 12, 2026 13:16

workflows.executionFailed registration

04b351d

emit event on workflow failure

c44447e

Adding tests + approval process

5544fc4

yngrdyn self-assigned this Mar 13, 2026

yngrdyn requested a review from a team as a code owner March 13, 2026 11:37

yngrdyn added the backport:skip and release_note:feature labels Mar 13, 2026

botelastic bot added the Team:One Workflow label Mar 13, 2026

Small fixes

531e033

yngrdyn changed the title ~~[One workflow] Workflow execution error Trigger (workflows.executionFailed)~~ [One workflow] Workflow execution error trigger (workflows.executionFailed) Mar 13, 2026

yngrdyn commented Mar 13, 2026

Small fixes

644884d

yngrdyn requested a review from Copilot March 13, 2026 11:57

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

9d8cb91

Copilot AI reviewed Mar 13, 2026

yngrdyn added 2 commits March 13, 2026 14:33

Merge branch '14421-feature-workflow-error-trigger-reference-implementation' of github.com:yngrdyn/kibana into 14421-feature-workflow-error-trigger-reference-implementation

d029629

fixing tests

b9c2362

skynetigor reviewed Mar 13, 2026

dej611 reviewed Mar 13, 2026

yngrdyn added 2 commits March 13, 2026 15:31

Removing stacktrace since it was exposing kibana internals

060d423

Support async trigger registration

e79ab11

yngrdyn commented Mar 13, 2026

fixing build

6c5f007

yngrdyn requested a review from a team as a code owner March 13, 2026 15:33

yngrdyn added 3 commits March 13, 2026 16:51

fixing build

f705d77

Addressing CR comments

11bdc43

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

ed83458

skynetigor approved these changes Mar 13, 2026

jbudz approved these changes Mar 13, 2026

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

b8f9d22

yngrdyn added the ci:build-cloud-image label Mar 16, 2026

yngrdyn added 4 commits March 16, 2026 12:29

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

1c7b4f2

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

cf647f6

Merge remote-tracking branch 'origin/main' into 14421-feature-workflow-error-trigger-reference-implementation

a500504

Merge remote-tracking branch 'origin/main' into 14421-feature-workflow-error-trigger-reference-implementation

4db091b

macroscopeapp bot reviewed Mar 25, 2026

yngrdyn added 10 commits March 25, 2026 15:43

fixing build

26d5093

macroscope CR changes

df5b34b

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

a067a68

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

ad392a3

fixing build

8e5eac7

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

9fc4488

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

24e1451

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

feb459d

Merge branch 'main' into 14421-feature-workflow-error-trigger-reference-implementation

19d07ed

rename workflows.executionFailed to workflows.failed

a74aa1a

yngrdyn changed the title ~~[One workflow] Workflow execution error trigger (workflows.executionFailed)~~ [One workflow] Workflow execution error trigger (workflows.failed) Apr 6, 2026

yngrdyn merged commit 1740ee6 into elastic:main Apr 6, 2026
19 checks passed

kibanamachine added the v9.4.0 label Apr 6, 2026

8 participants
