External Triggers for Materializations: Implementation Decisions #2713
Replies: 2 comments
This makes lots of sense. Some feedback around the structure: I believe we can model it as something like:

```rust
pub struct Triggers {
    pub config: Vec<TriggerConfig>,
    #[serde(default)]
    #[schemars(hide)]
    pub sops: Option<models::RawValue>,
}

pub struct TriggerConfig {
    pub url: String,
    pub method: String,
    // auth: {additionalProperties: {type: string, secret: true}}
    #[schemars(extra_annotation = "secret: true")]
    pub auth: BTreeMap<String, String>,
    pub payload_template: String,
    // ...
}
```

Key ideas:
We definitely need to protect the secret fields. I think we can do this by having a logic layer above which strips these fields from the documents submitted for encrypt / decrypt, and then re-applies them to the post-transformed model.
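To illustrate the idea, a minimal sketch of that strip / re-apply layer, using plain structs (all names here are hypothetical, not an actual Flow API):

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
struct TriggerConfig {
    url: String,
    method: String,
    auth: BTreeMap<String, String>, // secret: true
    payload_template: String,
}

/// Remove the secret fields, returning them so they can be held aside
/// while the rest of the document goes through encrypt / decrypt.
fn strip_secrets(cfg: &mut TriggerConfig) -> BTreeMap<String, String> {
    std::mem::take(&mut cfg.auth)
}

/// Re-apply the held-aside secrets to the post-transformed model.
fn reapply_secrets(cfg: &mut TriggerConfig, auth: BTreeMap<String, String>) {
    cfg.auth = auth;
}

fn main() {
    let mut cfg = TriggerConfig {
        url: "https://example.com/hook".into(),
        method: "POST".into(),
        auth: BTreeMap::from([("Authorization".into(), "Bearer tok".into())]),
        payload_template: "{}".into(),
    };
    let held = strip_secrets(&mut cfg);
    assert!(cfg.auth.is_empty()); // now safe to submit for encrypt / decrypt
    reapply_secrets(&mut cfg, held);
    assert_eq!(cfg.auth.len(), 1); // secrets restored on the transformed model
}
```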
Two other notes before I forget:
External Triggers for Materializations
This feature adds webhook triggers to materializations. When a materialization transaction commits, configured webhooks fire with metadata about the transaction. This lets users integrate external systems (Airflow, dbt, Slack, etc.) that need to react when new data lands.
Ref: Internal PRD document
This discussion covers the design choices I'd like input on.
Templating
Trigger payloads are Handlebars templates rendered against a set of variables. Handlebars is already used for email templates in the `notifications` crate, and it supports iteration (needed for `collection_names`). Templates run in strict mode (unknown variables are errors) with no HTML escaping (payloads are JSON, not HTML).
At publish time, validation trial-renders every template with placeholder values and checks that the output is valid JSON.
Trigger variables
These are the variables available in templates:
- `materialization_name` (string): `task.shard_ref.name`, the full materialization name
- `collection_names` (string[])
- `connector_image` (string): e.g. `ghcr.io/estuary/materialize-postgres:v1`
- `flow_published_at_max` (string, RFC 3339): the maximum `_meta/uuid` clock across all documents in the transaction
- `flow_published_at_min` (string, RFC 3339): the minimum `_meta/uuid` clock across all documents in the transaction (new)
- `flow_run_id` (string)

Variables from the PRD that are not planned to be supported initially:

- `transaction_started_at` and `transaction_completed_at`: using min/max `flow_published_at` instead
- `updated_documents`: not practical to list every updated document's key
- `deployment_env`: unsure of what this refers to

Example config
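As a sketch, a trigger configuration inside a materialization spec might look like the following (the exact placement and field names, including the `triggers` key itself, are assumptions rather than a final schema):

```yaml
materializations:
  acmeCo/reports/materialize-postgres:
    endpoint:
      connector:
        image: ghcr.io/estuary/materialize-postgres:v1
        config: endpoint.sops.yaml
    # Hypothetical placement of the trigger config:
    triggers:
      config:
        - url: https://hooks.example.com/flow
          method: POST
          auth:
            Authorization: Bearer my-token # secret; sops-encrypted
          payload_template: >-
            {"materialization": "{{materialization_name}}",
             "published_at_max": "{{flow_published_at_max}}"}
```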
Trigger config encryption
Trigger configs contain auth credentials (bearer tokens, API keys, passwords). These need to be SOPS-encrypted, like connector endpoint configs.
Connector endpoint configs are handled transparently as raw bytes by the control plane and runtime, decrypting them when needed to run connectors. This is a bit awkward for trigger configurations, because their models are defined in Flow, rather than a connector, and the control plane and runtime need to deserialize the configurations into these typed models for validation and execution.
The solution I've come up with is to mostly handle them as raw bytes, so they can be encrypted the same way as connector configs, and to deserialize them only when needed. Full decryption & deserialization is needed for executing the triggers while a materialization is running; for validation, only the trigger payload needs to be extracted.
So `MaterializationDef.triggers` is typed as `Option<RawValue>` rather than `Option<Triggers>`. The built spec also carries the trigger config as opaque bytes.
In the publish path (via `flowctl` in `draft/encrypt.rs`, or a future UI workflow), triggers are encrypted similarly to endpoint configs, using the same config encryption API. This requires two calls to the encryption service when triggers are configured: one for the endpoint config, and an additional one for the trigger config.

Getting the JSON schema to the UI
The UI will need the trigger config schema to render a form editor. This could be exposed through `flow-web`, with a simple helper function. I also tossed around the idea of a new column / table in Supabase to hold this config for queries, or somehow injecting the config into every materialization's endpoint config (likely impractical), but that did not seem like an improvement over the `flow-web` function.

At-least-once delivery
Triggers must fire at least once per committed transaction. They may fire more than once, and are likely to do so from time to time. The mechanism:

1. `TriggerVariables` are computed from the transaction's stats.
2. In `StartCommit`, trigger variables are serialized to RocksDB under a new `"trigger-params"` key, atomically in the same `WriteBatch` as the runtime checkpoint.
3. After the `Acknowledged` response, the runtime reads `"trigger-params"` from RocksDB, renders templates, and delivers webhooks concurrently with retries.
4. `"trigger-params"` is deleted from RocksDB. This update may not be durably committed to the recovery log until the next transaction's Load phase is complete, similar to how connector state updates from `Acknowledged` are made durable.

If the shard crashes after step 2 but before step 4, the trigger variables survive in RocksDB. On recovery, the next `Acknowledged` will re-read and re-fire the webhooks.

Permanently failing trigger webhooks will crash the shard and block the materialization. The `maxRetries` and `timeoutSecs` config give users control over how aggressively to retry before crashing. Persisting just the `TriggerVariables` rather than the entire webhook payload allows for correcting things like an invalidated API key, rather than dooming the task to send a payload which will never be valid again.
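The steps above can be sketched without dependencies, using an in-memory map as a stand-in for RocksDB (names are illustrative; in the real runtime the checkpoint and trigger variables share one atomic `WriteBatch`):

```rust
use std::collections::BTreeMap;

/// Stand-in for the shard's RocksDB store (illustrative only).
type Store = BTreeMap<String, String>;

const TRIGGER_PARAMS: &str = "trigger-params";

/// StartCommit: persist the checkpoint and trigger variables together.
fn start_commit(store: &mut Store, checkpoint: &str, trigger_vars: &str) {
    store.insert("checkpoint".to_string(), checkpoint.to_string());
    store.insert(TRIGGER_PARAMS.to_string(), trigger_vars.to_string());
}

/// Acknowledged: read the persisted variables, deliver webhooks, then clear.
/// If the shard crashed before this ran, the key survives and delivery is
/// retried on recovery, giving at-least-once semantics.
fn on_acknowledged(store: &mut Store, mut deliver: impl FnMut(&str)) {
    if let Some(vars) = store.get(TRIGGER_PARAMS).cloned() {
        deliver(&vars); // render templates & fire webhooks with retries
        store.remove(TRIGGER_PARAMS);
    }
}

fn main() {
    let mut store = Store::new();
    start_commit(&mut store, "ckpt-1", r#"{"materialization_name":"acme/m"}"#);

    // If the shard crashed here, the key would still be present on
    // recovery, so the next Acknowledged re-fires the webhooks.
    assert!(store.contains_key(TRIGGER_PARAMS));

    let mut fired = Vec::new();
    on_acknowledged(&mut store, |vars| fired.push(vars.to_string()));
    assert_eq!(fired.len(), 1);
    assert!(!store.contains_key(TRIGGER_PARAMS));
}
```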