Skip to content

feat(workflows): emit workflows engagement events from plugin-server#60212

Draft
dmarchuk wants to merge 11 commits into
masterfrom
feat/messaging-posthog-events-plugin-server
Draft

feat(workflows): emit workflows engagement events from plugin-server#60212
dmarchuk wants to merge 11 commits into
masterfrom
feat/messaging-posthog-events-plugin-server

Conversation

@dmarchuk

@dmarchuk dmarchuk commented May 27, 2026

Copy link
Copy Markdown
Contributor

Problem

Follow-up to #50122 (which ships the Django migration, model, API serializer, frontend UI, and IDOR registration). The Migration & Service Separation Check blocks PRs that contain both Django migrations and nodejs/ changes, so the plugin-server side is split out here. #50122 must merge first — it creates workflows_teamworkflowsconfig; until then the config queries would 42P01.

Relationship to #60229 (please read before merging)

This branch currently also carries its own copy of the email tracking-code HMAC signing + X-PostHog-Tracking-Code header carrier, which is the subject of the dedicated PR #60229. The two copies are independent (not shared commits), and this branch's copy predates #60229's hardening — the raw-delivery rejection and the SES-tag assertion tightening are not here.

Intended resolution: merge #60229 first, then rebase this branch and drop the duplicated signing so this PR only adds the engagement events + distinct_id threading on top. Treat the signing code here as belonging to #60229's review, not this one.

Changes (this PR's contribution)

  • TeamWorkflowsConfigServiceLazyLoader-backed reader over workflows_teamworkflowsconfig. Default-deny when no row exists, so teams that haven't opted in are unaffected.
  • EmailService — emits $workflows_email_sent / $workflows_email_failed (onto result.capturedPostHogEvents) when the team has capture_workflows_engagement_events = true and a distinct_id resolves. Fires for both the SES and maildev send paths.
  • EmailTrackingService — queues $workflows_email_opened, _link_clicked, _delivered, _bounced, _blocked, _failed from SES webhooks, backed by an explicit allowlist (METRIC_NAME_TO_EVENT_NAME) so unmapped internal metrics can't leak into customer event streams.
  • resolveEmailEngagementDistinctId — falls back event.distinct_idperson.id, so batch / scheduled workflows (empty triggering distinct_id) still produce attributable events.
  • SES webhook timestamps — threaded through so events reflect when the action happened, not when AWS delivered the notification.
  • Dev/test pixel + redirect handlers — emit metrics locally (maildev has no webhooks) so the pipeline can be exercised end-to-end.

Shared with #60229, to be deduped: the distinct_id tracking-code segment, the header carrier, and the HMAC signing itself.

Events emitted

Internal metric PostHog event
email_sent $workflows_email_sent
email_failed $workflows_email_failed
email_delivered $workflows_email_delivered
email_opened $workflows_email_opened
email_link_clicked $workflows_email_link_clicked
email_bounced $workflows_email_bounced
email_blocked $workflows_email_blocked

Properties: $workflow_id on every event, plus $messaging_source (direct or ses) on webhook-sourced events. SES events also carry $email_to; clicks carry $link_url; send-time events carry $email_to and $email_subject.

Note: the events are $workflows_email_* but the source property is still named $messaging_source — worth deciding whether to rename it for consistency.

How did you test this code?

  • pnpm exec tsc --noEmit — clean for this PR's files
  • pnpm jest src/cdp/services/messaging/ --runInBand — passing
  • End-to-end locally (maildev): enabled engagement events via the feat(workflows): add team config + settings UI for messaging engagement events #50122 settings toggle, ran a batch-triggered email workflow, and confirmed a $workflows_email_sent event landed in ClickHouse for the team with $workflow_id / $email_to / $email_subject. Opens/clicks/bounces require the real SES webhook path and were not exercised locally.

Deploy order

  1. feat(workflows): add team config + settings UI for messaging engagement events #50122 first (creates workflows_teamworkflowsconfig) — merged.
  2. feat(workflows): HMAC-sign email tracking codes #60229 (signing) — recommended before this, to avoid shipping the duplicated/un-hardened signing carried here.
  3. This PR.

Publish to changelog?

no — covered by #50122.

🤖 Agent context

  • Claude Code (Opus 4.8), invoked by Daniel Marchuk.
  • This PR is the nodejs/** half of the original combined change, split to satisfy the migration-separation check. The messagingworkflows rename (event names and the capture_workflows_engagement_events config field) was applied across both halves.
  • resolveEmailEngagementDistinctId was added because the original silently dropped engagement events for any workflow whose triggering event had no distinct_id (scheduled, segment, batch); falling back to person.id keeps batch-targeted emails attributable.
  • The METRIC_NAME_TO_EVENT_NAME allowlist is intentionally explicit (no automatic fallback) so the public event surface is curated, not a mirror of MinimalAppMetric['metric_name'].
  • Known follow-up: the HMAC signing here overlaps feat(workflows): HMAC-sign email tracking codes #60229 and should be deduped once feat(workflows): HMAC-sign email tracking codes #60229 lands (see "Relationship to feat(workflows): HMAC-sign email tracking codes #60229").

🤖 Generated with Claude Code

Reads the workflows_teamworkflowsconfig table to decide whether to
capture $messaging_email_* events. Depends on PR #50122 landing
first so the table exists.
@greptile-apps

greptile-apps Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Comments Outside Diff (1)

  1. nodejs/src/cdp/services/messaging/email-tracking.service.test.ts, line 365-384 (link)

    P2 Non-parameterised tests for resolveEmailEngagementDistinctId — the three cases differ only in the input globals and the expected value, which is exactly the shape that it.each() is designed for. As a new public helper this is also a good place to demonstrate the pattern for future contributors.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: nodejs/src/cdp/services/messaging/email-tracking.service.test.ts
    Line: 365-384
    
    Comment:
    Non-parameterised tests for `resolveEmailEngagementDistinctId` — the three cases differ only in the input globals and the expected value, which is exactly the shape that `it.each()` is designed for. As a new public helper this is also a good place to demonstrate the pattern for future contributors.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
nodejs/src/cdp/services/messaging/email.service.ts:138-142
The `$messaging_source` property is missing from these events. The PR description explicitly states "Properties on every event: `$workflow_id`, `$messaging_source`", and every other event path (all events flowing through `trackMetric`) does include it. `email_sent`/`email_failed` are always emitted synchronously by the service itself, so the source is always `'direct'`.

```suggestion
                properties: {
                    $workflow_id: invocation.functionId,
                    $messaging_source: 'direct',
                    $email_to: params.to.email,
                    $email_subject: params.subject,
                },
```

### Issue 2 of 3
nodejs/src/cdp/services/messaging/email-tracking.service.ts:62
Using `||` here will silently skip a truthy-looking but falsy-in-JS `distinct_id` such as `"0"` and fall through to `person.id`, misattributing the event. An explicit empty-string check avoids the ambiguity.

```suggestion
    const distinctId = invocation.state?.globals?.event?.distinct_id
    return (distinctId !== '' && distinctId != null ? distinctId : invocation.state?.globals?.person?.id) || undefined
```

### Issue 3 of 3
nodejs/src/cdp/services/messaging/email-tracking.service.test.ts:365-384
Non-parameterised tests for `resolveEmailEngagementDistinctId` — the three cases differ only in the input globals and the expected value, which is exactly the shape that `it.each()` is designed for. As a new public helper this is also a good place to demonstrate the pattern for future contributors.

Reviews (1): Last reviewed commit: "feat(workflows): emit messaging engageme..." | Re-trigger Greptile

Comment on lines +138 to +142
properties: {
$workflow_id: invocation.functionId,
$email_to: params.to.email,
$email_subject: params.subject,
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The $messaging_source property is missing from these events. The PR description explicitly states "Properties on every event: $workflow_id, $messaging_source", and every other event path (all events flowing through trackMetric) does include it. email_sent/email_failed are always emitted synchronously by the service itself, so the source is always 'direct'.

Suggested change
properties: {
$workflow_id: invocation.functionId,
$email_to: params.to.email,
$email_subject: params.subject,
},
properties: {
$workflow_id: invocation.functionId,
$messaging_source: 'direct',
$email_to: params.to.email,
$email_subject: params.subject,
},
Prompt To Fix With AI
This is a comment left during a code review.
Path: nodejs/src/cdp/services/messaging/email.service.ts
Line: 138-142

Comment:
The `$messaging_source` property is missing from these events. The PR description explicitly states "Properties on every event: `$workflow_id`, `$messaging_source`", and every other event path (all events flowing through `trackMetric`) does include it. `email_sent`/`email_failed` are always emitted synchronously by the service itself, so the source is always `'direct'`.

```suggestion
                properties: {
                    $workflow_id: invocation.functionId,
                    $messaging_source: 'direct',
                    $email_to: params.to.email,
                    $email_subject: params.subject,
                },
```

How can I resolve this? If you propose a fix, please make it concise.

export const resolveEmailEngagementDistinctId = (
invocation: Pick<CyclotronJobInvocationHogFunction, 'state'>
): string | undefined => {
return invocation.state?.globals?.event?.distinct_id || invocation.state?.globals?.person?.id || undefined

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Using || here will silently skip a truthy-looking but falsy-in-JS distinct_id such as "0" and fall through to person.id, misattributing the event. An explicit empty-string check avoids the ambiguity.

Suggested change
return invocation.state?.globals?.event?.distinct_id || invocation.state?.globals?.person?.id || undefined
const distinctId = invocation.state?.globals?.event?.distinct_id
return (distinctId !== '' && distinctId != null ? distinctId : invocation.state?.globals?.person?.id) || undefined
Prompt To Fix With AI
This is a comment left during a code review.
Path: nodejs/src/cdp/services/messaging/email-tracking.service.ts
Line: 62

Comment:
Using `||` here will silently skip a truthy-looking but falsy-in-JS `distinct_id` such as `"0"` and fall through to `person.id`, misattributing the event. An explicit empty-string check avoids the ambiguity.

```suggestion
    const distinctId = invocation.state?.globals?.event?.distinct_id
    return (distinctId !== '' && distinctId != null ? distinctId : invocation.state?.globals?.person?.id) || undefined
```

How can I resolve this? If you propose a fix, please make it concise.


const eventName = METRIC_NAME_TO_EVENT_NAME[metricName]
if (eventName && distinctId && (await this.teamWorkflowsConfigService.shouldCaptureEngagementEvents(teamId))) {
await this.capturedEventsService.queueEvent({

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Unauthenticated SES webhook injection: raw-mode bypass + new event capture path

The SES webhook handler verifies SNS signatures only when the request body contains an SNS envelope (checks 'envelope' in parsed). Posting a raw JSON array to the public /public/m/ses_webhook endpoint instead is parsed as mode: 'raw' — the signature check is skipped entirely.

Before this PR the bypass had minimal impact (only inflated metric counters). This PR adds a concrete consequence: the new capturedEventsService.queueEvent call on this line fires for every processed SES metric. An attacker who has a valid tracking code (extractable from any marketing email sent through a PostHog workflow — the ph_id tag encodes functionId in plain base64) can inject arbitrary $messaging_email_link_clicked / $messaging_email_opened / etc. events into a team's PostHog analytics with fully attacker-controlled distinct_id, $link_url, $email_to, and timestamp. The only gate is the team having capture_engagement_events=true, but opt-in teams are fully exposed.

Exploit path: POST /public/m/ses_webhook with body [{"eventType":"Click","mail":{"timestamp":"...","source":"x","messageId":"x","destination":["victim@corp.com"],"tags":{"ph_id":["<stolen_ph_id>"]}},"click":{"link":"https://evil.com","timestamp":"2020-01-01T00:00:00Z"}}] — no authentication required.

Prompt To Fix With AI
In `nodejs/src/cdp/services/messaging/helpers/ses.ts`, the `handleWebhook` method must also reject or verify raw-mode (array) requests that arrive without a verified SNS signature. Two options:

Option A — refuse raw mode from unauthenticated callers: In `parseIncomingBody`, treat any non-envelope body as suspect and return a marker flag. In `handleWebhook`, when `verifySignature` is true and mode is `'raw'`, return `{ status: 403, body: { error: 'Raw SES delivery not accepted; use SNS envelope mode' } }`.

Option B — add a shared secret: Add a configurable `SES_WEBHOOK_SECRET` env var. Require callers to include it as a query parameter or `X-Webhook-Secret` header. Check it in `handleSesWebhook` before delegating to `sesWebhookHandler.handleWebhook`. Apply the same check regardless of envelope vs raw mode.

Option A is simpler and eliminates the raw-mode attack surface entirely if PostHog's SNS subscriptions always use envelope delivery. If raw delivery is intentionally supported, use Option B.

Severity: medium | Confidence: 78%

)

const eventName = METRIC_NAME_TO_EVENT_NAME[metricName]
if (eventName && distinctId && (await this.teamWorkflowsConfigService.shouldCaptureEngagementEvents(teamId))) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium: Forged engagement events

/public/m/ses_webhook accepts raw SES records without SNS signature verification, and this new block turns those records into captured events. An attacker who has or can construct a ph_id for a workflow can POST a raw Open/Click/Bounce record and insert $messaging_email_* events for an arbitrary distinctId in that team; only emit captured events after a verified SNS envelope, or add integrity protection to the tracking code and reject unsigned raw delivery on the public endpoint.

@veria-ai

veria-ai Bot commented May 27, 2026

Copy link
Copy Markdown

PR overview

The PR has one open security issue: the public SES webhook path can accept unsigned/raw engagement records and convert them into messaging engagement events. If an attacker can obtain or construct a valid tracking identifier, they could forge open/click/bounce activity for a chosen user within a team, which affects event integrity rather than directly exposing data or bypassing authentication. No issues have been addressed yet, so the security posture still depends on adding SNS signature verification or equivalent integrity checks before emitting these events.

Open issues (1)

Fixed/addressed: 0 · PR risk: 6/10

dmarchuk added a commit that referenced this pull request May 27, 2026
Append a 16-byte HMAC-SHA256 signature (keyed by ENCRYPTION_SALT_KEYS) to the
ph_id tracking code embedded in workflow email pixels, redirect links, and SES
EmailTags. Format becomes <base64-payload>.<base64-signature>; the '.' separator
disambiguates signed and unsigned codes (base64url alphabet does not include '.',
so legacy codes are never mistaken for signed).

Backwards-compatible by design: codes without a separator parse as legacy
unsigned so in-flight emails at deploy time still resolve. The new
email_tracking_code_format_total{format,source} counter is the rollout signal —
once format=unsigned decays to zero on both source=ses and source=tracking it is
safe to follow up with a parser change that rejects unsigned codes outright.

ENCRYPTION_SALT_KEYS is read comma-separated: codes are signed with the first
key and verified against any, so rotation is a config-only operation. If no
keys are configured the generator emits unsigned codes — same behaviour as
today, so dev environments keep working without setup.

Closes the forgeable-ph_id surface that #60212 would otherwise amplify: with
signed codes an attacker can no longer mint a tracking identifier with arbitrary
distinct_id, teamId, or link_url. Replay of legitimately-issued codes is still
possible until the unsigned path is closed, but the blast radius drops from
'inject arbitrary engagement events for any team' to 'refire events for an
email that was actually sent.'
dmarchuk added a commit that referenced this pull request May 28, 2026
Append a 16-byte HMAC-SHA256 signature (keyed by ENCRYPTION_SALT_KEYS) to the
ph_id tracking code embedded in workflow email pixels, redirect links, and SES
EmailTags. Format becomes <base64-payload>.<base64-signature>; the '.' separator
disambiguates signed and unsigned codes (base64url alphabet does not include '.',
so legacy codes are never mistaken for signed).

Backwards-compatible by design: codes without a separator parse as legacy
unsigned so in-flight emails at deploy time still resolve. The new
email_tracking_code_format_total{format,source} counter is the rollout signal —
once format=unsigned decays to zero on both source=ses and source=tracking it is
safe to follow up with a parser change that rejects unsigned codes outright.

ENCRYPTION_SALT_KEYS is read comma-separated: codes are signed with the first
key and verified against any, so rotation is a config-only operation. If no
keys are configured the generator emits unsigned codes — same behaviour as
today, so dev environments keep working without setup.

Closes the forgeable-ph_id surface that #60212 would otherwise amplify: with
signed codes an attacker can no longer mint a tracking identifier with arbitrary
distinct_id, teamId, or link_url. Replay of legitimately-issued codes is still
possible until the unsigned path is closed, but the blast radius drops from
'inject arbitrary engagement events for any team' to 'refire events for an
email that was actually sent.'
dmarchuk added a commit that referenced this pull request May 28, 2026
Append a 16-byte HMAC-SHA256 signature (keyed by ENCRYPTION_SALT_KEYS) to the
ph_id tracking code embedded in workflow email pixels, redirect links, and SES
EmailTags. Format becomes <base64-payload>.<base64-signature>; the '.' separator
disambiguates signed and unsigned codes (base64url alphabet does not include '.',
so legacy codes are never mistaken for signed).

Backwards-compatible by design: codes without a separator parse as legacy
unsigned so in-flight emails at deploy time still resolve. The new
email_tracking_code_format_total{format,source} counter is the rollout signal —
once format=unsigned decays to zero on both source=ses and source=tracking it is
safe to follow up with a parser change that rejects unsigned codes outright.

ENCRYPTION_SALT_KEYS is read comma-separated: codes are signed with the first
key and verified against any, so rotation is a config-only operation. If no
keys are configured the generator emits unsigned codes — same behaviour as
today, so dev environments keep working without setup.

Closes the forgeable-ph_id surface that #60212 would otherwise amplify: with
signed codes an attacker can no longer mint a tracking identifier with arbitrary
distinct_id, teamId, or link_url. Replay of legitimately-issued codes is still
possible until the unsigned path is closed, but the blast radius drops from
'inject arbitrary engagement events for any team' to 'refire events for an
email that was actually sent.'
@dmarchuk dmarchuk marked this pull request as draft May 29, 2026 08:15
dmarchuk added a commit that referenced this pull request Jun 2, 2026
Append a 16-byte HMAC-SHA256 signature (keyed by ENCRYPTION_SALT_KEYS) to the
ph_id tracking code embedded in workflow email pixels, redirect links, and SES
EmailTags. Format becomes <base64-payload>.<base64-signature>; the '.' separator
disambiguates signed and unsigned codes (base64url alphabet does not include '.',
so legacy codes are never mistaken for signed).

Backwards-compatible by design: codes without a separator parse as legacy
unsigned so in-flight emails at deploy time still resolve. The new
email_tracking_code_format_total{format,source} counter is the rollout signal —
once format=unsigned decays to zero on both source=ses and source=tracking it is
safe to follow up with a parser change that rejects unsigned codes outright.

ENCRYPTION_SALT_KEYS is read comma-separated: codes are signed with the first
key and verified against any, so rotation is a config-only operation. If no
keys are configured the generator emits unsigned codes — same behaviour as
today, so dev environments keep working without setup.

Closes the forgeable-ph_id surface that #60212 would otherwise amplify: with
signed codes an attacker can no longer mint a tracking identifier with arbitrary
distinct_id, teamId, or link_url. Replay of legitimately-issued codes is still
possible until the unsigned path is closed, but the blast radius drops from
'inject arbitrary engagement events for any team' to 'refire events for an
email that was actually sent.'
@dmarchuk dmarchuk changed the title feat(workflows): emit messaging engagement events from plugin-server feat(workflows): emit workflows engagement events from plugin-server Jun 4, 2026
dmarchuk added 3 commits June 5, 2026 12:41
…hog-events-plugin-server

# Conflicts:
#	nodejs/src/cdp/services/messaging/email-tracking.service.test.ts
#	nodejs/src/cdp/services/messaging/email-tracking.service.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant