feat(workflows): HMAC-sign email tracking codes#60229
Conversation
f1c4f1f to
a7681ae
Compare
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
nodejs/src/cdp/services/messaging/helpers/tracking-code.ts:49
The `candidate.length === expected.length` guard inside the loop is always true: `expected.length` was just validated to equal `SIGNATURE_BYTES` on line 44, and `candidate` is always `SIGNATURE_BYTES` bytes because `.subarray(0, SIGNATURE_BYTES)` on a 32-byte SHA-256 digest always yields exactly 16 bytes. The superfluous comparison can be removed to keep the condition tighter and self-documenting.
```suggestion
if (crypto.timingSafeEqual(expected, candidate)) {
```
### Issue 2 of 2
nodejs/src/cdp/services/messaging/helpers/tracking-code.ts:80-81
Codes with more than one `.` (e.g., `<payload>.<sig>.<garbage>`) are silently accepted if the first two segments verify, because `split('.')` returns all parts and destructuring silently drops the rest. Both the payload and the signature are base64url (no dots), so a legitimate signed code always has exactly one `.`. Rejecting codes with extra dots would prevent surprising behaviour with malformed inputs.
```suggestion
const parts = encodedTrackingCode.split('.')
if (parts.length > 2) {
return null
}
const [payload, signature] = parts
if (signature !== undefined) {
```
Reviews (1): Last reviewed commit: "feat(workflows): HMAC-sign email trackin..." | Re-trigger Greptile |
PR overviewThis pull request adds HMAC signing for email tracking codes used by workflow messaging, updating the email service’s tracking-code generation and validation behavior. The change is focused on making workflow email tracking data tamper-evident while preserving the existing tracking flow. There is one open security concern remaining: the signed tracking header still contains decodable workflow identifiers, so an email recipient could recover IDs and use them against matching public workflow webhook endpoints. This leaves a concrete attacker action with a limited but real blast radius around exposed workflow invocations. Three earlier issues have already been addressed, so the security posture has improved, but this remaining identifier exposure should still be resolved before considering the PR fully hardened. Open issues (1)
Fixed/addressed: 3 · PR risk: 6/10 |
Append a 16-byte HMAC-SHA256 signature (keyed by ENCRYPTION_SALT_KEYS) to the
ph_id tracking code embedded in workflow email pixels, redirect links, and SES
EmailTags. Format becomes <base64-payload>.<base64-signature>; the '.' separator
disambiguates signed and unsigned codes (base64url alphabet does not include '.',
so legacy codes are never mistaken for signed).
Backwards-compatible by design: codes without a separator parse as legacy
unsigned so in-flight emails at deploy time still resolve. The new
email_tracking_code_format_total{format,source} counter is the rollout signal —
once format=unsigned decays to zero on both source=ses and source=tracking it is
safe to follow up with a parser change that rejects unsigned codes outright.
ENCRYPTION_SALT_KEYS is read comma-separated: codes are signed with the first
key and verified against any, so rotation is a config-only operation. If no
keys are configured the generator emits unsigned codes — same behaviour as
today, so dev environments keep working without setup.
Closes the forgeable-ph_id surface that #60212 would otherwise amplify: with
signed codes an attacker can no longer mint a tracking identifier with arbitrary
distinct_id, teamId, or link_url. Replay of legitimately-issued codes is still
possible until the unsigned path is closed, but the blast radius drops from
'inject arbitrary engagement events for any team' to 'refire events for an
email that was actually sent.'
- drop the always-true `candidate.length === expected.length` guard in verifySignature — the early SIGNATURE_BYTES check upstream already proves both buffers are exactly 16 bytes (per Greptile) - reject tracking codes containing extra `.` segments — a legitimate signed code is structurally `<payload>.<signature>`, both halves are base64url and contain no dots, so any extra dot is malformed input (per Greptile) - new regression test covers the extra-dot rejection
71a5017 to
b959b72
Compare
… as a wire-contract guard
…king-hmac-codes # Conflicts: # nodejs/src/cdp/services/messaging/email.service.ts
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
nodejs/src/cdp/services/messaging/helpers/tracking-code.ts:116-130
The payload-building block (`actionId`, `parentRunId`, `toBase64UrlSafe(...)`) is identical in both `generateEmailTrackingCode` and `generateShortEmailTrackingCode`, violating the "OnceAndOnlyOnce" simplicity rule. `generateEmailTrackingCode` can simply delegate to `generateShortEmailTrackingCode` for the payload and then conditionally append the signature, eliminating the duplication.
```suggestion
// Full tracking code, HMAC-signed when a signing key is configured. Rides in the custom MIME
// header and the pixel/link URLs — carriers with no length cap — and the signature lets the
// public tracking endpoint reject forged `ph_id` values.
export const generateEmailTrackingCode = (invocation: TrackingInvocation): string => {
const payload = generateShortEmailTrackingCode(invocation)
const keys = getSigningKeys()
if (keys.length === 0) {
return payload
}
return `${payload}.${signPayload(payload, keys[0])}`
}
```
Reviews (2): Last reviewed commit: "Merge remote-tracking branch 'origin/mas..." | Re-trigger Greptile |
| function getSigningKeys(): string[] { | ||
| return (defaultConfig.ENCRYPTION_SALT_KEYS || '').split(',').filter(Boolean) | ||
| } |
There was a problem hiding this comment.
I wonder if we overall should convert this to a class (e.g., EmailTrackingCodeSigner) so that it can be injected by cdp-services.ts the same way we do it for RecipientTokensService. So we would be parsing keys once in the constructor and thats it. Also for Tests we could then pass keys directly.
Just saw that the tests mutate defaultConfig.ENCRYPTION_SALT_KEYS directly a with try/finally restore.. thats okay but kinda fragile and kind of a symptom of the coupling problem. But in the end its your call.
There was a problem hiding this comment.
I like it, would definitely make it cleaner 👍 I'll just send that as a follow up PR after this one is merged, I think it will be a bit easier to review and make sure everything works as expected, plus the current state has been already heavily tested
| const keys = getSigningKeys() | ||
| if (keys.length === 0) { | ||
| return payload | ||
| } | ||
| return `${payload}.${signPayload(payload, keys[0])}` | ||
| } |
There was a problem hiding this comment.
If ENCRYPTION_SALT_KEYS is not set in a deployment the code emits unsigned and everything keeps on working.. is that wanted?
I know we use email_tracking_code_format_total but now we need another alert for that and then it still takes a while until we can react and fix the problem.
maybe for now thats oka but after the rollout completes and you flip a feature flag we should escalate to throwing.. might be worth noting that somewhere to do as fup?
There was a problem hiding this comment.
For now this is wanted, I wanted to make sure we just keep sending, watch the metrics and only switch after some time. Good note about documenting the follow up though, I only had a to-do and reminder on Slack for myself, I've added #62624 now too
| // 256-char-bounded the way SES tag values are, so they fit the signed code. The | ||
| // configuration set's event destination needs `IncludeOriginalHeaders: true` for the | ||
| // webhook to surface this header. | ||
| const trackingHeader: MessageHeader = { Name: TRACKING_CODE_HEADER_NAME, Value: trackingCode } |
There was a problem hiding this comment.
Medium: Tracking header exposes workflow identifiers
X-PostHog-Tracking-Code is delivered with the email, and the HMAC only authenticates the base64 payload; it does not hide functionId/teamId/invocationId. A recipient of a text-only or transactional workflow email can decode this header, recover the workflow/function ID, and invoke matching public workflow webhook endpoints such as /public/webhooks/:id; use an opaque or encrypted carrier for recipient-visible headers instead of embedding the raw IDs.
Summary
HMAC-signs the email tracking code (
ph_id) so the public tracking endpoint can reject forged codes, and carries the signed code in a custom MIME header rather than the SESEmailTagsvalue — because a signed code exceeds the 256-char tag limit for batch sends at production scale.No distinct_id, no engagement-events consumer — this PR is purely the signing + carrier infrastructure. The engagement-events consumer that layers distinct_id on top lives in #60212.
Why the header carrier (the core of this PR)
The original approach put the signed code straight into the SES
EmailTagsph_idvalue. That overflows the 256-char tag-value limit for batch sends with production-sized team IDs. Measured from this branch's code (UUID functionId/invocationId, standard 58-charaction_function_email_<uuid>step, batch parentRunId):1(local/dev)281194(6-digit prod)So signing-into-the-tag breaks every batch email send for normal production teams. The fix splits the carrier:
X-PostHog-Tracking-Code+ the pixel/link URLs. Header values aren't 256-bounded, so the signed code fits regardless of team_id (or, later, distinct_id length).functionId:invocationId:teamId:actionId:parentRunId, no signature) → SESEmailTagsph_id. Measured at 235 chars for a 6-digit team — comfortably under 256. Kept purely as a backwards-compat fallback; it doesn't need its own signature because the tag arrives via the SNS webhook, which is already integrity-protected by SNS signing.The webhook parser reads the header first and falls back to the tag (for in-flight messages sent before this deployed, or regions where the configuration set hasn't enabled original headers yet).
What this PR does
generateEmailTrackingCodeappends an HMAC-SHA256 tag (16 bytes / base64url) when a signing key is configured, producing<payload>.<signature>. Keyed byENCRYPTION_SALT_KEYS; verification tries every configured key so the signing key can be rotated without invalidating in-flight codes.email.service.tswrites the signed code toX-PostHog-Tracking-Codeand the short unsigned code to the SES tag.formatdiscriminator.ses.tsreads header-first, tag-fallback.email_tracking_code_format_total{format, source}to watch the unsigned→signed curve and catch any env missing the signing key.Test plan
Tag value cannot exceed 256 characters.email_tracking_code_format_total{format="signed",source="ses"}increments, zerounsigned; confirmed by sending with the tag write disabled and tracking still firing.Operational notes (for deploy)
ENCRYPTION_SALT_KEYSmust be set in every plugin-server deployment — without it, codes emit unsigned (sends/tracking still work, but signing is silently off; theformatmetric surfaces this).IncludeOriginalHeaders: trueso the webhook surfaces the header; until then the webhook falls back to the short tag.Coordination
🤖 Generated with Claude Code