You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The most useful to-dos a reference AI-inbox product surfaces are the ones it pulls
from the user's own words in sent mail — phrasings like "I'll get you that by
Friday" or "I can take care of X." These are self-imposed obligations the user
already committed to in writing, and they are the highest-signal to-dos because the
user authored them. SkyTwin already knows which content the user authored — the AuthoringTier system (#251) flags user_sent_originated and user_sent_reply —
but it classifies by mail headers only and never reads the body for stated
commitments. This is the single highest-leverage gap: the infrastructure to find
authored content exists; what's missing is reading it.
Current State
Verified 2026-06-06.
packages/connectors/src/authoring-tier.ts:17-23 — AuthoringTier union with six
tiers; user_sent_originated and user_sent_reply mark user-authored mail.
packages/connectors/src/authoring-tier.ts:25-40 — EmailAuthoringInputs is
header/label-shaped only: labels, fromAddress, toAddresses, ccAddresses, hasInReplyTo, hasListUnsubscribe, listId. No body field — the classifier
never sees message text.
packages/connectors/src/gmail-connector.ts — stamps the tier on each signal at
ingest.
packages/decision-engine/src/situation-interpreter.ts:73-84 — deriveProvenance
reads authoringTier off the event to set ActionProvenance. So the tier already
flows downstream; nothing consumes the body of authored mail for commitments.
There is no commitment / obligation / promise extractor anywhere in the repo
(searched decision-engine, twin-model, connectors).
Proposed Change
Add a commitment extractor that runs over the body of user_sent_originated and user_sent_reply signals, identifies first-person future-tense obligations, and
emits them as actionable to-do candidates with provenance user_originated.
A commitment is a sentence where the author (the user) commits to a future action.
Synthetic examples (illustrative, not real data):
"I'll send over the draft tomorrow." → to-do: Send the draft (deadline:
tomorrow)
"I can reach out to them this week." → to-do: Reach out to them (deadline: this
week)
"Let me pull those numbers and circle back." → to-do: Pull the numbers and
circle back
Non-commitments to exclude: questions ("Can you send it?"), statements about others
("She'll handle it"), past tense ("I sent it"), hypotheticals ("I would if...").
Implementation Details
Prerequisite / start order (flagged in review): this extractor consumes SignalText from spec 07, which does not exist yet. Do NOT start this before spec 07
lands — OR, if you need to start sooner, build it against an email-only input
({ authoringTier, subject, body, sentAt, threadId, recipients }) and refactor to SignalText when 07 lands. Pick one explicitly before coding; don't half-assume SignalText exists.
New modulepackages/decision-engine/src/commitment-extractor.ts, exporting extractCommitments(input: SignalText): Commitment[]. Pure, side-effect-free
(mirrors authoring-tier.ts's testability contract). Input is the normalized SignalText from spec 07 — NOT an email-specific shape. It runs on any authored
channel: sent mail, calendar event descriptions the user wrote, and transcribed
voice notes (a strong source — the user literally says "I'll do X"). See spec 07's
coverage matrix for which sources are in/out.
// SignalText (from spec 07) carries: source, title, body, authoringTier,// authoredByUser, occurredAt, participants — the channel-agnostic accessor.exportinterfaceCommitment{text: string;// normalized imperative ("Send the draft")rawSpan: string;// the source sentence, for explanation/citationdeadlineHint: string|null;// raw phrase ("tomorrow") — resolved by spec 03committedTo: string[];// recipientsconfidence: number;// 0..1}
Gating — only run when SignalText.authoredByUser === true (i.e. authoringTier ∈ {user_sent_originated, user_sent_reply, authored_originated} per
spec 07). Return [] otherwise, AND [] for any source not in spec 07's
commitment row (e.g. filesystem). This is the security boundary: we extract
commitments only from content the user authored, never from inbound/received
content (inbound "you agreed to X" is a poisoning vector — see safety invariant Live notification layer: SSE, approval expiry cron, push alerts #8).
LLM strategy: a versioned policy-prompts template commitment-extraction/v1.md with JSON-schema-validated output (array of Commitment), deterministic fallback to the rule path.
Rule fallback: first-person future-modal regex set (I'll, I will, I can, I'm going to, let me, I'll make sure) anchored to sentence start,
excluding interrogatives and past tense. Lower recall, zero cost, always
available.
Pipeline wiring — commitment-extractor runs in the worker's signal
processing path for authored signals; each Commitment becomes a to-do candidate
carrying ActionProvenance = user_originated and feeds spec 01's To-dos bucket. deadlineHint is handed to spec 03's resolver to populate the deadline field.
Explanation — each emitted to-do produces an ExplanationRecord citing rawSpan and threadId (safety invariant Make SkyTwin usable by a non-technical person end-to-end #2: every action has an explanation;
the evidence here is the user's own sentence).
Dedup — a commitment restated across thread replies must collapse to one
to-do. Key on (threadId, normalized text).
Acceptance Criteria
Given a user_sent_originated body containing two distinct first-person future
commitments, the extractor returns exactly two Commitment objects, each with a
non-empty rawSpan.
Given an inbox_personal (received, not authored) body with identical phrasing,
the extractor returns [] (gating enforced).
Questions, past-tense statements, and third-party statements are not extracted
(≥5 negative fixtures pass).
A commitment restated in a later reply in the same thread collapses to one to-do.
Each emitted to-do has ActionProvenance = user_originated and an ExplanationRecord citing the source sentence.
With no LLM configured, the rule fallback runs and returns deterministic results.
deadlineHint carries the raw temporal phrase when present, null otherwise.
Tests written and passing. No degradation of existing functionality.
Negative cases: questions, past tense, third-party, hypothetical
+5
Unit
Tier gating — non-authored tiers return []
+3
Unit
Thread-restatement dedup
+2
Integration
Authored signal → extractor → to-do candidate with provenance + explanation
+2
Integration
LLM strategy + deterministic fallback parity on a shared fixture set
+2
Rollback Plan
Feature-flag the extractor invocation in the worker (COMMITMENT_EXTRACTION=off).
Disabling it stops emitting commitment to-dos; no schema changes to reverse. The
module is pure and unreferenced when the flag is off.
Commitment extraction from user-authored content
Context
The most useful to-dos a reference AI-inbox product surfaces are the ones it pulls
from the user's own words in sent mail — phrasings like "I'll get you that by
Friday" or "I can take care of X." These are self-imposed obligations the user
already committed to in writing, and they are the highest-signal to-dos because the
user authored them. SkyTwin already knows which content the user authored — the
AuthoringTiersystem (#251) flagsuser_sent_originatedanduser_sent_reply—but it classifies by mail headers only and never reads the body for stated
commitments. This is the single highest-leverage gap: the infrastructure to find
authored content exists; what's missing is reading it.
Current State
Verified 2026-06-06.
packages/connectors/src/authoring-tier.ts:17-23—AuthoringTierunion with sixtiers;
user_sent_originatedanduser_sent_replymark user-authored mail.packages/connectors/src/authoring-tier.ts:25-40—EmailAuthoringInputsisheader/label-shaped only:
labels,fromAddress,toAddresses,ccAddresses,hasInReplyTo,hasListUnsubscribe,listId. No body field — the classifiernever sees message text.
packages/connectors/src/gmail-connector.ts— stamps the tier on each signal atingest.
packages/decision-engine/src/situation-interpreter.ts:73-84—deriveProvenancereads
authoringTieroff the event to setActionProvenance. So the tier alreadyflows downstream; nothing consumes the body of authored mail for commitments.
(searched decision-engine, twin-model, connectors).
Proposed Change
Add a commitment extractor that runs over the body of
user_sent_originatedanduser_sent_replysignals, identifies first-person future-tense obligations, andemits them as actionable to-do candidates with provenance
user_originated.A commitment is a sentence where the author (the user) commits to a future action.
Synthetic examples (illustrative, not real data):
tomorrow)
week)
circle back
Non-commitments to exclude: questions ("Can you send it?"), statements about others
("She'll handle it"), past tense ("I sent it"), hypotheticals ("I would if...").
Implementation Details
Prerequisite / start order (flagged in review): this extractor consumes
SignalTextfrom spec 07, which does not exist yet. Do NOT start this before spec 07lands — OR, if you need to start sooner, build it against an email-only input
(
{ authoringTier, subject, body, sentAt, threadId, recipients }) and refactor toSignalTextwhen 07 lands. Pick one explicitly before coding; don't half-assumeSignalTextexists.packages/decision-engine/src/commitment-extractor.ts, exportingextractCommitments(input: SignalText): Commitment[]. Pure, side-effect-free(mirrors
authoring-tier.ts's testability contract). Input is the normalizedSignalTextfrom spec 07 — NOT an email-specific shape. It runs on any authoredchannel: sent mail, calendar event descriptions the user wrote, and transcribed
voice notes (a strong source — the user literally says "I'll do X"). See spec 07's
coverage matrix for which sources are in/out.
SignalText.authoredByUser === true(i.e.authoringTier ∈ {user_sent_originated, user_sent_reply, authored_originated}perspec 07). Return
[]otherwise, AND[]for any source not in spec 07'scommitment row (e.g. filesystem). This is the security boundary: we extract
commitments only from content the user authored, never from inbound/received
content (inbound "you agreed to X" is a poisoning vector — see safety invariant Live notification layer: SSE, approval expiry cron, push alerts #8).
fallback pattern:
policy-promptstemplatecommitment-extraction/v1.mdwith JSON-schema-validated output (array ofCommitment), deterministic fallback to the rule path.I'll,I will,I can,I'm going to,let me,I'll make sure) anchored to sentence start,excluding interrogatives and past tense. Lower recall, zero cost, always
available.
commitment-extractorruns in the worker's signalprocessing path for authored signals; each
Commitmentbecomes a to-do candidatecarrying
ActionProvenance = user_originatedand feeds spec 01's To-dos bucket.deadlineHintis handed to spec 03's resolver to populate thedeadlinefield.ExplanationRecordcitingrawSpanandthreadId(safety invariant Make SkyTwin usable by a non-technical person end-to-end #2: every action has an explanation;the evidence here is the user's own sentence).
to-do. Key on
(threadId, normalized text).Acceptance Criteria
user_sent_originatedbody containing two distinct first-person futurecommitments, the extractor returns exactly two
Commitmentobjects, each with anon-empty
rawSpan.inbox_personal(received, not authored) body with identical phrasing,the extractor returns
[](gating enforced).(≥5 negative fixtures pass).
ActionProvenance = user_originatedand anExplanationRecordciting the source sentence.deadlineHintcarries the raw temporal phrase when present,nullotherwise.Testing Plan
[]Rollback Plan
Feature-flag the extractor invocation in the worker (
COMMITMENT_EXTRACTION=off).Disabling it stops emitting commitment to-dos; no schema changes to reverse. The
module is pure and unreferenced when the flag is off.
Effort Estimate
Total: ~2 days.
Files Reference
packages/decision-engine/src/commitment-extractor.tspackages/policy-prompts/prompts/commitment-extraction/v1.mdpackages/connectors/src/authoring-tier.tspackages/explanations/*ExplanationRecordcitingrawSpanOut of Scope
deadlineHintto an absolute date — spec 03 owns that.this extractor inherits them for free once a connector lands.
Related
SignalText; multi-source from day one) andAuthoringTier(Memory bootstrap: weight user-sent emails higher than received #251).deadlineHintto spec 03.