Skip to content

Commitment extraction from user-authored content #475

@jayzalowitz

Description

@jayzalowitz

Commitment extraction from user-authored content

Context

The most useful to-dos a reference AI-inbox product surfaces are the ones it pulls
from the user's own words in sent mail — phrasings like "I'll get you that by
Friday" or "I can take care of X." These are self-imposed obligations the user
already committed to in writing, and they are the highest-signal to-dos because the
user authored them. SkyTwin already knows which content the user authored — the
AuthoringTier system (#251) flags user_sent_originated and user_sent_reply
but it classifies by mail headers only and never reads the body for stated
commitments. This is the single highest-leverage gap: the infrastructure to find
authored content exists; what's missing is reading it.

Current State

Verified 2026-06-06.

  • packages/connectors/src/authoring-tier.ts:17-23AuthoringTier union with six
    tiers; user_sent_originated and user_sent_reply mark user-authored mail.
  • packages/connectors/src/authoring-tier.ts:25-40EmailAuthoringInputs is
    header/label-shaped only: labels, fromAddress, toAddresses, ccAddresses,
    hasInReplyTo, hasListUnsubscribe, listId. No body field — the classifier
    never sees message text.
  • packages/connectors/src/gmail-connector.ts — stamps the tier on each signal at
    ingest.
  • packages/decision-engine/src/situation-interpreter.ts:73-84deriveProvenance
    reads authoringTier off the event to set ActionProvenance. So the tier already
    flows downstream; nothing consumes the body of authored mail for commitments.
  • There is no commitment / obligation / promise extractor anywhere in the repo
    (searched decision-engine, twin-model, connectors).

Proposed Change

Add a commitment extractor that runs over the body of user_sent_originated and
user_sent_reply signals, identifies first-person future-tense obligations, and
emits them as actionable to-do candidates with provenance user_originated.

A commitment is a sentence where the author (the user) commits to a future action.
Synthetic examples (illustrative, not real data):

  • "I'll send over the draft tomorrow." → to-do: Send the draft (deadline:
    tomorrow)
  • "I can reach out to them this week." → to-do: Reach out to them (deadline: this
    week)
  • "Let me pull those numbers and circle back." → to-do: Pull the numbers and
    circle back

Non-commitments to exclude: questions ("Can you send it?"), statements about others
("She'll handle it"), past tense ("I sent it"), hypotheticals ("I would if...").

Implementation Details

Prerequisite / start order (flagged in review): this extractor consumes
SignalText from spec 07, which does not exist yet. Do NOT start this before spec 07
lands — OR, if you need to start sooner, build it against an email-only input
({ authoringTier, subject, body, sentAt, threadId, recipients }) and refactor to
SignalText when 07 lands. Pick one explicitly before coding; don't half-assume
SignalText exists.

  1. New module packages/decision-engine/src/commitment-extractor.ts, exporting
    extractCommitments(input: SignalText): Commitment[]. Pure, side-effect-free
    (mirrors authoring-tier.ts's testability contract). Input is the normalized
    SignalText from spec 07 — NOT an email-specific shape.
    It runs on any authored
    channel: sent mail, calendar event descriptions the user wrote, and transcribed
    voice notes (a strong source — the user literally says "I'll do X"). See spec 07's
    coverage matrix for which sources are in/out.
    // SignalText (from spec 07) carries: source, title, body, authoringTier,
    // authoredByUser, occurredAt, participants — the channel-agnostic accessor.
    export interface Commitment {
      text: string;            // normalized imperative ("Send the draft")
      rawSpan: string;         // the source sentence, for explanation/citation
      deadlineHint: string | null;  // raw phrase ("tomorrow") — resolved by spec 03
      committedTo: string[];   // recipients
      confidence: number;      // 0..1
    }
  2. Gating — only run when SignalText.authoredByUser === true (i.e.
    authoringTier ∈ {user_sent_originated, user_sent_reply, authored_originated} per
    spec 07). Return [] otherwise, AND [] for any source not in spec 07's
    commitment row (e.g. filesystem). This is the security boundary: we extract
    commitments only from content the user authored, never from inbound/received
    content (inbound "you agreed to X" is a poisoning vector — see safety invariant Live notification layer: SSE, approval expiry cron, push alerts #8).
  3. Extraction strategy — two-tier, matching the package's LLM-strategy + rule
    fallback pattern:
    • LLM strategy: a versioned policy-prompts template
      commitment-extraction/v1.md with JSON-schema-validated output (array of
      Commitment), deterministic fallback to the rule path.
    • Rule fallback: first-person future-modal regex set (I'll, I will,
      I can, I'm going to, let me, I'll make sure) anchored to sentence start,
      excluding interrogatives and past tense. Lower recall, zero cost, always
      available.
  4. Pipeline wiringcommitment-extractor runs in the worker's signal
    processing path for authored signals; each Commitment becomes a to-do candidate
    carrying ActionProvenance = user_originated and feeds spec 01's To-dos bucket.
    deadlineHint is handed to spec 03's resolver to populate the deadline field.
  5. Explanation — each emitted to-do produces an ExplanationRecord citing
    rawSpan and threadId (safety invariant Make SkyTwin usable by a non-technical person end-to-end #2: every action has an explanation;
    the evidence here is the user's own sentence).
  6. Dedup — a commitment restated across thread replies must collapse to one
    to-do. Key on (threadId, normalized text).

Acceptance Criteria

  1. Given a user_sent_originated body containing two distinct first-person future
    commitments, the extractor returns exactly two Commitment objects, each with a
    non-empty rawSpan.
  2. Given an inbox_personal (received, not authored) body with identical phrasing,
    the extractor returns [] (gating enforced).
  3. Questions, past-tense statements, and third-party statements are not extracted
    (≥5 negative fixtures pass).
  4. A commitment restated in a later reply in the same thread collapses to one to-do.
  5. Each emitted to-do has ActionProvenance = user_originated and an
    ExplanationRecord citing the source sentence.
  6. With no LLM configured, the rule fallback runs and returns deterministic results.
  7. deadlineHint carries the raw temporal phrase when present, null otherwise.
  8. Tests written and passing. No degradation of existing functionality.

Testing Plan

Layer What Count
Unit Positive extraction (modal forms), synthetic bodies +6
Unit Negative cases: questions, past tense, third-party, hypothetical +5
Unit Tier gating — non-authored tiers return [] +3
Unit Thread-restatement dedup +2
Integration Authored signal → extractor → to-do candidate with provenance + explanation +2
Integration LLM strategy + deterministic fallback parity on a shared fixture set +2

Rollback Plan

Feature-flag the extractor invocation in the worker (COMMITMENT_EXTRACTION=off).
Disabling it stops emitting commitment to-dos; no schema changes to reverse. The
module is pure and unreferenced when the flag is off.

Effort Estimate

  • Extractor module + types: ~3h
  • Rule fallback + regex set: ~2h
  • LLM prompt template + schema validation: ~3h
  • Pipeline wiring + explanation records: ~3h
  • Dedup: ~1h
  • Tests: ~4h

Total: ~2 days.

Files Reference

File Change
packages/decision-engine/src/commitment-extractor.ts New: extractor + types
packages/policy-prompts/prompts/commitment-extraction/v1.md New: LLM template + schema
packages/connectors/src/authoring-tier.ts Reference only (tier gating); no change
worker signal-processing path Invoke extractor for authored signals
packages/explanations/* Emit ExplanationRecord citing rawSpan

Out of Scope

  • Resolving deadlineHint to an absolute date — spec 03 owns that.
  • Tracking commitment completion (did the user actually do it). Future work.
  • Channels with no connector yet (chat/MCP). Spec 07's matrix reserves the slot;
    this extractor inherits them for free once a connector lands.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions