Inspiration

I'm a job seeker. I've been on both sides of the cold-outreach death loop: spending an hour hand-researching one person, or blasting templates that a recruiter archives after a 0.1-second glance. The data is brutal: generic cold outreach gets replies at ~3.4%, while evidence-grounded, signal-based outreach gets replies at 15–25%. That is a roughly 4–7× lift, and no tool optimizes for it. Every outreach product on the market optimizes for volume and fakes personalization with token insertion. I wanted the opposite: a system whose headline feature is the message it refuses to send, because the message you don't send protects the ones you do.

What it does

The Relevance Gate is a four-agent crew for job-hunt outreach that won't write a message unless it finds a real, evidence-backed reason to contact someone.

  • Paste your résumé (or any public URL) → an agent distills it into your profile: proof points and a strict relevance rule derived from your career goal.
  • Find targets → live web search surfaces real people publicly hiring or building in your specialty — every lead cited to its source.
  • Run the gate on anyone → RESEARCH builds a sourced dossier (their words, their page, opt-in web search) → INTERSECT computes the genuine overlap between your proven work and their world, evidenced both ways → THE GATE independently scores it 0–100 and decides → WRITE runs only if cleared: ≤75 words, opens on the strongest true overlap, one low-friction ask, every claim traceable.

When there's no real reason, you get a designed refusal: "No message. On purpose." A backend engineer who "thinks AI is neat" scores 0 — generic interest is not a reason, and the system knows it. Every verdict is logged to ClickHouse, so "it refuses" isn't a claim — it's a queryable block-rate.
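As a rough illustration of what a "queryable block-rate" means, the sketch below computes the held fraction from a small in-memory verdict log. The table name, column names, and the `Verdict` record are assumptions made for this example, not the project's actual schema; the three scores are the preflight cases reported later in this write-up.

```python
from dataclasses import dataclass

# Hypothetical ClickHouse query; table and column names are assumed, not the real schema.
BLOCK_RATE_SQL = """
SELECT countIf(decision = 'hold') / count() AS block_rate
FROM gate_verdicts
"""


@dataclass(frozen=True)
class Verdict:
    target: str
    score: int     # 0-100, assigned by the gate
    decision: str  # "send" or "hold"


def block_rate(verdicts: list[Verdict]) -> float:
    """Fraction of logged verdicts that were held; 0.0 for an empty log."""
    if not verdicts:
        return 0.0
    held = sum(1 for v in verdicts if v.decision == "hold")
    return held / len(verdicts)


# The three preflight cases from this write-up, as an in-memory stand-in for the log.
log = [
    Verdict("genuine overlap", 85, "send"),
    Verdict("generic interest", 0, "hold"),
    Verdict("irrelevant target", 15, "hold"),
]
print(f"block rate: {block_rate(log):.2f}")  # prints "block rate: 0.67"
```

The Python function mirrors what the SQL string would compute server-side, so the same number can be checked in a unit test without a database.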

How we built it

  • Claude (Anthropic) is the brain of all four agents. Haiku 4.5 handles the fast structured-judgment stages (research, intersect, and verify, i.e. the gate); Sonnet 4.6 handles the one artifact a human reads aloud: the message. Every call uses schema-enforced structured outputs, so there is no parse roulette.
  • The architectural core: retrieval, judgment, and generation are isolated agents. The gate renders its verdict before the writer exists, so the writer can never talk the gate into sending — and code-level enforcement means the writer physically never runs on a weak overlap, regardless of what any model says.
  • ClickHouse logs every verdict (score, send/hold, overlap strength) and backs a live block-rate analytics endpoint. Render hosts the app. The frontend is one hand-set HTML page: editorial letterpress design, an SSE-streamed stage ledger, and a rotated CLEARED/HELD stamp.
  • Reliability engineering: per-stage timeouts, a fail-closed verifier (no verdict = no send), error handling that keeps raw errors off the screen, and a clearly labeled DEMO_MODE safety net that renders a pre-validated result if the API dies mid-pitch. A preflight.py script verifies the three canonical demo cases in one command.
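The code-level enforcement and fail-closed behavior described in this list could look roughly like the sketch below. It is an illustration under stated assumptions rather than the project's code: `SEND_THRESHOLD`, `STAGE_TIMEOUT_S`, `Outcome`, and the stub stages are hypothetical names and values, and the real pipeline calls model APIs where the stubs return constants.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

SEND_THRESHOLD = 70     # hypothetical cut-off; the real value is not stated
STAGE_TIMEOUT_S = 0.05  # hypothetical per-stage budget, kept tiny for the demo


@dataclass(frozen=True)
class Outcome:
    decision: str          # "send" or "hold"
    score: Optional[int]   # None when the gate produced no verdict
    message: Optional[str]


async def run_gate(
    score_overlap: Callable[[], Awaitable[int]],
    write_message: Callable[[], Awaitable[str]],
) -> Outcome:
    # Fail closed: a timeout or any exception in the gate means "hold".
    try:
        score = await asyncio.wait_for(score_overlap(), STAGE_TIMEOUT_S)
    except Exception:
        return Outcome("hold", None, None)

    # The writer is only reachable past this check, so a weak overlap
    # cannot trigger it regardless of what any model outputs.
    if score < SEND_THRESHOLD:
        return Outcome("hold", score, None)

    try:
        message = await asyncio.wait_for(write_message(), STAGE_TIMEOUT_S)
    except Exception:
        return Outcome("hold", score, None)
    return Outcome("send", score, message)


writer_calls: list[str] = []


async def stub_writer() -> str:
    writer_calls.append("called")
    return "draft message"


async def demo() -> list[Outcome]:
    async def strong() -> int:
        return 85

    async def weak() -> int:
        return 15

    async def stalled() -> int:
        await asyncio.sleep(1)  # exceeds the stage budget, so the gate times out
        return 85

    return [await run_gate(s, stub_writer) for s in (strong, weak, stalled)]


results = asyncio.run(demo())
for r in results:
    print(r.decision, r.score)
```

Only the strong case reaches the writer; the weak score and the stalled gate both resolve to a hold, which is the "no verdict = no send" property in miniature.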

Challenges we ran into

  • The latency wall. Sonnet on every stage was correct but took ~28s — more than double our 12-second budget. We profiled per-stage, moved the structured-judgment stages to Haiku, capped output tokens, and got the verdict landing at ~10s without losing a single preflight case.
  • The citation parser bug. Web search results arrive split across many text blocks (citations fragment the response), so our line-based fact parser silently returned zero facts for Dario Amodei. Joining blocks and splitting on tokens fixed it — 5 sourced facts in 5 seconds.
  • Schema-valid garbage. A JSON object inside a string field is still schema-valid — Haiku occasionally double-encoded the message. We added an unwrap guard, because "the schema passed" isn't the same as "the output is right."
  • LinkedIn's wall. Login-walled sites block server fetches. Instead of faking it, we made the system honest: it tells you to paste the text, and the web-search path surfaces publicly-indexed facts instead.
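The citation parser fix above (join the text blocks first, then split on a delimiter instead of on lines) can be sketched as follows. The block shape is a simplified stand-in for an API response, and `FACT_MARKER` is a hypothetical delimiter chosen for this example; the write-up does not say which tokens the real parser splits on.

```python
FACT_MARKER = "FACT:"  # hypothetical delimiter; the real tokens are not stated


def extract_facts(blocks: list[dict]) -> list[str]:
    """Join fragmented text blocks, then split on the marker rather than on newlines."""
    # Citations can split one fact across several text blocks, so concatenate first.
    text = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    # Everything before the first marker is preamble and is dropped.
    parts = text.split(FACT_MARKER)[1:]
    # Collapse internal whitespace and skip empty fragments.
    return [" ".join(p.split()) for p in parts if p.strip()]


# Illustrative blocks only: each fact is fragmented across two text blocks.
blocks = [
    {"type": "text", "text": "FACT: Leads a team "},
    {"type": "search_result_placeholder"},  # non-text block, ignored
    {"type": "text", "text": "building agent tooling."},
    {"type": "text", "text": "\nFACT: Posted an open "},
    {"type": "text", "text": "PM role last month."},
]
for fact in extract_facts(blocks):
    print(fact)
```

Parsing each block on its own would see four sentence fragments here; joining first recovers the two complete facts.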

Accomplishments that we're proud of

  • An agent that says no, reliably. Preflight passes 3/3 against the live API: a genuine overlap clears at 85, a merely generic target holds at 0, and an irrelevant one holds at 15.
  • The fail-closed property. Every failure mode — timeout, parse error, network death — resolves to not sending. That's the only safe direction for an anti-spam system, and we proved it by killing the pipeline mid-run.
  • The full loop works end to end: résumé → profile → real discovered leads (it found an open Microsoft Agentic Experiences PM role, with source) → research → score → message.
  • A frontend that doesn't look AI-generated, because it wasn't templated — hand-set serif typography, a stage ledger, and a refusal state designed to be the hero moment.

What we learned

  • Independence beats prompting. You can't prompt a single model into being its own honest judge — separating the gate from the writer, and enforcing it in code, is what makes the anti-hallucination property real.
  • Latency budgets are product decisions. Matching model tier to stage (fast models for judgment, the strong model for the human-facing artifact) beat using the best model everywhere.
  • Schema enforcement ≠ correctness — validate the semantics, not just the shape.
  • Reliability engineering (timeouts, fail-closed defaults, seeded fallbacks, a 30-second preflight) is what makes a live demo survivable — it's not polish, it's the feature.
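To make "validate the semantics, not just the shape" concrete, here is a minimal sketch of an unwrap guard for the double-encoded message case. The field name `message` and the function name are assumptions for illustration; the project's actual schema and guard are not shown in this write-up.

```python
import json


def unwrap_message(value: str) -> str:
    """Return plain message text, unwrapping a JSON object hidden inside the string."""
    text = value.strip()
    if not text.startswith("{"):
        return text  # ordinary prose, nothing to unwrap
    try:
        inner = json.loads(text)
    except json.JSONDecodeError:
        return text  # starts with a brace but is not JSON; keep as-is
    # "message" is a hypothetical field name for the writer's output.
    if isinstance(inner, dict) and isinstance(inner.get("message"), str):
        return unwrap_message(inner["message"])  # handle repeated wrapping
    return text


plain = "Hi Sam, your post on eval tooling matched my last project."
wrapped = json.dumps({"message": plain})
print(unwrap_message(wrapped) == plain)  # prints True
```

Both inputs would pass a schema that only requires a string field, which is why the check has to look inside the value. A stricter variant could hold the message instead of returning unrecognized JSON unchanged.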

What's next for The Relevance Gate

  • Deeper enrichment: Composio-powered integrations for richer target research, and Pioneer for agent observability across every gate decision.
  • Send + follow-up loop: draft-to-email handoff, reply tracking, and gate calibration from real reply outcomes. Does a message scored 85 actually get 5× the replies of one scored 60?
  • The real company: today is the wedge — a job seeker's outreach tool. The vision is a relevance layer: an API any outreach product can plug into, where the gate sits between every CRM and every send button. Found many. Contact few. All of them true.

Built With

  • clickhouse
  • guild.ai
  • render