Prospector

Inspiration

Student engineering teams spend huge amounts of time chasing support, sponsors, and technical mentors. Most of that process is manual, scattered across docs, Discord messages, and old sponsor lists. We built Prospector to turn that messy process into a clear workflow: understand a team’s real blockers, then find and explain the best external organizations to contact.

What it does

Prospector is a Discord-first assistant for student engineering design teams.

It can:

Ingest team context from GitHub, websites, Notion, Confluence, and Discord channels
Extract structured team needs (tech stack, focus areas, blockers, support needs)
Rank relevant companies and providers using retrieval and semantic matching
Explain why a specific organization is a strong fit
Generate outreach email drafts and send them through Gmail
Keep team membership and active-team context per user inside a server
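
To make "structured team needs" concrete, here is a minimal sketch of what such a record could look like. The four list fields come from the list above; the `TeamContext` class, the `team_name` field, and the `parse_team_context` helper are hypothetical names chosen for illustration, not Prospector's actual schema.

```python
from dataclasses import dataclass, field

# Field names mirror the writeup; the class itself is an illustrative assumption.
LIST_FIELDS = ("tech_stack", "focus_areas", "blockers", "support_needs")


@dataclass
class TeamContext:
    """Hypothetical structured record for one team's extracted needs."""
    team_name: str
    tech_stack: list[str] = field(default_factory=list)
    focus_areas: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)
    support_needs: list[str] = field(default_factory=list)


def parse_team_context(raw: dict) -> TeamContext:
    """Coerce loosely structured extractor output into a TeamContext."""
    name = str(raw.get("team_name", "")).strip()
    if not name:
        raise ValueError("team_name is required")

    cleaned = {}
    for key in LIST_FIELDS:
        value = raw.get(key) or []
        if isinstance(value, str):
            # Tolerate a bare string where a list was expected.
            value = [value]
        # Strip whitespace, drop empties, and dedupe case-insensitively
        # while keeping the first spelling and the original order.
        seen = {}
        for item in value:
            text = str(item).strip()
            if text:
                seen.setdefault(text.lower(), text)
        cleaned[key] = list(seen.values())
    return TeamContext(team_name=name, **cleaned)
```

Coercing every extraction result through one function like this keeps downstream ranking and Discord embeds from having to guess whether a field is missing, a string, or a list.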

How we built it

We built Prospector as a single Python service that runs a FastAPI app and a Discord bot in the same runtime.

Core pieces:

Discord layer: Slash commands, embeds, buttons, and thread-based chat
Ingestion layer: Source connectors for GitHub, web pages, Notion, Confluence, and Discord
Context extraction: LLM-based synthesis into team-level structured context
Retrieval layer: Ranking APIs with semantic scoring, filtering, and explanation metadata
Scraper pipeline: Entity gathering, page scraping, enrichment, and embedding generation
Data layer: Supabase tables + vector search RPC for semantic retrieval
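
As a rough illustration of the semantic scoring idea in the retrieval layer, the sketch below ranks organizations by cosine similarity between a team embedding and each organization's embedding. It is an in-memory stand-in that assumes small precomputed vectors; in Prospector the equivalent search runs through a Supabase vector search RPC, and the function names and the `name`/`embedding` keys here are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_organizations(query_embedding, organizations, top_k=3):
    """Return the top_k organizations sorted by descending similarity."""
    scored = []
    for org in organizations:
        score = cosine_similarity(query_embedding, org["embedding"])
        scored.append({"name": org["name"], "score": score})
    scored.sort(key=lambda row: row["score"], reverse=True)
    return scored[:top_k]


# Toy two-dimensional embeddings, purely for illustration.
orgs = [
    {"name": "Org A", "embedding": [1.0, 0.0]},
    {"name": "Org B", "embedding": [0.0, 1.0]},
    {"name": "Org C", "embedding": [1.0, 1.0]},
]
top_matches = rank_organizations([1.0, 0.0], orgs, top_k=2)
```

Returning the score alongside the name leaves room for the filtering and explanation metadata that the ranking APIs attach on top.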

We optimized for iterative ingestion by hashing content, so teams can re-index their sources without storing duplicate entries in team memory.
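
One way content hashing can support re-indexing is sketched below: each chunk is hashed after whitespace normalization, and only chunks whose hash has not been stored before are kept. SHA-256, the normalization step, and the names `content_hash` and `select_new_chunks` are assumptions for this sketch, not necessarily what Prospector does.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash a chunk after collapsing whitespace so trivial edits do not change it."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def select_new_chunks(chunks, known_hashes):
    """Return (hash, chunk) pairs for chunks not already stored.

    known_hashes is the set of hashes already in the database (assumed to be
    loaded by the caller). Duplicates within the same batch are skipped too.
    """
    seen = set(known_hashes)
    fresh = []
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest not in seen:
            seen.add(digest)
            fresh.append((digest, chunk))
    return fresh
```

On a re-index, the caller would pass every chunk from the source again and write only the pairs this function returns.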

Challenges we ran into

Keeping a consistent data shape across ingestion, extraction, retrieval, and Discord UX
Handling noisy and inconsistent content from different sources
Making retrieval useful when DB data is sparse or RPC results vary in shape
Balancing model-powered ranking with deterministic fallback logic
Managing long-running pipeline steps while still providing responsive command interactions
Avoiding stale context when users switch active teams in multi-team servers
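
For the sparse-data and varying-shape problem, a small normalization step in front of the ranking code can help. The sketch below assumes the result may arrive as `None`, a bare list of rows, or a dict wrapping the rows under a `data` key, and that rows may use either `name`/`company_name` and `similarity`/`score`. All of these shapes and key names are hypothetical examples, not the actual Supabase response format.

```python
def normalize_matches(result):
    """Coerce a loosely shaped search result into [{"name": str, "score": float}]."""
    if result is None:
        rows = []
    elif isinstance(result, dict):
        # Assumed wrapper shape: {"data": [...]}
        rows = result.get("data") or []
    else:
        rows = list(result)

    normalized = []
    for row in rows:
        if not isinstance(row, dict):
            continue  # skip rows we cannot interpret
        name = row.get("name") or row.get("company_name")
        if not name:
            continue  # a match without a name is not usable downstream
        raw_score = row.get("similarity", row.get("score", 0.0))
        try:
            score = float(raw_score)
        except (TypeError, ValueError):
            score = 0.0  # missing or malformed scores sort last
        normalized.append({"name": str(name), "score": score})
    return normalized
```

With this in place, an empty or malformed response becomes an empty list, which the Discord layer can report as "no matches yet" instead of raising.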

Accomplishments that we're proud of

End-to-end flow from raw team data to ranked support matches in Discord
Multi-source ingestion with incremental updates instead of full rewrites
Match explanations that are actionable, not just a score
Functional outreach loop: find match -> explain fit -> draft email -> send
A modular architecture where each layer can be tested and improved independently

What we learned

Retrieval quality depends as much on clean source normalization as on model choice
Team context extraction needs strict schema discipline to stay usable downstream
Hybrid systems are more resilient: they pair LLM flexibility with a deterministic fallback
Developer velocity improves noticeably when command UX and backend contracts are designed together
Good metadata and observability are critical for debugging ranking behavior
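
The hybrid pattern from the list above can be sketched as a ranker that tries a model-backed function first and drops to a deterministic score when the model fails or returns nothing. The keyword-overlap fallback, the `model_ranker` callable, and the `name`/`description` keys are illustrative assumptions; Prospector's real fallback logic may differ.

```python
def keyword_overlap_score(needs, description):
    """Fraction of needs that appear as whole words in the description."""
    if not needs:
        return 0.0
    words = set(description.lower().split())
    hits = sum(1 for need in needs if need.lower() in words)
    return hits / len(needs)


def rank_with_fallback(needs, organizations, model_ranker=None):
    """Return (ranked_organizations, source), where source is "model" or "fallback"."""
    if model_ranker is not None:
        try:
            ranked = model_ranker(needs, organizations)
            if ranked:
                return ranked, "model"
        except Exception:
            # Any model failure falls through to the deterministic path.
            # A real system would log the exception here.
            pass
    ranked = sorted(
        organizations,
        key=lambda org: (-keyword_overlap_score(needs, org["description"]), org["name"]),
    )
    return ranked, "fallback"
```

Returning the source label alongside the results is one cheap piece of metadata that makes it easier to tell, when debugging, which path produced a given ranking.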

What's next for Prospector

Better ranking quality with richer weighting, reranking, and feedback loops
Background job queue for heavy ingestion and scraping tasks
Stronger evaluation harness with golden datasets and regression tests
Team memory governance features (audit trail, approvals, chunk provenance)
More CRM-like workflow support (contact history, follow-up reminders, campaign tracking)
Deployment hardening for production use across many student teams