Inspiration

Developers spend 80% of their time reading code, yet documentation is almost always outdated or nonexistent. Onboarding onto a new project usually means digging through thousands of lines of code just to find a single entry point. We built DocBot to be the technical GPS for your codebase: it autonomously maps out system architecture in seconds, not days.

What it does

DocBot is an autonomous documentation engine that handles the entire lifecycle, from initial discovery to real-time git integration.

  • Interactive Architecture Maps: Dynamically renders a whiteboard-style graph of your entire system. It's not just a static image; it's an interactive workspace you can actually chat with.
  • Autonomous Explorer Agents: Specialized sub-agents that perform deep AST-based investigation, tracing dependencies and uncovering the "heart" of your logic while you code.
  • Git-Integrated Workflow: Includes docbot status to track documentation lag and docbot update for lightning-fast incremental changes, keeping docs in perfect sync with your git history.
  • Semantic Diffing: Use docbot diff to compare architecture snapshots and see exactly how your system's logical boundaries are evolving over time.
  • Live Pipeline Telemetry: A built-in telemetry server that provides a real-time, high-fidelity view of the agentic exploration process as it happens.
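
To make the Semantic Diffing idea concrete, here is a minimal sketch of comparing two architecture snapshots. It assumes a snapshot can be reduced to a set of components and a set of directed dependency edges; the Snapshot type and diff_snapshots function are hypothetical names for illustration, not DocBot's actual data model or the implementation behind docbot diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical architecture snapshot: components plus directed dependency edges."""
    components: frozenset[str]
    edges: frozenset[tuple[str, str]]  # (source component, target component)


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict[str, list]:
    """Report which components and dependency edges appeared or disappeared."""
    return {
        "added_components": sorted(new.components - old.components),
        "removed_components": sorted(old.components - new.components),
        "added_edges": sorted(new.edges - old.edges),
        "removed_edges": sorted(old.edges - new.edges),
    }


if __name__ == "__main__":
    before = Snapshot(frozenset({"api", "db"}), frozenset({("api", "db")}))
    after = Snapshot(
        frozenset({"api", "db", "cache"}),
        frozenset({("api", "cache"), ("cache", "db")}),
    )
    for change, items in diff_snapshots(before, after).items():
        print(change, items)
```

A diff like this works on logical boundaries rather than text lines, so introducing a cache layer shows up as one new component and a rerouted dependency instead of hundreds of changed lines.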

How we built it

We engineered a high-performance Map-Reduce Pipeline optimized for codebase scale:

  1. Scanner: Multi-language discovery using Tree-sitter (supporting Go, Rust, Swift, and 10+ others) with LLM fallback for universal coverage.
  2. Planner: A cost-aware orchestrator that groups code into logical "scopes" based on dependency density and context limits.
  3. Explorer (The Agents): A recursive agent system powered by Backboard.io and LangGraph. Root agents delegate investigations to specialized sub-agents that pull relevant AST snippets just in time (JIT), only when an investigation needs them.
  4. Reducer/Diff Engine: Merges fragmented insights into a unified architectural index and computes semantic differentials between snapshots.
  5. Frontend: A React-powered dashboard using React Flow for interactive graph traversal and real-time streaming updates.
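
The Planner step above can be illustrated with a small sketch. It assumes each file has a known token count and that dependencies are given as pairs of file names; it then merges dependent files into the same scope as long as the combined size stays within a context budget. The function name plan_scopes, the inputs, and the merge heuristic are all assumptions made for illustration, not DocBot's actual planner, which also weighs dependency density and cost.

```python
def plan_scopes(
    token_counts: dict[str, int],
    deps: list[tuple[str, str]],
    budget: int,
) -> list[list[str]]:
    """Group files into scopes by merging along dependency edges under a token budget."""
    parent = {name: name for name in token_counts}  # union-find forest
    size = dict(token_counts)                       # tokens per scope root

    def find(name: str) -> str:
        # Follow parent links to the scope root, compressing the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for src, dst in deps:
        root_a, root_b = find(src), find(dst)
        # Merge two scopes only if the result still fits in one context window.
        if root_a != root_b and size[root_a] + size[root_b] <= budget:
            parent[root_b] = root_a
            size[root_a] += size[root_b]

    scopes: dict[str, list[str]] = {}
    for name in token_counts:
        scopes.setdefault(find(name), []).append(name)
    return sorted(sorted(files) for files in scopes.values())


if __name__ == "__main__":
    counts = {"a.go": 400, "b.go": 500, "c.go": 300, "d.ts": 200}
    edges = [("a.go", "b.go"), ("b.go", "c.go"), ("c.go", "d.ts")]
    print(plan_scopes(counts, edges, budget=1000))
```

Edges are processed in the order given, so a real planner would likely rank them first (for example by how tightly two files are coupled) before merging.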

Challenges we ran into

  • The Context Window Wall: Repositories are orders of magnitude larger than LLM windows. We solved this with Recursive Delegation, where agents independently decide when a code path is complex enough to require spawning a specialized sub-agent.
  • Language Agnosticism: Mapping architectural "edges" between disparate languages (e.g., a Go backend and a TS frontend) required a unified metadata layer that abstracts away the differences between raw ASTs.
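
Recursive Delegation can be sketched as a simple recursion over a module tree. In DocBot the agents make the delegation decision themselves; this sketch swaps in a plain token threshold for that judgment and replaces the LLM call with a trace line, so the Module type, the explore function, and the window parameter are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical unit of code: its own token count plus nested submodules."""
    name: str
    tokens: int
    children: list["Module"] = field(default_factory=list)


def total_tokens(module: Module) -> int:
    """Tokens in this module and everything nested under it."""
    return module.tokens + sum(total_tokens(child) for child in module.children)


def explore(module: Module, window: int, depth: int = 0) -> list[str]:
    """Return a trace of which modules are read directly and which are delegated."""
    indent = "  " * depth
    # Read directly when the subtree fits the window, or when there is nothing to split.
    if total_tokens(module) <= window or not module.children:
        return [f"{indent}agent reads {module.name} directly"]
    # Otherwise spawn one sub-agent per child and collect their traces.
    trace = [f"{indent}agent delegates {module.name}"]
    for child in module.children:
        trace.extend(explore(child, window, depth + 1))
    return trace


if __name__ == "__main__":
    repo = Module("repo", 100, [
        Module("api", 500),
        Module("core", 300, [Module("parser", 900), Module("index", 400)]),
    ])
    print("\n".join(explore(repo, window=1000)))
```

Each agent only ever holds a subtree that fits its window, which is what keeps the approach workable on repositories far larger than a single context. An oversized leaf with no children would still need chunking, which this sketch leaves out.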

Accomplishments that we're proud of

  • True Agentic Autonomy: Watching "root" agents independently decide to spawn sub-agents to "investigate" complex modules feels like magic.
  • Full Lifecycle Coverage: We didn't just build a scanner; we built a tool that integrates into the developer's daily workflow via git hooks.
  • Zero-Config Deployment: Works instantly on any repository without requiring manual configuration or training.

What we learned

  • In our experience, semantic AST extraction combined with agentic reasoning was far more effective for code understanding than standard RAG.
  • Automation is mandatory: Documentation that isn't auto-generated on commit is documentation that is already dying.

What's next for DocBot

  • CI/CD Integration: Architecture diffs and "impact reports" as PR comments.
  • Cloud Explorer: One-click, browser-based exploration for any public GitHub repository.
  • Self-Healing Documentation: Proactive re-indexing triggered by file-system events to ensure zero-lag accuracy.
