feat: apr mcp — expose aprender as MCP server for agentic tools #863

@noahgift

Motivation

Every agentic coding tool in early 2026 (Claude Code, Cursor, Cline, Aider, Continue) discovers capabilities via the Model Context Protocol. No local-inference framework — Ollama, llama.cpp, Unsloth — ships a first-party MCP server. Shipping apr mcp first occupies that slot and makes aprender the default local-inference backend for every MCP-speaking tool.

Competitive context (2026):

  • Aprender already has: 1.43× Ollama's decode performance, a 57-command CLI, and structured --json output on most commands
  • Missing: the ecosystem hook — no way for agentic tools to invoke aprender programmatically
  • Existing contracts already describe the intent: contracts/mcp-tool-schema-v1.yaml, contracts/pmcp/mcp-protocol-sdk-v1.yaml, contracts/apr-tool-rust-mcp-sdk-v1.yaml. The server itself is unbuilt.

Proposal

New subcommand apr mcp → stdio MCP server exposing 8 Phase-1 tools:

Tool          Maps to                  Purpose
apr.run       apr run                  inference
apr.serve     apr serve                OpenAI API server
apr.qa        apr qa --json            8 quality gates
apr.trace     apr trace                layer-by-layer analysis
apr.tensors   apr tensors --json       tensor inspection
apr.validate  apr validate --quality   integrity
apr.bench     apr bench                perf benchmarks
apr.finetune  apr finetune             LoRA training (streaming)
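
A minimal sketch of how the table above could translate into subprocess wrappers. The base argv lists use only the commands and flags named in the table; the `build_argv` and `call_tool` helpers, the timeout, and the result shape (modelled on an MCP `tools/call` text result) are assumptions for illustration, not the planned implementation.

```python
import subprocess
import sys

# Tool name -> base argv, copied from the table above. Only flags that the
# table itself names (--json, --quality) appear here.
TOOL_ARGV = {
    "apr.run": ["apr", "run"],
    "apr.serve": ["apr", "serve"],
    "apr.qa": ["apr", "qa", "--json"],
    "apr.trace": ["apr", "trace"],
    "apr.tensors": ["apr", "tensors", "--json"],
    "apr.validate": ["apr", "validate", "--quality"],
    "apr.bench": ["apr", "bench"],
    "apr.finetune": ["apr", "finetune"],
}


def build_argv(tool, extra_args=()):
    """Return the argv for one tool call; unknown tools raise KeyError."""
    return TOOL_ARGV[tool] + list(extra_args)


def call_tool(argv, timeout_s=60.0):
    """Run one subprocess per call and wrap stdout as a text result.

    The dict shape is an assumption modelled on MCP tools/call results.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    return {
        "content": [{"type": "text", "text": proc.stdout}],
        "isError": proc.returncode != 0,
    }


if __name__ == "__main__":
    # Stand-in command so the sketch runs without apr installed.
    print(call_tool([sys.executable, "-c", "print('ok')"]))
```

Keeping the mapping in one dictionary means `tools/list` and the dispatcher can share a single source for the 8 tool names.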

Schemas are generated from contracts/apr-cli-commands-v1.yaml at build time; none are hand-maintained.
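
To illustrate the build-time generation step, here is a rough sketch that derives `tools/list` entries from an already-parsed contract. The layout of `CONTRACT` is hypothetical (the real structure of contracts/apr-cli-commands-v1.yaml is not shown in this issue), as are the helper names; the real generator would live in the Rust crate.

```python
import json

# Hypothetical excerpt of a parsed contract; the real layout of
# contracts/apr-cli-commands-v1.yaml is not shown in this issue.
CONTRACT = {
    "commands": {
        "qa": {
            "description": "Run quality gates",
            "args": [
                {"name": "model", "type": "string", "required": True},
            ],
        },
    }
}


def tool_from_command(name, spec):
    """Derive one tools/list entry from one contract command."""
    properties = {a["name"]: {"type": a["type"]} for a in spec["args"]}
    required = sorted(a["name"] for a in spec["args"] if a.get("required"))
    return {
        "name": f"apr.{name}",
        "description": spec["description"],
        "inputSchema": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def generate_tools(contract):
    """Emit tools sorted by command name so the output order is deterministic."""
    return [tool_from_command(n, s) for n, s in sorted(contract["commands"].items())]


def to_canonical_json(tools):
    """Serialize with sorted keys and fixed separators so output is byte-stable."""
    return json.dumps(tools, sort_keys=True, separators=(",", ":"))
```

Deterministic ordering and a fixed serialization are what make a byte-for-byte comparison between two generation runs meaningful.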

New crate: crates/aprender-mcp/ using paiml/rust-mcp-sdk (pmcp).

Full spec: docs/specifications/apr-mcp-server-spec.md

Milestones

  • M1: Skeleton + initialize + tools/list (1 week)
  • M2: 8 tools as subprocess wrappers + schema generation (1 week)
  • M3: Streaming progress + cancellation (1 week)
  • M4: Claude Code dogfood + promote contract to ENFORCED (1 week)

Target: M1–M2 in v0.32.0, M3–M4 in v0.33.0.
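
For M3, cancellation with a partial result might look roughly like the sketch below. It assumes a cooperative cancel flag checked between tokens; the `decode_with_cancel` helper and the token stream are hypothetical, and a subprocess-per-call design would presumably also need to terminate the child process.

```python
import threading


def decode_with_cancel(token_source, cancelled):
    """Collect tokens until the source ends or the cancel flag is set.

    Returns (text, finished); finished is False when cancelled early.
    """
    parts = []
    for token in token_source:
        if cancelled.is_set():
            return "".join(parts), False
        parts.append(token)
    return "".join(parts), True


def demo():
    cancelled = threading.Event()

    def tokens():
        # Hypothetical token stream; the flag is set before the third token,
        # standing in for a notifications/cancelled message arriving.
        for i, tok in enumerate(["a", "b", "c", "d"]):
            if i == 2:
                cancelled.set()
            yield tok

    return decode_with_cancel(tokens(), cancelled)
```

Here `demo()` returns the two tokens collected before the flag was set, together with `False` to mark the result as partial.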

Falsification Gates

Contract apr-mcp-server-v1.yaml (new) defines 8 falsifiable conditions:

  1. initialize response <500ms
  2. tools/list returns exactly 8 tools, each with a valid JSON Schema Draft 7 input schema
  3. apr.run on qwen2.5-0.5b-q4km decodes "2" for prompt "1+1=" within 5s
  4. apr.qa output byte-identical to apr qa --json CLI
  5. Malformed JSON-RPC request (valid JSON, invalid Request object) returns -32600, no crash
  6. notifications/cancelled stops decoding within 30s and returns the partial output
  7. Protocol version mismatch rejected at initialize
  8. Generated schema byte-identical to contract-derived schema
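
Gates 5 and 7 can be sketched as a small request validator. The error codes -32700, -32600, and -32601 come from JSON-RPC 2.0; using -32602 for a version mismatch, the `SUPPORTED_VERSIONS` set, and the trimmed `initialize` result are assumptions made for this example, not what the contract specifies.

```python
import json

PARSE_ERROR = -32700       # JSON-RPC 2.0: input is not valid JSON
INVALID_REQUEST = -32600   # JSON-RPC 2.0: JSON is not a valid Request object
METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0: unknown method
INVALID_PARAMS = -32602    # used here for version mismatch (an assumption)

# Hypothetical: the protocol versions this server accepts.
SUPPORTED_VERSIONS = {"2025-06-18"}


def error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def handle_line(line):
    """Validate one incoming line and return a response dict."""
    try:
        msg = json.loads(line)
    except ValueError:
        return error(None, PARSE_ERROR, "Parse error")
    if (
        not isinstance(msg, dict)
        or msg.get("jsonrpc") != "2.0"
        or not isinstance(msg.get("method"), str)
    ):
        return error(None, INVALID_REQUEST, "Invalid Request")
    req_id = msg.get("id")
    if msg["method"] == "initialize":
        params = msg.get("params")
        requested = params.get("protocolVersion") if isinstance(params, dict) else None
        if not isinstance(requested, str) or requested not in SUPPORTED_VERSIONS:
            return error(req_id, INVALID_PARAMS, "Unsupported protocol version")
        # Trimmed result; a real response also carries capabilities and server info.
        return {"jsonrpc": "2.0", "id": req_id, "result": {"protocolVersion": requested}}
    return error(req_id, METHOD_NOT_FOUND, "Method not found")
```

Returning an error object for every bad input, rather than raising, is what lets the server keep reading the next line on stdin after a malformed message.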

Success Criteria

  • Claude Code / Cursor / Cline configured via .mcp.json can invoke apr.* tools and receive valid responses
  • All 8 falsification gates pass in CI
  • Zero regression on existing apr CLI commands
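
As an illustration of the first criterion, a project-level .mcp.json for a stdio server could be produced as below. The `mcpServers` / `command` / `args` layout follows the convention Claude Code uses for stdio servers; the server name `aprender` and the `mcp_config` helper are assumptions for this sketch.

```python
import json


def mcp_config(command="apr", args=("mcp",)):
    """Build a .mcp.json body that launches a stdio MCP server."""
    return {
        "mcpServers": {
            "aprender": {  # hypothetical server name
                "command": command,
                "args": list(args),
            }
        }
    }


if __name__ == "__main__":
    # Print the config; redirect to .mcp.json in the project root to use it.
    print(json.dumps(mcp_config(), indent=2))
```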

Non-goals (Phase 1)

  • resources/*, prompts/*, sampling protocols
  • Authentication / multi-tenant
  • Windows support (stdio needs separate testing)
  • In-process embedded mode (subprocess per call for v1)

Why this over alternatives

Scored against 7 candidate next tasks (declarative YAML training, GRPO, multimodal, torch.compile analog, MXFP4, FUSION-004 default-on, MCP server):

  • Perf (FUSION-004, MXFP4): closes 5–10% gap. Benchmark win, not adoption win.
  • Training features (GRPO, YAML): expand who CAN use aprender. MCP multiplies the CONTEXTS where aprender is invoked — 10–100× leverage because every agentic coding session becomes a potential invocation.
  • torch.compile / multimodal: too large for single-PR scope.
  • MCP: lowest code cost (wrap existing CLI), highest reach, unique competitive position.

cc @noahgift
