Motivation
Every agentic coding tool in early 2026 (Claude Code, Cursor, Cline, Aider, Continue) discovers capabilities via the Model Context Protocol (MCP). No local-inference framework (Ollama, llama.cpp, Unsloth) ships a first-party MCP server. Shipping `apr mcp` first occupies that slot and positions aprender as the default local-inference backend for every MCP-speaking tool.
Competitive context (2026):
- Aprender already: 1.43× Ollama decode perf, 57-command CLI, structured `--json` output on most commands
- Missing: the ecosystem hook — no way for agentic tools to invoke aprender programmatically
- Existing contracts already describe the intent: `contracts/mcp-tool-schema-v1.yaml`, `contracts/pmcp/mcp-protocol-sdk-v1.yaml`, `contracts/apr-tool-rust-mcp-sdk-v1.yaml`. The server itself is unbuilt.
Proposal
New subcommand `apr mcp` → stdio MCP server exposing 8 Phase-1 tools:
| Tool | Maps to |
|------|---------|
| `apr.run` | `apr run`: inference |
| `apr.serve` | `apr serve`: OpenAI API server |
| `apr.qa` | `apr qa --json`: 8 quality gates |
| `apr.trace` | `apr trace`: layer-by-layer analysis |
| `apr.tensors` | `apr tensors --json`: tensor inspection |
| `apr.validate` | `apr validate --quality`: integrity |
| `apr.bench` | `apr bench`: perf benchmarks |
| `apr.finetune` | `apr finetune`: LoRA training (streaming) |
Schemas generated from `contracts/apr-cli-commands-v1.yaml` at build time; no hand-maintained schemas.
New crate: `crates/aprender-mcp/` using `paiml/rust-mcp-sdk` (pmcp).
Full spec: `docs/specifications/apr-mcp-server-spec.md`
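The tool-to-command mapping above could be dispatched roughly as follows. This is a minimal sketch using only the standard library, not the pmcp SDK; `PHASE1_TOOLS` and `tool_to_argv` are hypothetical names, and the argv prefixes are copied from the "Maps to" column without any per-call arguments.

```rust
/// The eight Phase-1 tool names from the proposal table.
const PHASE1_TOOLS: [&str; 8] = [
    "apr.run",
    "apr.serve",
    "apr.qa",
    "apr.trace",
    "apr.tensors",
    "apr.validate",
    "apr.bench",
    "apr.finetune",
];

/// Translate an MCP tool name into the argv prefix of the matching `apr`
/// subcommand. Returns `None` for any name outside the Phase-1 set.
/// (Hypothetical helper: caller-supplied arguments would be appended later.)
fn tool_to_argv(tool: &str) -> Option<Vec<&'static str>> {
    let argv = match tool {
        "apr.run" => vec!["apr", "run"],
        "apr.serve" => vec!["apr", "serve"],
        "apr.qa" => vec!["apr", "qa", "--json"],
        "apr.trace" => vec!["apr", "trace"],
        "apr.tensors" => vec!["apr", "tensors", "--json"],
        "apr.validate" => vec!["apr", "validate", "--quality"],
        "apr.bench" => vec!["apr", "bench"],
        "apr.finetune" => vec!["apr", "finetune"],
        _ => return None,
    };
    Some(argv)
}
```

For example, `tool_to_argv("apr.qa")` yields `["apr", "qa", "--json"]`, which a subprocess-per-call server could hand to `std::process::Command`. Rejecting unknown names up front keeps the exposed surface limited to the eight listed tools.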
Milestones
- `initialize` + `tools/list` (1 week)
Target: M1–M2 in v0.32.0, M3–M4 in v0.33.0.
Falsification Gates
Contract apr-mcp-server-v1.yaml (new) defines 8 falsifiable conditions:
- `initialize` response <500ms
- `tools/list` returns exactly 8 tools with valid JSON Schema Draft 7
- `apr.run` on qwen2.5-0.5b-q4km decodes "2" for prompt "1+1=" within 5s
- `apr.qa` output byte-identical to `apr qa --json` CLI
- Malformed JSON-RPC returns `-32600`, no crash
- `notifications/cancelled` stops decoding ≤30s, returns partial output
- Protocol version mismatch rejected at `initialize`
- Generated schema byte-identical to contract-derived schema
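The cancellation gate (stop decoding, return partial output) can be illustrated with a cooperative cancel flag. This is a sketch under stated assumptions: `DecodeResult` and `decode_with_cancel` are hypothetical names, the `next_token` closure stands in for a real model step, and an actual subprocess-per-call server would more likely signal or kill the child process than share an in-process flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Outcome of a decode run: the tokens produced so far and whether the
/// run was cut short by a cancellation request.
#[derive(Debug, PartialEq)]
struct DecodeResult {
    tokens: Vec<String>,
    cancelled: bool,
}

/// Hypothetical decode loop. The flag is checked before every step, so a
/// cancellation is honoured within one token and everything decoded up to
/// that point is returned as partial output.
fn decode_with_cancel<F>(max_tokens: usize, cancel: &AtomicBool, mut next_token: F) -> DecodeResult
where
    F: FnMut(usize) -> Option<String>,
{
    let mut tokens = Vec::new();
    for step in 0..max_tokens {
        if cancel.load(Ordering::Relaxed) {
            return DecodeResult { tokens, cancelled: true };
        }
        match next_token(step) {
            Some(token) => tokens.push(token),
            None => break, // model signalled end of sequence
        }
    }
    DecodeResult { tokens, cancelled: false }
}
```

In a server, the handler for `notifications/cancelled` would set the flag from another thread (for instance through an `Arc<AtomicBool>`), and the tool response would carry `tokens` together with an indication that the result is partial.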
Success Criteria
- Claude Code / Cursor / Cline configured via `.mcp.json` can invoke `apr.*` tools and receive valid responses
- All 8 falsification gates pass in CI
- Zero regression on existing `apr` CLI commands
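As an illustration of the first criterion, a project-level `.mcp.json` entry might look like the fragment below. It assumes the `mcpServers` layout that Claude Code reads from `.mcp.json`; the server key `aprender` is a placeholder, and `apr mcp` is the proposed subcommand, which is not built yet. Cursor and Cline may expect an equivalent entry in their own config locations.

```json
{
  "mcpServers": {
    "aprender": {
      "command": "apr",
      "args": ["mcp"]
    }
  }
}
```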
Non-goals (Phase 1)
- `resources/*`, `prompts/*`, sampling protocols
- Authentication / multi-tenant
- Windows support (stdio needs separate testing)
- In-process embedded mode (subprocess per call for v1)
Why this over alternatives
Scored against 7 candidate next tasks (declarative YAML training, GRPO, multimodal, torch.compile analog, MXFP4, FUSION-004 default-on, MCP server):
- Perf (FUSION-004, MXFP4): closes 5–10% gap. Benchmark win, not adoption win.
- Training features (GRPO, YAML): expand who CAN use aprender. MCP multiplies the CONTEXTS where aprender is invoked — 10–100× leverage because every agentic coding session becomes a potential invocation.
- torch.compile / multimodal: too large for single-PR scope.
- MCP: lowest code cost (wrap existing CLI), highest reach, unique competitive position.
cc @noahgift