mcp-repo-onboarding — MCP Server for Static Analysis & Onboarding

Note

Phase 9 (Python-Only Release) is now supported in this server (v0.1.0)!

The analyzer generates Python-focused onboarding. Non-Python repos will see a neutral "Python tooling not detected" message. Node.js/TypeScript commands are not generated yet (deferred to Phase 10).

mcp-repo-onboarding — MCP Server for Static Analysis & Onboarding

A professional MCP server for analyzing local Python repositories and generating practical ONBOARDING.md. It detects environment configuration, dependencies, scripts, and frameworks with deterministic, static analysis (no execution, no network).

✨ Features

📊 Repo Structure Analysis: Analyzes local Python repository layout and categorizes files (docs, configs, dependencies).
🧰 Dependency Detection: Detects Python environment files (requirements.txt, pyproject.toml, setup.py, etc.).
⚙️ Command Detection: Extracts developer and test commands from Makefile, tox.ini, shell scripts, and more.
🔍 Framework Detection: Detects popular Python frameworks (Django, Flask, FastAPI, Streamlit, Gradio) from pyproject.toml and requirements.txt.
📝 ONBOARDING.md Generation: Generates validator-compliant onboarding documents with consistent heading order.
📜 Changelog: Track project history in CHANGELOG.md.
🛡️ Safe & Deterministic: Uses static file I/O only (no subprocess, no network, no command execution).
🔧 Other Tooling Reporting: Reports secondary tooling (Node.js, Docker, Go, Rust) neutrally via evidence files (Phase 9).

📋 Prerequisites

Python 3.11+ installed.
uv (recommended for faster installation).
Gemini CLI (or any LLM runtime that supports MCP servers).

🚀 Installation

1. Install Server Locally

Install the server locally using uv (recommended):

git clone https://github.com/rogermt/mcp-repo-onboarding.git
cd mcp-repo-onboarding
uv sync

2. Run Server

Run the server in standalone mode:

uv run mcp-repo-onboarding

3. Install as Gemini CLI Extension (Recommended)

This server is designed to be installed as an extension in Gemini CLI.

# From GitHub
gemini extensions install https://github.com/rogermt/mcp-repo-onboarding

# From local path
gemini extensions install /path/to/mcp-repo-onboarding

4. Activate

Restart the Gemini CLI. The following MCP tools will be available:

/analyze_repo — Analyze local repository and return structured JSON.
/write_onboarding — Write (or overwrite) ONBOARDING.md to the repo root.
/read_onboarding — Read existing ONBOARDING.md content.
/get_run_and_test_commands — Get simplified run/test command suggestions.

💡 Usage

The server provides MCP tools for analyzing repositories and managing onboarding documents.

🎯 Specific Tools (Recommended)

Analyze Repository:

# Analyze current directory (default)
/analyze_repo

# Analyze specific path
/analyze_repo path/to/repo

Generate ONBOARDING.md:

# Write to repo root (default, uses blueprint)
/write_onboarding

# Append to existing file
/write_onboarding mode="append"

# Overwrite existing file
/write_onboarding mode="overwrite"

Read ONBOARDING.md:

# Read from repo root
/read_onboarding

# Read specific path
/read_onboarding path/to/ONBOARDING.md

Get Run & Test Commands:

# Get suggestions
/get_run_and_test_commands

🔧 Advanced Options (Environment Variables)

You can customize behavior by setting environment variables:

MCP_REPO_ONBOARDING_MAX_FILES (default: 5000)
- Maximum number of files to scan in the repo.
MCP_REPO_ONBOARDING_PYTHON_DETECTED (calculated)
- Indicates whether Python was detected (used internally for scope messaging).

🎨 Advanced Analysis Features

Framework Detection

The analyzer detects Python frameworks from pyproject.toml and requirements.txt.

Supported Frameworks:

Django (Classifiers + Poetry deps)
Wagtail (Classifiers)
Flask (Poetry deps)
FastAPI (Poetry deps)
Streamlit (Requirements deps)
Gradio (Requirements deps)

Note: Framework detection is evidence-only and does not infer commands.

Notebook Detection & Capping

The analyzer detects Jupyter notebook directories.

Truncation: The list of notebook directories in ONBOARDING.md is capped to 20 entries to prevent massive lists (e.g., in monorepos).
Note: If truncated, a message like "* notebooks list truncated to 20 entries (total=215)" is added to Analyzer notes.

"Other Tooling Detected" (Phase 9)

Secondary tooling (e.g., Node.js, Docker, Go) is reported neutrally in the Other tooling detected section.

Evidence Files: Lists explicit files (e.g., .nvmrc, package.json).
No Commands: No npm, yarn, pnpm, or docker commands are suggested.
Sorting & Truncation: Evidence lists are sorted alphabetically and capped to 3 files with a deterministic note.

🏗️ Architecture

Analysis Pipeline

File Scanning: Scans repo tree (respects .gitignore, safety ignores).
Categorization: Sorts files into docs, configs, deps, scripts.
Prioritization: Ranks files by importance (root, standards, keywords).
Extraction:
- Scripts: Makefile, tox.ini, shell scripts.
- Frameworks: pyproject.toml, requirements.txt.
- Other Tooling: .nvmrc, package.json, lockfiles.
Bluepring Generation: Compiles sections from a registry (onboarding_blueprint_engine).
Validation: Output is checked against validator rules (V1-V3).

Registry Pattern

The blueprint engine uses a registry pattern (Strategy Pattern) for detectors.

Detectors: PyprojectClassifierDetector, PoetryDependencyDetector, RequirementsDetector.
Extensible: Add new detectors by registering them in DETECTORS.

📁 File Management

Input/Output Locations

Repo Root: The target repository path (default: .).
ONBOARDING.md: Written to the root of the target repository.
Analysis Cache: No persistent cache; analysis is performed on demand.

Safety Ignores

The scanner ignores specific paths to prevent noise and ensure determinism:

tests/fixtures/
test/fixtures/
.git/, .venv/, __pycache__/
node_modules/, site-packages/

🐛 Troubleshooting

Common Issues

"No Python detected" message appears:
- Cause: The repo has no explicit Python evidence files (pyproject.toml, setup.py, requirements.txt) or .py files.
- Fix: This is expected for non-Python repos in Phase 9. The onboarding will still be generated but will state Python-only scope.
"No explicit commands detected" in sections:
- Cause: No Makefile, tox.ini, or shell scripts found with recognizable targets.
- Fix: Add a Makefile or tox.ini with standard targets (test, install, lint).
"No description provided by analyzer" for scripts:
- Cause: Script has no safe header comment.
- Fix: Add a safe header comment (e.g., # Run the test suite) to the top of the script.
ONBOARDING.md validation fails:
- Cause: Generated document does not match validator rules (missing required headings, wrong bullets).
- Fix: This should not happen with Phase 9. If it does, run scripts/validate_onboarding.py locally to debug.

Debug Mode

The server logs detailed warnings to the console:

# Run with debug output
uv run mcp-repo-onboarding

📊 Technical Details

MCP Server Protocol

Protocol: JSON-RPC over stdio
SDK: mcp[cli]>=1.25.0
Tools: analyze_repo, write_onboarding, read_onboarding, get_run_and_test_commands

Analysis Core

Language: Python 3.11+
Dependencies: packaging, pathspec, pydantic, tomllib
File I/O: Standard library (pathlib, os)

Schema (`RepoAnalysis`)

repoPath: Repository root path.
python: Python env info (versions, package managers, deps).
scripts: Grouped scripts (dev, test, lint, format, install).
frameworks: List of detected frameworks.
configurationFiles: Config files (Makefile, workflows, etc.).
docs: Documentation files.
notebooks: Notebook directories.
otherTooling: Secondary tooling (Node.js, Docker, etc.).
notes: Analyzer notes (truncations, etc.).

Blueprint Engine

Module: onboarding_blueprint_engine
Registry: registry.py (list of SectionSpec).
Render Mode: Verbatim (bullet list format).

🤝 Contributing

Fork the repository.
Create a feature branch.
Make your changes (ensure tests pass).
Run uv run pytest and uv run mypy.
Submit a pull request.

📄 Legal

License: Apache License 2.0
Security: Security Policy

Name		Name	Last commit message	Last commit date
Latest commit History 136 Commits
.gemini		.gemini
.github		.github
benchmarks		benchmarks
docs		docs
notes		notes
scripts		scripts
src/mcp_repo_onboarding		src/mcp_repo_onboarding
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
GEMINI.md		GEMINI.md
LICENSE		LICENSE
ONBOARDING.md		ONBOARDING.md
README.md		README.md
RELEASE_0_2_0.md		RELEASE_0_2_0.md
coverage.xml		coverage.xml
error.log		error.log
gemini-extension.json		gemini-extension.json
pyproject.toml		pyproject.toml
uv.lock		uv.lock
validation_results.log		validation_results.log

License

rogermt/mcp-repo-onboarding

Folders and files

Latest commit

History

Repository files navigation

mcp-repo-onboarding — MCP Server for Static Analysis & Onboarding

✨ Features

📋 Prerequisites

🚀 Installation

1. Install Server Locally

2. Run Server

3. Install as Gemini CLI Extension (Recommended)

4. Activate

💡 Usage

🎯 Specific Tools (Recommended)

🔧 Advanced Options (Environment Variables)

🎨 Advanced Analysis Features

Framework Detection

Notebook Detection & Capping

"Other Tooling Detected" (Phase 9)

🏗️ Architecture

Analysis Pipeline

Registry Pattern

📁 File Management

Input/Output Locations

Safety Ignores

🐛 Troubleshooting

Common Issues

Debug Mode

📊 Technical Details

MCP Server Protocol

Analysis Core

Schema (RepoAnalysis)

Blueprint Engine

🤝 Contributing

📄 Legal

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Contributors 4

Uh oh!

Languages

Schema (`RepoAnalysis`)

Packages