AI-powered research mentor that guides you through your research journey, helping you grow your own skills and judgment. Features a conversational office for open-ended mentoring, a Question Workshop with 10 approaches for developing research questions, and a guided sharing flow for making your research visible. Integrates with academic databases, Retraction Watch, and web search. Runs locally with a browser-based interface and three LLM backends (Claude, local vLLM, or any OpenAI-compatible API).
Developed at Authentic Research Partners by Sergey Samsonau and Olga Vine.
Research is how we advance human knowledge — but getting started is surprisingly hard.
Middle and high school students are curious and capable, yet most have no access to research mentorship. Science fairs scratch the surface, but there's no one to guide a student through forming a real hypothesis, designing a rigorous study, or navigating the iterative mess that actual research is.
College students face a catch-22: labs want prior research experience, but you can't get experience without getting into a lab. Many talented students never break through this gate.
Amateur scientists, citizen scientists, and professionals sometimes want to explore a research question on the side — for intellectual fulfillment, to contribute to a field they care about, or just for fun. But without institutional structure, they don't know where to start or how to stay rigorous.
Personal Research Mentor is an AI research mentor that fills this gap. It doesn't do your research for you — it teaches you how to do research. It asks questions, challenges your thinking, helps you design experiments, and adapts to where you are in the process. Think of it as a patient, always-available advisor who meets you at your level.
Runs locally as a single Python process with SQLite for storage. Three LLM backends: Claude Code CLI (default — flat-fee subscription, no per-call API costs, conversations processed by Anthropic), a local vLLM server (fully offline, nothing leaves your machine, requires GPU), or a remote API (bring your own key).
Free for individual use. Not designed for commercial use. If you want a similar solution for your organization, contact Authentic Research Partners.
- Adaptive research mentoring — 21-node LangGraph agent that guides students through the research process
- Question Workshop — 9 pipelines + Hypothesis chat for developing research questions
- Research sharing — guided publication flow
- Semantic memory — remembers context across sessions within each project
- Assessment tracking — project-level and student-level progress evaluation
- Multiple teaching personas — different mentoring styles
- Content safety filtering — input and output safety review
- Artifact processing — upload papers, documents, data files, photographs, scientific images, charts, and handwritten notes
- Data analysis — automatic statistical analysis of uploaded data, adapted to student level, with pedagogical display of results
- Vision — optional image understanding for describing photographs, scientific images, charts, and handwriting
- Three LLM backends — Claude Code CLI, local vLLM, remote API
- Browser-based interface built with React — conversational office, question workshop, research sharing, project management, progress tracking, usage monitoring, and settings
- vLLM container management — start, stop, restart, status, health checks
- API usage budget tracking
- Tool usage tracking with real-time activity reporting
- CLI — serve, init, doctor, reset, export, migrate, vllm-container
- Academic search — OpenAlex, Semantic Scholar, PubMed, arXiv, Europe PMC
- Researcher and institution discovery — OpenAlex, ORCID, ROR
- Paper fetching — introduction extraction from 8 open-access sources
- Retraction Watch — local semantic search over retracted papers
- Web search — Brave Search or Tavily
- Open PowerShell and run:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" - Close and reopen PowerShell, then:
uv tool install research-mentor --python 3.13
curl -LsSf https://astral.sh/uv/install.sh | shuv tool install research-mentor --python 3.13
See INSTALLATION.md for detailed step-by-step instructions, including uninstall.
research-mentor doctor # check system dependencies
research-mentor # start the server (creates database on first run)To use local vLLM instead of Claude Code CLI (requires Podman + NVIDIA GPU):
research-mentor vllm-container start # start vLLM in a Podman container
research-mentor serve --backend vllm # start the app servervLLM runs in a container (Podman or Docker) — it is not a Python dependency of this package. The container handles all vLLM + CUDA dependencies. See docs/usage.md for details.
Once the server is running:
# Health check
curl localhost:8080/api/health
# Talk to the mentor
curl -X POST localhost:8080/api/office \
-H "Content-Type: application/json" \
-d '{"message": "How do I design a water filtration experiment?"}'
# Interactive API docs
open http://localhost:8080/api/docsDefault config is bundled. Override in ~/.research-mentor/config.toml:
[llm]
backend = "vllm"
[llm.vllm]
# Switch models by changing active_model (profiles defined in bundled config):
# "gemma3-12b-fp8" — Gemma3-12B, non-thinking (default)
# "qwen3-8b-fp8" — Qwen3-8B, thinking (<think> tags)
active_model = "gemma3-12b-fp8"All options can also be set via CLI flags (research-mentor vllm-server --help) or environment variables (RESEARCH_MENTOR_BACKEND=vllm).
See docs/configuration.md for the full configuration reference.
Research Mentor runs as a local server on your machine (localhost only). There is no cloud backend, no user accounts, and no telemetry.
- Your conversations — stored in a local SQLite database (
~/.research-mentor/) - Uploaded files — stored locally under
~/.research-mentor/artifacts/ - PDF content — sent only to a local GROBID container for text extraction, never to external services
- With the
vllmbackend — all LLM inference happens locally on your GPU. Nothing leaves your machine
| Data | Where | Why |
|---|---|---|
| Paper titles, DOIs, author names | OpenAlex, Semantic Scholar, PubMed, Europe PMC | Literature search |
| Search queries | Brave Search or Tavily (if configured) | Web search tool |
| Researcher names | ORCID | Researcher lookup |
| Your email (if configured) | OpenAlex, Unpaywall | Polite API access (higher rate limits) |
| API keys | Respective services (over HTTPS) | Authentication |
All external requests use HTTPS (encrypted in transit), but the service providers can see your queries. If you search for papers about a sensitive topic, the API providers (OpenAlex, Semantic Scholar, PubMed, Brave, etc.) will see that query and the DOIs/titles you look up. This is the same as using these services directly in a browser — but worth knowing if your research topic is sensitive.
With the claude_cli backend (default), your entire conversation — every message you send and every response the mentor generates — is processed by Anthropic via the Claude Code CLI. This includes your research topic, hypotheses, experimental designs, and any personal details you share. Subject to Anthropic's privacy policy.
With the api backend, the same conversation content is sent to whichever API endpoint you configure (e.g., OpenAI, a hosted model provider). The provider can see everything discussed in the session.
With the vllm backend pointing to a local GPU, LLM inference runs entirely on your machine — no conversation content leaves it. This is the most private option. If you point the vLLM backend at a remote server (e.g., a shared GPU server in your organization), conversation content is sent to that server instead.
In all cases, search tool queries (see above) are still sent to external academic APIs regardless of which LLM backend you use.
The LLM never executes code — its output is either validated through structured JSON parsing or rendered as plain text.
However, the LLM does formulate search queries that are sent to external APIs. When you ask about a research topic, the system searches for relevant papers and web resources on your behalf. The search queries are derived from your conversation — for example, if you're researching water filtration, the system sends queries like "water filtration efficiency" to OpenAlex or Brave Search.
This means your research topic and related terms are sent to external search services as part of normal operation. No system secrets (API keys, configuration, file paths) are ever included in the LLM's context or in search queries. All tool calls are rate-limited and logged.
Every outgoing search query is checked by a two-layer safety reviewer before it leaves your machine:
- Pattern-based check (instant) — blocks queries containing emails, phone numbers, file paths, or credentials
- AI-based check — catches subtler leaks like a student's name combined with their school
Both layers are on by default. If a query is blocked, the search is skipped and the mentor continues without those results. Power-mode users can adjust these in Settings.
- API keys are stored in individual files with owner-only permissions (
chmod 600) - SQL injection protection — parameterized queries with column allowlists
- File upload security — filename sanitization, file type whitelist, size limits
- XML parsing — uses
defusedxmlto prevent entity expansion attacks - Download size caps — PDF downloads (100 MB) and data imports (500 MB decompressed) are capped to prevent memory exhaustion
- Prompt injection protection — uploaded documents and user inputs are wrapped in content boundary delimiters before being included in LLM prompts, preventing embedded text from being interpreted as instructions
- Security headers — Content-Security-Policy, X-Frame-Options, X-Content-Type-Options on all responses
- Automated security scanning — every commit is checked with bandit (static analysis), pip-audit (known CVEs), and pip-licenses (license compliance)
Research Mentor binds to localhost and has no authentication. Do not expose it to untrusted networks without adding your own authentication layer.
This is an opinionated project. We don't accept unsolicited pull requests — please open an issue or reach out before contributing. See CONTRIBUTING.md for details.
This is an AI-powered educational tool. Like all AI systems, it can produce inaccurate or misleading information. It provides research guidance but does not replace your own critical thinking and judgment. Always verify AI-generated content with authoritative sources. If you are a minor, please involve a parent or other trusted adult when anything seems questionable, unclear, or unsafe.
PolyForm Noncommercial 1.0.0 — free for personal and noncommercial use. Provided as-is, with no warranty or liability. Not designed for commercial use. If you want a similar solution for your organization, contact Authentic Research Partners.