mateffy/research.actor

research.actor

Every new agent session wastes tokens re-reading your codebase.
With research.actor you run a full research agent once per git commit, cache its analysis, and return it instantly to other agents. This saves time and tokens, and the cached analysis doubles as a baseline for deeper, targeted research.

Website | Documentation



$ npm install -g research.actor
$ cd /path/to/your/codebase
$ research analyze
{
  "analysis": "Express API with 12 routes...",
  "fromCache": true,
  "gitHash": "abc123"
}
Or ask targeted questions:
research ask "explain the auth flow" --harness opencode

Install the skill to teach your AI agents about research:

# Print the skill content (default)
research skill

# Install to detected agents in current repo
research skill --install

# Install for specific agents (comma-separated)
research skill --install --harness claude,opencode

# Install globally (requires explicit --harness)
research skill --install --global --harness claude

# Uninstall/remove the skill
research skill --uninstall

Works with Claude, OpenCode, Codex, Aider, and Gemini — the skill teaches agents when and how to use research effectively.




Why?

Every new agent session wastes tokens re-exploring your entire codebase to understand it. research breaks this bottleneck:

first call on a commit  →  full analysis  →  cached to disk
subsequent calls        →  instant cache hit
with working changes    →  cache hit + fast diff pass on top

Installation

Global CLI (recommended)

npm install -g research
# or
bun add -g research

SDK only

npm install @research-agent/core
# or
bun add @research-agent/core

CLI only

npm install -g @research-agent/cli
# or
bun add -g @research-agent/cli

How it works


Cache key

Each analysis is uniquely identified by a combination of:

  • Git commit hash — Each commit gets its own cache entry
  • System prompt hash — Different analysis focuses create separate cache entries
  • Project key — Derived from the repository path

Cache files are stored in ~/.cache/research/<project-key>/ — outside your repository so agents don't accidentally read them. The XDG_CACHE_HOME environment variable is respected.
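As an illustration only (the real derivation is internal to research and may differ), a three-part key with those components could be assembled like this — the field names mirror the documented CacheKey interface, but the hashing scheme here is an assumption:

```typescript
import { createHash } from "node:crypto"

// Hypothetical sketch of composing the cache key described above.
// Field names follow the documented CacheKey interface; the hashing
// scheme is an assumption, not research's actual code.
interface CacheKey {
  projectKey: string
  gitHash: string
  systemPromptHash?: string
}

const shortHash = (s: string) =>
  createHash("sha256").update(s).digest("hex").slice(0, 16)

function makeCacheKey(repoRoot: string, gitHash: string, systemPrompt?: string): CacheKey {
  return {
    projectKey: shortHash(repoRoot), // stable per repository path
    gitHash,                         // one entry per commit
    // different analysis focuses produce different keys
    systemPromptHash: systemPrompt ? shortHash(systemPrompt) : undefined,
  }
}
```

Because every component is deterministic, repeated calls on the same commit with the same focus land on the same entry, while a new commit or a changed system prompt naturally produces a fresh one.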

Two-phase analysis

Phase 1 — Base Analysis (cached)

When you run research analyze on a clean git state, it:

  1. Checks if a cached analysis exists for the current commit + system prompt
  2. If cache hit: Returns instantly
  3. If cache miss: Invokes your AI harness to explore the codebase, then caches the result

Phase 2 — Working Changes (never cached)

If you have uncommitted changes, research runs a lightweight second pass that:

  1. Provides the cached base analysis to the agent
  2. Asks the agent to discover and describe working changes organically
  3. Returns the combined context

This means targeted questions (--prompt) always get fresh answers about your current work, while the expensive base analysis is reused.
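The two phases can be sketched as a small decision function — a simplification, not the library's internals; the two callbacks stand in for the harness invocations:

```typescript
// Simplified sketch of the two-phase flow described above.
// `runFull` and `runDiff` stand in for the harness invocations;
// this is not research's real implementation.
async function twoPhaseAnalyze(
  cachedBase: string | null,
  hasWorkingChanges: boolean,
  runFull: () => Promise<string>,
  runDiff: (base: string) => Promise<string>,
): Promise<{ analysis: string; fromCache: boolean }> {
  // Phase 1 — base analysis: serve from cache, or run the full pass.
  const base = cachedBase ?? (await runFull())
  const fromCache = cachedBase !== null
  if (!hasWorkingChanges) return { analysis: base, fromCache }

  // Phase 2 — working changes: always a fresh pass layered on the base.
  const diff = await runDiff(base)
  return { analysis: `${base}\n\n${diff}`, fromCache }
}
```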

Branch switching

Switch branches or commits and the correct cache entry loads automatically based on git hash. Each branch/commit combination maintains its own cached analysis.


CLI usage

# Basic — auto-detects installed harness, uses persistent cache
research analyze

# Targeted question about the current working diff (never cached)
research analyze --prompt "what auth changes are in progress?"

# Customize the analysis focus — stored as a separate cache entry
research analyze --system-prompt "focus on the API layer and data models"

# Specify harness and model
research analyze --harness claude --model claude-opus-4

# Force a fresh analysis even if a cache entry exists
research analyze --force

# Only use cached entry if younger than a given duration
research analyze --max-age 2h
research analyze --max-age 30m
research analyze --max-age 7d

# JSON output — useful when consuming from another script or agent tool
research analyze --json

# List harnesses detected on this system
research analyze --list-harnesses

# Remove all cached analyses for the current repository
research clear

research analyze [flags] is the explicit command for running analysis.

research ask — Ask specific questions

Ask a targeted question about the codebase. Uses cached analysis as context if available, otherwise the agent will analyze on-demand:

# Ask a question about the codebase
research ask "explain the authentication flow"

# Ask with specific harness
research ask "which files handle user sessions?" --harness claude

# JSON output for programmatic use
research ask "what's the database schema?" --json

Key differences from research analyze --prompt:

  • research ask focuses on answering your specific question (not generating a general analysis)
  • Uses cached base analysis as context if available
  • Never caches the answer itself
  • Agent is instructed to be concise and answer the question directly

Flags

Flag Short Description
--harness <name> -H Harness to use. One of: opencode, claude, codex, aider, gemini. Auto-detected if omitted.
--model <name> -m Model to pass to the harness (e.g. claude-opus-4, gpt-4o).
--system-prompt <text> -s Appended to the base analysis prompt. Different values produce separate cache entries.
--prompt <text> -p Passed only to the working-changes agent. Never cached.
--force -f Bypass cache and re-run the full analysis.
--max-age <duration> -a Maximum cache age. Accepts ms, s, m, h, d — e.g. 30m, 2h, 7d.
--json -j Emit a JSON object instead of plain text.
--list-harnesses -l Print detected harnesses and exit.
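The --max-age durations are number-plus-unit strings. A minimal parser for that documented format (a sketch — the CLI's actual parsing code may differ) looks like:

```typescript
// Parse durations like "30m", "2h", "7d" into milliseconds.
// Sketch of the documented --max-age format, not the CLI's actual code.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
}

function parseDuration(input: string): number {
  // "ms" is tried before "m" so "500ms" is not read as 500 minutes.
  const match = /^(\d+(?:\.\d+)?)(ms|s|m|h|d)$/.exec(input.trim())
  if (!match) throw new Error(`invalid duration: ${input}`)
  return Number(match[1]) * UNIT_MS[match[2]]
}
```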

Supported harnesses

Name Binary
OpenCode opencode
Claude Code claude
OpenAI Codex codex
Aider aider
Gemini CLI gemini

Cache location

The CLI stores cache files in ~/.cache/research/<project-key>/ — outside the repository, so agents do not accidentally read them. The XDG_CACHE_HOME environment variable is respected.

research skill — Manage the agent skill

Print, install, or uninstall the research skill to teach your AI agents how to use research effectively:

# Print the SKILL.md content (default)
research skill

# Pipe to a file
research skill > my-skill.md

# Install to detected agents in current repo
research skill --install

# Install for specific agents (comma-separated)
research skill --install --harness claude,opencode,codex

# Install globally (system-wide) — requires explicit --harness
research skill --install --global --harness claude

# Uninstall/remove the skill
research skill --uninstall

# Uninstall from specific agents
research skill --uninstall --harness claude,opencode

# Uninstall globally
research skill --uninstall --global --harness claude

Where skills are installed:

Harness Local (project) Global (system)
Claude Code .claude/skills/research.actor/SKILL.md ~/.claude/CLAUDE.md
OpenCode .opencode/skills/research.actor/SKILL.md ~/.config/opencode/skills/research.actor/SKILL.md
Codex .agents/skills/research.actor/SKILL.md ~/.codex/agents/skills/research.actor/SKILL.md
Aider Appends to CONVENTIONS.md ~/.aider/conventions/research.md
Gemini .gemini/skills/research.actor/SKILL.md ~/.gemini/skills/research.actor/SKILL.md

SDK usage

research exports the full SDK. Everything is available from the top-level import.

Basic

The simplest case. Uses a fresh MemoryStore (no disk I/O) and auto-detects the first available harness.

import { analyze } from "research.actor"

const result = await analyze()

console.log(result.analysis)   // the full codebase analysis text
console.log(result.fromCache)  // true if served from cache
console.log(result.gitHash)    // the commit the analysis is keyed to
console.log(result.runner)     // name of the runner that produced it

With a focused prompt layered on top:

const result = await analyze({
  prompt: "what does the authentication flow look like?",
})

Persistent cache with FsStore

The MemoryStore default does not survive across process restarts. For persistent caching — the same behaviour as the CLI — pass an FsStore:

import { analyze, FsStore } from "research.actor"

const result = await analyze({
  store: new FsStore(),
})

FsStore defaults to ~/.cache/research/. Pass a custom directory if needed:

const store = new FsStore("/var/cache/myapp/research")
const result = await analyze({ store })

For long-lived processes (e.g. a server), create the store once and reuse it across calls so repeated calls within the same process also benefit from the in-memory lookup before touching disk:

const store = new FsStore()

// first call — may hit disk or run the harness
await analyze({ store })

// second call in same process — hits the in-memory layer first
await analyze({ store })

Custom cache store

Implement CacheStore to persist analyses anywhere — a database, Redis, S3, etc.

import { analyze } from "research.actor"
import type { CacheStore, CacheKey, AnalysisCache } from "research.actor"

// `db` is assumed to be a node-postgres-style client; a unique index on
// (project, hash, prompt_hash) is assumed for the upsert.
class PostgresStore implements CacheStore {
  async get(key: CacheKey): Promise<AnalysisCache | null> {
    const { rows } = await db.query(
      // IS NOT DISTINCT FROM matches NULL prompt hashes, which "=" would not
      "SELECT data FROM analyses WHERE project=$1 AND hash=$2 AND prompt_hash IS NOT DISTINCT FROM $3",
      [key.projectKey, key.gitHash, key.systemPromptHash ?? null],
    )
    return rows[0]?.data ?? null
  }

  async set(key: CacheKey, entry: AnalysisCache): Promise<void> {
    await db.query(
      "INSERT INTO analyses (project, hash, prompt_hash, data) VALUES ($1,$2,$3,$4) ON CONFLICT (project, hash, prompt_hash) DO UPDATE SET data=EXCLUDED.data",
      [key.projectKey, key.gitHash, key.systemPromptHash ?? null, entry],
    )
  }

  async delete(key: CacheKey): Promise<void> {
    await db.query(
      "DELETE FROM analyses WHERE project=$1 AND hash=$2 AND prompt_hash IS NOT DISTINCT FROM $3",
      [key.projectKey, key.gitHash, key.systemPromptHash ?? null],
    )
  }
}

const result = await analyze({ store: new PostgresStore() })

Custom runner (in-process agent)

By default research spawns a subprocess harness. Implement HarnessRunner to use any agent instead — an in-process library, a remote API call, a local model, or a test mock.

import { analyze, FsStore } from "research.actor"
import type { HarnessRunner, RunRequest, RunResult } from "research.actor"

// Example: in-process agent (e.g. pi, or your own)
class MyAgentRunner implements HarnessRunner {
  readonly name = "my-agent"

  async run(req: RunRequest): Promise<RunResult> {
    const output = await myAgent.query(req.prompt, {
      workingDir: req.cwd,
      model: req.model,
    })
    return { output }
  }
}

const result = await analyze({
  runner: new MyAgentRunner(),
  store: new FsStore(),
})

When runner is provided, harness auto-detection is skipped entirely. The model option is still forwarded to the runner via RunRequest.model if you want to use it.

For testing, a mock runner removes all I/O:

class MockRunner implements HarnessRunner {
  readonly name = "mock"
  async run(_req: RunRequest): Promise<RunResult> {
    return { output: "src/ contains the main app. index.ts is the entry point." }
  }
}

const result = await analyze({ runner: new MockRunner() })
// result.fromCache === false, result.runner === "mock"

You can also wrap the built-in SubprocessRunner to intercept or modify behaviour:

import { analyze, FsStore, SubprocessRunner, resolveHarness } from "research.actor"
import type { HarnessRunner, RunRequest, RunResult } from "research.actor"

class LoggingRunner implements HarnessRunner {
  private readonly inner: SubprocessRunner

  constructor(inner: SubprocessRunner) {
    this.inner = inner
  }

  get name() { return this.inner.name }

  async run(req: RunRequest): Promise<RunResult> {
    console.log(`[research] running ${this.name} in ${req.cwd}`)
    const result = await this.inner.run(req)
    console.log(`[research] got ${result.output.length} chars`)
    return result
  }
}

const harness = await resolveHarness("claude")
const runner = new LoggingRunner(new SubprocessRunner(harness))
await analyze({ runner, store: new FsStore() })

Cache expiry with maxAge

Pass maxAge in milliseconds to treat entries older than that as stale:

import { analyze, FsStore } from "research.actor"

// Re-run if cached analysis is older than 24 hours
await analyze({
  store: new FsStore(),
  maxAge: 24 * 60 * 60 * 1000,
})

// Re-run if older than 30 minutes
await analyze({
  store: new FsStore(),
  maxAge: 30 * 60 * 1000,
})

A stale entry is treated as a cache miss. The new result overwrites the old one in the store. When maxAge is omitted, entries never expire (only force: true bypasses them).
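That expiry rule amounts to a one-line freshness check. In this sketch, createdAt is an assumed stored-entry timestamp (the real cache entry format is internal to research):

```typescript
// Sketch of the expiry rule described above: an entry older than maxAge
// is treated as a cache miss. `createdAt` is an assumed timestamp field
// on the stored entry, not a documented part of the API.
function isFresh(createdAt: number, maxAge?: number, now: number = Date.now()): boolean {
  if (maxAge === undefined) return true // no maxAge — entries never expire
  return now - createdAt <= maxAge
}
```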

Error handling

All research errors extend CachelyzError:

import { analyze, FsStore, CachelyzError, HarnessNotFoundError, GitError } from "research.actor"

try {
  await analyze({ store: new FsStore() })
} catch (err) {
  if (err instanceof HarnessNotFoundError) {
    // No supported harness binary found in PATH
    console.error("Install a harness: opencode, claude, codex, aider, or gemini")
  } else if (err instanceof GitError) {
    // Not a git repo, or git is not installed
    console.error("research must be run inside a git repository")
  } else if (err instanceof CachelyzError) {
    // Any other research error
    console.error(err.message, err.cause)
  } else {
    throw err
  }
}

API reference

analyze(opts?): Promise<AnalyzeResult>

The main entry point. All options are optional.

interface AnalyzeOptions {
  harness?:      HarnessName      // subprocess harness to use (auto-detected if omitted)
  model?:        string           // model name forwarded to the runner
  systemPrompt?: string           // appended to the cached analysis prompt
  prompt?:       string           // passed only to the working-changes agent, never cached
  cwd?:          string           // working directory (defaults to process.cwd())
  force?:        boolean          // bypass cache entirely
  maxAge?:       number           // ms — treat cache entries older than this as stale
  store?:        CacheStore       // storage backend (defaults to MemoryStore)
  runner?:       HarnessRunner    // execution backend (defaults to SubprocessRunner)
}

interface AnalyzeResult {
  analysis:   string   // full analysis text
  fromCache:  boolean  // true if base analysis was served from cache
  gitHash:    string   // commit hash the analysis is keyed to
  projectKey: string   // stable repo identifier used in cache keys
  runner:     string   // name of the runner that produced the base analysis
}

class FsStore

Filesystem-backed CacheStore. Stores entries as JSON under ~/.cache/research/.

new FsStore(baseDir?: string)

class MemoryStore

In-memory CacheStore. Default when no store is passed to analyze(). Entries do not survive process restarts.

const store = new MemoryStore()
store.size   // number of entries currently held
store.clear() // remove all entries

class SubprocessRunner

Default HarnessRunner. Spawns a harness binary as a child process and streams stdout.

import { SubprocessRunner, resolveHarness } from "research.actor"

const harness = await resolveHarness("claude")
const runner = new SubprocessRunner(harness)

FsStore.clearProject(projectKey): Promise<number>

Remove all cached entries for a project. Returns the number of files deleted.

import { FsStore, deriveProjectKey, getRepoRoot } from "research.actor"

const repoRoot = await getRepoRoot(process.cwd())
const projectKey = deriveProjectKey(repoRoot)
const store = new FsStore()
const deleted = await store.clearProject(projectKey)
console.log(`Deleted ${deleted} cache entries`)

resolveHarness(name?): Promise<ResolvedHarness>

Resolve a harness binary path. Picks the first installed harness when name is omitted. Throws HarnessNotFoundError if nothing is found.

detectHarnesses(): Promise<ResolvedHarness[]>

Return all harnesses found in PATH.

getGitInfo(cwd): Promise<GitInfo>

Return the current commit hash, repo root, and whether there are uncommitted changes.

Interfaces

interface HarnessRunner {
  name: string
  run(req: RunRequest): Promise<RunResult>
}

interface RunRequest {
  prompt: string
  cwd:    string
  model?: string
}

interface RunResult {
  output: string
}

interface CacheStore {
  get(key: CacheKey):                       Promise<AnalysisCache | null>
  set(key: CacheKey, entry: AnalysisCache): Promise<void>
  delete(key: CacheKey):                    Promise<void>
}

interface CacheKey {
  projectKey:       string
  gitHash:          string
  systemPromptHash?: string
}
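As an example of filling in these interfaces, here is a small TTL-expiring in-memory store. The types are restated locally so the sketch is self-contained, and AnalysisCache is simplified to an assumed shape — the library's real entry type may carry more fields:

```typescript
// Self-contained sketch of a CacheStore with TTL expiry. Types are
// restated locally; AnalysisCache is simplified to an assumed shape.
interface CacheKey {
  projectKey: string
  gitHash: string
  systemPromptHash?: string
}
interface AnalysisCache {
  analysis: string
  createdAt: number // assumed timestamp field for the TTL check
}

class TtlMemoryStore {
  private entries = new Map<string, AnalysisCache>()
  constructor(private ttlMs: number) {}

  // Flatten the three key components into one map key.
  private id(key: CacheKey): string {
    return `${key.projectKey}:${key.gitHash}:${key.systemPromptHash ?? ""}`
  }

  async get(key: CacheKey): Promise<AnalysisCache | null> {
    const entry = this.entries.get(this.id(key))
    if (!entry) return null
    if (Date.now() - entry.createdAt > this.ttlMs) {
      this.entries.delete(this.id(key)) // expired — treat as a miss
      return null
    }
    return entry
  }

  async set(key: CacheKey, entry: AnalysisCache): Promise<void> {
    this.entries.set(this.id(key), entry)
  }

  async delete(key: CacheKey): Promise<void> {
    this.entries.delete(this.id(key))
  }
}
```

Because CacheStore is matched structurally, any object with these three async methods can be passed as the store option.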

Package structure

Package Description
research Full package — SDK + CLI. Start here.
@research-agent/core SDK only. No CLI dependency.
@research-agent/cli CLI only. Depends on @research-agent/core.
@research-agent/skill Agent skill for teaching agents to use research.

Agent skill

A "skill" is a teaching resource for AI agents. When an AI agent has access to this skill, it can more effectively use research to analyze codebases.

View the skill

Print the skill content to see what's included:

research skill

Install the skill

The easiest way to install is via the CLI:

research skill --install

This auto-detects your installed agents and installs the skill locally in the current repository.

Manual Installation

If you prefer, you can also install the skill package via npm:

npm install @research-agent/skill
# or
bun add @research-agent/skill

What's in the skill?

  • Usage patterns — When and how to use research effectively
  • Integration guides — Working with different harnesses (Claude, OpenCode, Codex, etc.)
  • Best practices — Common pitfalls and how to avoid them
  • Troubleshooting — Common issues and solutions

For AI Agents

Once the skill is installed, AI agents can reference it:

Use the research skill to analyze this codebase.

Features

  • Instant context for AI agents after first analysis
  • 🔄 One analysis per commit — cached and reused
  • 🔍 Organic diff analysis via git tools
  • 🤖 Multiple harness support (opencode, claude, codex, aider, gemini)
  • 💾 Pluggable cache stores (filesystem, memory, or custom)
  • 📦 Full TypeScript SDK
  • ⏱️ Cache expiry with maxAge option
  • 🛡️ Outside repo cache (stored in ~/.cache/research/)

License

MIT

