
fallow audit takes ~30s on a 17k-file / 64-workspace synthetic monorepo with many framework plugins #224

@OmerGronich

Description


What happened?

On a monorepo that combines several common edge cases (many workspaces, `apps/**` nesting, many framework plugins per workspace, large barrel `index.ts` files, no `node_modules` present), cold-cache runs of `fallow`, `fallow dead-code`, `fallow health`, and `fallow audit --base <ref>` take between roughly 9 and 30 seconds, with the full `audit` at ~28 s. To make this independently reproducible and easy to profile, I built a synthetic monorepo generator (a ~200-line Node script, inlined below) that exercises the same edge cases and reproduces the same wall-clock times on a clean machine.

Timings (Apple Silicon M3 Max, fallow 2.52.0, --no-cache, no node_modules):

| invocation | wall-clock | dominant stage(s) |
| --- | --- | --- |
| `fallow dead-code` | 9.4 s | analyze 3.1 s · resolve imports 2.9 s · plugins 1.4 s · workspaces 1.0 s |
| `fallow dupes` (strict) | 1.5 s | suffix-array; no `--performance` block emitted |
| `fallow health` (default) | 11.9 s | duplication 1.1 s · git churn 0.5 s · parse 0.5 s · discover 0.5 s |
| `fallow audit --base HEAD` (zero changes) | ~1.1 s | short-circuits ✓ |
| `fallow audit --base <initial-commit>` (full diff) | 28.2 s | full dead-code + dupes + health pipeline |
| `fallow` bare (combined) | 12.2 s | sum of dead-code + dupes + health |

`--performance` blocks from the dead-code and health runs:

```
┌─ Pipeline Performance (dead-code) ──────────────────
│  discover files:      239.5ms  (16896 files)
│  workspaces:         1003.1ms  (64 workspaces)
│  plugins:            1360.0ms
│  script analysis:       0.9ms
│  parse/extract:       340.3ms  (16896 modules)
│  cache update:          0.0ms
│  entry points:        240.1ms  (448 entries)
│  resolve imports:    2905.8ms
│  build graph:          20.4ms
│  analyze:            3107.1ms
│  TOTAL:              9224.2ms
└─────────────────────────────────────────────────────
```

```
┌─ Health Pipeline Performance ───────────────────────
│  discover files:      466.7ms          ← repeated work; not shared with dead-code
│  parse/extract:       465.3ms          ← repeated work; not shared with dead-code
│  complexity:            4.8ms
│  file scores:          69.5ms
│  git churn:           488.9ms (cold)
│  hotspots:              4.4ms
│  duplication:        1059.8ms
│  TOTAL:             10896.6ms
└─────────────────────────────────────────────────────
```
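For scripting comparisons across runs, the `--performance` output is easy to parse mechanically. A small sketch (the line format is taken from the blocks above; `parsePerfBlock` is a hypothetical helper, not part of fallow):

```javascript
// Parse lines like "│  resolve imports:    2905.8ms" from a --performance
// block into { stage: milliseconds } so two runs can be diffed numerically.
function parsePerfBlock(text) {
  const out = {};
  for (const line of text.split('\n')) {
    const m = line.match(/│\s+(.+?):\s+([\d.]+)ms/);
    if (m) out[m[1].trim()] = parseFloat(m[2]);
  }
  return out;
}

const sample = [
  '│  resolve imports:    2905.8ms',
  '│  analyze:            3107.1ms',
  '│  TOTAL:              9224.2ms',
].join('\n');
console.log(parsePerfBlock(sample));
// → { 'resolve imports': 2905.8, analyze: 3107.1, TOTAL: 9224.2 }
```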

Edge cases the reproduction stresses

| edge case | generator flag | what it stresses |
| --- | --- | --- |
| 64 workspaces | `--workspaces 64` | workspaces stage scaling |
| 256 files per ws (~17k total) | `--files-per-ws 256` | discover + parse + analyze scaling |
| 8 framework plugins per ws | `--plugins next,gatsby,remix,vite,vitest,webpack,parcel,rollup` | plugins stage scaling |
| per-ws barrel `index.ts` re-exporting all 256 modules | (built-in) | entry points + analyze super-linear behaviour |
| 5 intra-ws + 5 cross-ws imports per file | `--imports-per-file 5 --cross-ws-imports 5` | resolve imports + cross-workspace graph cost |
| nested file layout (`src/g0/g1/g2/`) | `--nest-depth 3` | dir-walking heuristics |
| `apps/**` workspace pattern | `--workspace-pattern 'apps/**' --workspace-parent apps` | recursive workspace walker |
| 200-commit git history | `--commits 200` | git churn / `--hotspots` cost |
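For the nested-layout case, the mapping from file index to directory is deterministic base-5 digit peeling. Reproduced from the generator (with the depth lifted into a parameter) so the layout can be inspected without running it:

```javascript
// Maps a file index to a nested src/ path by peeling off base-5 digits,
// one directory level per digit (same scheme as the generator's fileSubdir).
function fileSubdir(fileIdx, nestDepth) {
  if (nestDepth <= 0) return 'src';
  const segs = ['src'];
  let n = fileIdx;
  for (let d = 0; d < nestDepth; d++) {
    segs.push(`g${n % 5}`);
    n = Math.floor(n / 5);
  }
  return segs.join('/');
}

console.log(fileSubdir(0, 3));   // → src/g0/g0/g0
console.log(fileSubdir(7, 3));   // → src/g2/g1/g0
console.log(fileSubdir(255, 3)); // → src/g0/g1/g0
```

So with `--nest-depth 3` each workspace's 256 modules spread over up to 125 leaf directories, which is what exercises the dir-walking heuristics.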

Reproduction

Prerequisites

- macOS or Linux
- Node 18+ (v20.20.0 was used)
- Bash, Git
- fallow 2.52.0 (free tier, no license needed)
- ~500 MB free disk
- ~1 minute setup time

Step 1 — create a clean directory

```sh
mkdir /tmp/fallow-perf-repro && cd /tmp/fallow-perf-repro
```

Step 2 — save the generator (200 lines, ~9 KB)

```sh
cat > gen-monorepo.cjs <<'GEN_EOF'
#!/usr/bin/env node
/**
 * Synthetic monorepo generator for fallow performance reproductions.
 *
 * Usage:
 *   node gen-monorepo.cjs --workspaces 80 --files-per-ws 325 --root .
 *
 * Flags are parsed by the arg() helper below; see Step 3 for a full invocation.
 */
const fs = require('node:fs');
const path = require('node:path');
const cp = require('node:child_process');

function arg(name, def) {
  const i = process.argv.indexOf(name);
  if (i === -1) return def;
  const v = process.argv[i + 1];
  return v === undefined ? true : v;
}

const N_WS = parseInt(arg('--workspaces', '20'), 10);
const FILES_PER_WS = parseInt(arg('--files-per-ws', '50'), 10);
const ROOT = path.resolve(arg('--root', '.'));
const IMPORTS_PER_FILE = parseInt(arg('--imports-per-file', '3'), 10);
const CROSS_WS = parseInt(arg('--cross-ws-imports', '0'), 10);
const COMMITS = parseInt(arg('--commits', '0'), 10);
// Boolean flag: arg() would return the *next* token when another flag follows,
// so test for presence directly.
const WRITE_CONFIGS = !process.argv.includes('--no-config-files');
const NEST_DEPTH = parseInt(arg('--nest-depth', '0'), 10);

const DEFAULT_PLUGINS = ['typescript', '@nx/workspace', 'eslint'];
const KITCHEN_SINK_PLUGINS = [
  'vite', 'vitest', 'jest', '@storybook/react', '@nx/workspace',
  '@angular/core', 'next', 'playwright', 'cypress', 'eslint',
  'tailwindcss', 'react', 'react-router', 'remix', 'gatsby',
  'nuxt', 'astro', 'rollup', 'webpack', 'parcel'
];
let pluginArg = arg('--plugins', DEFAULT_PLUGINS.join(','));
if (pluginArg === 'kitchen-sink') pluginArg = KITCHEN_SINK_PLUGINS.join(',');
const PLUGINS = String(pluginArg).split(',').filter(Boolean);

const PLUGIN_CONFIGS = {
  vite:        { file: 'vite.config.ts',          body: 'export default { plugins: [] };\n' },
  vitest:      { file: 'vitest.config.ts',        body: 'export default { test: { globals: true } };\n' },
  jest:        { file: 'jest.config.js',          body: 'module.exports = { testEnvironment: "node" };\n' },
  '@storybook/react': { file: '.storybook/main.ts', body: 'export default { stories: ["../src/**/*.stories.tsx"] };\n' },
  '@nx/workspace': { file: 'project.json',        body: '{ "name": "PKG", "sourceRoot": "src" }\n' },
  '@angular/core': { file: 'angular.json',        body: '{ "version": 1, "projects": {} }\n' },
  next:        { file: 'next.config.js',          body: 'module.exports = {};\n' },
  playwright:  { file: 'playwright.config.ts',    body: 'export default { testDir: "./e2e" };\n' },
  cypress:     { file: 'cypress.config.ts',       body: 'export default { e2e: { baseUrl: "http://localhost" } };\n' },
  eslint:      { file: '.eslintrc.cjs',           body: 'module.exports = { root: false, rules: {} };\n' },
  tailwindcss: { file: 'tailwind.config.js',      body: 'module.exports = { content: ["./src/**/*.{ts,tsx}"] };\n' },
  react:       null,
  'react-router': null,
  remix:       { file: 'remix.config.js',         body: 'module.exports = { ignoredRouteFiles: ["**/.*"] };\n' },
  gatsby:      { file: 'gatsby-config.ts',        body: 'export default { plugins: [] };\n' },
  nuxt:        { file: 'nuxt.config.ts',          body: 'export default {};\n' },
  astro:       { file: 'astro.config.mjs',        body: 'export default {};\n' },
  rollup:      { file: 'rollup.config.js',        body: 'export default { input: "src/index.ts", output: { file: "dist/index.js" } };\n' },
  webpack:     { file: 'webpack.config.js',       body: 'module.exports = { entry: "./src/index.ts" };\n' },
  parcel:      { file: '.parcelrc',               body: '{ "extends": "@parcel/config-default" }\n' }
};

function mkdir(p) { fs.mkdirSync(p, { recursive: true }); }
function write(p, body) { mkdir(path.dirname(p)); fs.writeFileSync(p, body); }

// Pinned, well-known stable versions so the maintainer can run npm install
// safely against the public npm registry. Pinning avoids any risk of pulling
// a pre-release with surprising postinstall scripts.
const PLUGIN_VERSIONS = {
  vite: '5.4.10', vitest: '1.6.1', jest: '29.7.0',
  '@storybook/react': '8.4.0', '@nx/workspace': '20.0.0', '@angular/core': '18.2.0',
  next: '14.2.18', playwright: '1.48.0', cypress: '13.15.0',
  eslint: '9.13.0', tailwindcss: '3.4.14', react: '18.3.1', 'react-router': '6.27.0',
  remix: '1.19.3', gatsby: '5.13.7', nuxt: '3.13.2', astro: '4.16.0',
  rollup: '4.24.0', webpack: '5.95.0', parcel: '2.12.0',
  mocha: '10.7.3', ava: '6.1.3', typescript: '5.6.3',
};

function workspacePackage(i) {
  const deps = {};
  for (const p of PLUGINS) deps[p] = PLUGIN_VERSIONS[p] || '0.0.0';
  return {
    name: `@repro/pkg-${String(i).padStart(3, '0')}`,
    version: '0.0.0', private: true,
    main: 'src/index.ts', types: 'src/index.ts',
    dependencies: deps
  };
}

function fileBody(wsIdx, fileIdx) {
  const lines = [];
  for (let k = 1; k <= IMPORTS_PER_FILE; k++) {
    const target = (fileIdx + k) % FILES_PER_WS;
    if (target === fileIdx) continue;
    lines.push(`import { fn${target} } from "./mod-${target}";`);
  }
  for (let k = 0; k < CROSS_WS && N_WS > 1; k++) {
    const otherWs = (wsIdx + k + 1) % N_WS;
    lines.push(`import { fn0 as cross${k} } from "@repro/pkg-${String(otherWs).padStart(3, '0')}";`);
  }
  lines.push('');
  lines.push(`export function fn${fileIdx}(input: number): number {`);
  lines.push('  let acc = input;');
  lines.push(`  if (acc > 10) acc += 1; else acc -= 1;`);
  lines.push(`  for (let i = 0; i < 3; i++) acc *= 2;`);
  lines.push('  return acc;');
  lines.push('}');
  lines.push(`export const CONST_${fileIdx} = ${fileIdx};`);
  return lines.join('\n') + '\n';
}

function fileSubdir(fileIdx) {
  if (NEST_DEPTH <= 0) return 'src';
  const segs = ['src'];
  let n = fileIdx;
  for (let d = 0; d < NEST_DEPTH; d++) {
    segs.push(`g${n % 5}`);
    n = Math.floor(n / 5);
  }
  return segs.join('/');
}

function indexBody(wsIdx) {
  const lines = [];
  for (let i = 0; i < FILES_PER_WS; i++) lines.push(`export * from "./mod-${i}";`);
  return lines.join('\n') + '\n';
}

function writeConfigs(wsRoot) {
  if (!WRITE_CONFIGS) return;
  for (const p of PLUGINS) {
    const cfg = PLUGIN_CONFIGS[p];
    if (!cfg) continue;
    write(path.join(wsRoot, cfg.file), cfg.body.replace(/PKG/g, path.basename(wsRoot)));
  }
}

const WS_PARENT = String(arg('--workspace-parent', 'packages'));
const WORKSPACE_PATTERN = String(arg('--workspace-pattern', `${WS_PARENT}/*`));

console.log(`[gen] root=${ROOT} workspaces=${N_WS} files/ws=${FILES_PER_WS} plugins=${PLUGINS.length} pattern=${WORKSPACE_PATTERN} nest=${NEST_DEPTH}`);
mkdir(ROOT);
write(path.join(ROOT, 'package.json'), JSON.stringify({
  name: 'fallow-perf-repro', version: '0.0.0', private: true,
  workspaces: [WORKSPACE_PATTERN],
  devDependencies: { typescript: '5.0.0' }
}, null, 2) + '\n');
write(path.join(ROOT, 'tsconfig.json'), JSON.stringify({
  compilerOptions: { target: 'ES2022', module: 'ESNext', moduleResolution: 'Bundler', strict: false, baseUrl: '.', paths: {} }
}, null, 2) + '\n');

for (let w = 0; w < N_WS; w++) {
  const wsRoot = path.join(ROOT, WS_PARENT, `pkg-${String(w).padStart(3, '0')}`);
  write(path.join(wsRoot, 'package.json'), JSON.stringify(workspacePackage(w), null, 2) + '\n');
  write(path.join(wsRoot, 'src', 'index.ts'), indexBody(w));
  for (let f = 0; f < FILES_PER_WS; f++) {
    write(path.join(wsRoot, fileSubdir(f), `mod-${f}.ts`), fileBody(w, f));
  }
  writeConfigs(wsRoot);
}

if (COMMITS > 0) {
  console.log(`[gen] initialising git repo with ${COMMITS} synthetic commits...`);
  cp.execSync('git init -q', { cwd: ROOT, stdio: 'inherit' });
  cp.execSync('git config user.email perf@example.com', { cwd: ROOT });
  cp.execSync('git config user.name "Perf Reproducer"', { cwd: ROOT });
  cp.execSync('git add -A', { cwd: ROOT, stdio: 'inherit' });
  cp.execSync('git commit -q -m "initial"', { cwd: ROOT });
  for (let c = 1; c <= COMMITS; c++) {
    const w = c % N_WS;
    const f = c % FILES_PER_WS;
    const target = path.join(ROOT, WS_PARENT, `pkg-${String(w).padStart(3, '0')}`, fileSubdir(f), `mod-${f}.ts`);
    fs.appendFileSync(target, `// touch ${c}\n`);
    cp.execSync(`git -c user.email=perf${c % 5}@example.com -c user.name="Author${c % 5}" commit -q -am "touch ${c}"`, { cwd: ROOT });
  }
}

const totalFiles = N_WS * (FILES_PER_WS + 1);
console.log(`[gen] done. Wrote ~${totalFiles} TS files across ${N_WS} workspaces.`);
GEN_EOF
```

Step 3 — generate the synthetic project (~1 minute, ~120 MB on disk)

```sh
node gen-monorepo.cjs \
  --workspaces 64 \
  --files-per-ws 256 \
  --plugins 'next,gatsby,remix,vite,vitest,webpack,parcel,rollup' \
  --imports-per-file 5 \
  --cross-ws-imports 5 \
  --workspace-pattern 'apps/**' \
  --workspace-parent apps \
  --nest-depth 3 \
  --commits 200 \
  --root .
```

Expected output ends with:

```
[gen] done. Wrote ~16448 TS files across 64 workspaces.
```

Step 4 — reproduce the timings

```sh
echo '=== fallow dead-code ===' \
  && { time fallow dead-code --no-cache --performance --root . -q > /tmp/perf-dead-code.out 2>&1; } 2>&1 | grep real \
  && grep -E 'discover|workspaces|plugins|parse|entry|resolve|analyze|TOTAL' /tmp/perf-dead-code.out

echo && echo '=== fallow dupes (strict mode) ===' \
  && { time fallow dupes --no-cache --performance --root . -q > /tmp/perf-dupes.out 2>&1; } 2>&1 | grep real

echo && echo '=== fallow health (default) ===' \
  && { time fallow health --no-cache --performance --root . -q > /tmp/perf-health.out 2>&1; } 2>&1 | grep real \
  && grep -E 'discover|parse|complexity|file scores|git churn|hotspots|duplication|TOTAL' /tmp/perf-health.out

echo && echo '=== fallow (combined) ===' \
  && { time fallow --no-cache --performance --root . -q > /tmp/perf-bare.out 2>&1; } 2>&1 | grep real

echo && BASE=$(git rev-list --max-parents=0 HEAD) \
  && echo "=== fallow audit --base <initial> ===" \
  && { time fallow audit --base $BASE --no-cache --root . -q > /tmp/perf-audit.out 2>&1; } 2>&1 | grep real
```

Expected timings (Apple Silicon M3 Max, fallow 2.52.0, no node_modules):

| invocation | wall-clock |
| --- | --- |
| `fallow dead-code` | ~9-13 s |
| `fallow dupes` | ~1-2 s |
| `fallow health` | ~10-14 s |
| `fallow` (combined) | ~12-15 s |
| `fallow audit --base <initial>` | ~25-32 s |

Expected behavior

It would be really useful if `fallow audit` could fit inside a pre-commit hook on a large monorepo. For that, a cold run on a workload like this would ideally need to finish in ~1-3 seconds — anything beyond that and developers tend to bypass the hook with `--no-verify`.

I don't know fallow's internals well enough to predict what's actually achievable, but the per-stage breakdown above suggests several places where there might be some room. Some thoughts on each, in case any of them is helpful:

| stage | observed | thought |
| --- | --- | --- |
| analyze | 3.1 s | super-linear cost on barrel files (each workspace re-exports 256 modules from `index.ts`); possibly parallelisable per module |
| resolve imports | 2.9 s | might benefit from a persistent on-disk cache for resolved paths |
| plugins | 1.4 s | scaling looks like O(workspaces × matchers × files) — could possibly bucket files by workspace prefix |
| duplication | 1.1 s | could perhaps reuse the parse cache produced by dead-code instead of re-tokenising |
| workspaces | 1.0 s | the walker appears to recurse into directories already recognised as workspaces |
| git churn | 0.5 s | runs even though `--hotspots` isn't requested (and audit never requests it) |
| parse/extract (health) | 0.5 s | duplicates work the dead-code pipeline already did in the same process |
| discover files (health) | 0.5 s | same — file discovery isn't shared across pipelines |
| entry points | 0.2 s | per-workspace discovery appears to be sequential |

Fallow version

fallow 2.52.0

Operating system

macOS

Configuration

No `fallow.toml` or `.fallowrc.json` is required for this reproduction. The synthetic project ships only:

- `package.json` (with `workspaces: ['apps/**']`)
- `tsconfig.json` (basic ES2022 / Bundler resolution)
- per-workspace `package.json` listing 8 framework deps
- per-workspace plugin config files (`vite.config.ts`, `jest.config.js`, etc.)

No `node_modules` is installed for this reproduction. Installing it should not change the wall-clock numbers materially — the bottleneck is in the pipeline stages themselves, not in import resolution.
