
fallow audit takes ~30s on a 17k-file / 64-workspace synthetic monorepo with many framework plugins #224

@OmerGronich

Description


What happened?

On a monorepo that combines several common edge cases (many workspaces, `apps/**` nesting, many framework plugins per workspace, large barrel `index.ts` files, no `node_modules` present), cold-cache runs of `fallow`, `fallow dead-code`, `fallow health`, and `fallow audit --base <ref>` take between roughly 9 and 30 seconds, with the full `audit` at ~28 s. To make this independently reproducible and easy to profile, I built a synthetic monorepo generator (a ~200-line Node script, inlined below) that exercises the same edge cases and reproduces the same wall-clock times on a clean machine.

Timings (Apple Silicon M3 Max, fallow 2.52.0, --no-cache, no node_modules):

| invocation | wall-clock | dominant stage(s) |
| --- | --- | --- |
| `fallow dead-code` | 9.4 s | analyze 3.1 s · resolve imports 2.9 s · plugins 1.4 s · workspaces 1.0 s |
| `fallow dupes` (strict) | 1.5 s | suffix-array; no `--performance` block emitted |
| `fallow health` (default) | 11.9 s | duplication 1.1 s · git churn 0.5 s · parse 0.5 s · discover 0.5 s |
| `fallow audit --base HEAD` (zero changes) | ~1.1 s | short-circuits ✓ |
| `fallow audit --base <initial-commit>` (full diff) | 28.2 s | full dead-code + dupes + health pipeline |
| `fallow` bare (combined) | 12.2 s | sum of dead-code + dupes + health |

`--performance` blocks from the dead-code and health runs:

```
┌─ Pipeline Performance (dead-code) ──────────────────
│  discover files:      239.5ms  (16896 files)
│  workspaces:         1003.1ms  (64 workspaces)
│  plugins:            1360.0ms
│  script analysis:       0.9ms
│  parse/extract:       340.3ms  (16896 modules)
│  cache update:          0.0ms
│  entry points:        240.1ms  (448 entries)
│  resolve imports:    2905.8ms
│  build graph:          20.4ms
│  analyze:            3107.1ms
│  TOTAL:              9224.2ms
└─────────────────────────────────────────────────────
```

```
┌─ Health Pipeline Performance ───────────────────────
│  discover files:      466.7ms          ← repeated work; not shared with dead-code
│  parse/extract:       465.3ms          ← repeated work; not shared with dead-code
│  complexity:            4.8ms
│  file scores:          69.5ms
│  git churn:           488.9ms (cold)
│  hotspots:              4.4ms
│  duplication:        1059.8ms
│  TOTAL:             10896.6ms
└─────────────────────────────────────────────────────
```
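For scripting comparisons across runs, the `--performance` output is easy to parse mechanically. A small sketch (the line format is taken from the blocks above; `parsePerfBlock` is a hypothetical helper, not part of fallow):

```javascript
// Parse lines like "│  resolve imports:    2905.8ms" from a --performance
// block into { stage: milliseconds } so two runs can be diffed numerically.
function parsePerfBlock(text) {
  const out = {};
  for (const line of text.split('\n')) {
    const m = line.match(/│\s+(.+?):\s+([\d.]+)ms/);
    if (m) out[m[1].trim()] = parseFloat(m[2]);
  }
  return out;
}

const sample = [
  '│  resolve imports:    2905.8ms',
  '│  analyze:            3107.1ms',
  '│  TOTAL:              9224.2ms',
].join('\n');
console.log(parsePerfBlock(sample));
// → { 'resolve imports': 2905.8, analyze: 3107.1, TOTAL: 9224.2 }
```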

Edge cases the reproduction stresses

| edge case | generator flag | what it stresses |
| --- | --- | --- |
| 64 workspaces | `--workspaces 64` | workspaces stage scaling |
| 256 files per ws (~17k total) | `--files-per-ws 256` | discover + parse + analyze scaling |
| 8 framework plugins per ws | `--plugins next,gatsby,remix,vite,vitest,webpack,parcel,rollup` | plugins stage scaling |
| per-ws barrel `index.ts` re-exporting all 256 modules | (built-in) | entry points + analyze super-linear behaviour |
| 5 intra-ws + 5 cross-ws imports per file | `--imports-per-file 5 --cross-ws-imports 5` | resolve imports + cross-workspace graph cost |
| nested file layout (`src/g0/g1/g2/`) | `--nest-depth 3` | dir-walking heuristics |
| `apps/**` workspace pattern | `--workspace-pattern 'apps/**' --workspace-parent apps` | recursive workspace walker |
| 200-commit git history | `--commits 200` | git churn / `--hotspots` cost |
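For the nested-layout case, the mapping from file index to directory is deterministic base-5 digit peeling. Reproduced from the generator (with the depth lifted into a parameter) so the layout can be inspected without running it:

```javascript
// Maps a file index to a nested src/ path by peeling off base-5 digits,
// one directory level per digit (same scheme as the generator's fileSubdir).
function fileSubdir(fileIdx, nestDepth) {
  if (nestDepth <= 0) return 'src';
  const segs = ['src'];
  let n = fileIdx;
  for (let d = 0; d < nestDepth; d++) {
    segs.push(`g${n % 5}`);
    n = Math.floor(n / 5);
  }
  return segs.join('/');
}

console.log(fileSubdir(0, 3));   // → src/g0/g0/g0
console.log(fileSubdir(7, 3));   // → src/g2/g1/g0
console.log(fileSubdir(255, 3)); // → src/g0/g1/g0
```

So with `--nest-depth 3` each workspace's 256 modules spread over up to 125 leaf directories, which is what exercises the dir-walking heuristics.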

Reproduction

Prerequisites

- macOS or Linux
- Node 18+ (v20.20.0 was used)
- Bash, Git
- fallow 2.52.0 (free tier, no license needed)
- ~500 MB free disk
- ~1 minute setup time

Step 1 — create a clean directory

```sh
mkdir /tmp/fallow-perf-repro && cd /tmp/fallow-perf-repro
```

Step 2 — save the generator (200 lines, ~9 KB)

```sh
cat > gen-monorepo.cjs <<'GEN_EOF'
#!/usr/bin/env node
/**
 * Synthetic monorepo generator for fallow performance reproductions.
 *
 * Usage:
 *   node gen-monorepo.cjs --workspaces 80 --files-per-ws 325 --root .
 *
 * Flags are parsed by the arg() helper below; see Step 3 for a full invocation.
 */
const fs = require('node:fs');
const path = require('node:path');
const cp = require('node:child_process');

function arg(name, def) {
  const i = process.argv.indexOf(name);
  if (i === -1) return def;
  const v = process.argv[i + 1];
  return v === undefined ? true : v;
}

const N_WS = parseInt(arg('--workspaces', '20'), 10);
const FILES_PER_WS = parseInt(arg('--files-per-ws', '50'), 10);
const ROOT = path.resolve(arg('--root', '.'));
const IMPORTS_PER_FILE = parseInt(arg('--imports-per-file', '3'), 10);
const CROSS_WS = parseInt(arg('--cross-ws-imports', '0'), 10);
const COMMITS = parseInt(arg('--commits', '0'), 10);
// Boolean flag: arg() would return the *next* token when another flag follows,
// so test for presence directly.
const WRITE_CONFIGS = !process.argv.includes('--no-config-files');
const NEST_DEPTH = parseInt(arg('--nest-depth', '0'), 10);

const DEFAULT_PLUGINS = ['typescript', '@nx/workspace', 'eslint'];
const KITCHEN_SINK_PLUGINS = [
  'vite', 'vitest', 'jest', '@storybook/react', '@nx/workspace',
  '@angular/core', 'next', 'playwright', 'cypress', 'eslint',
  'tailwindcss', 'react', 'react-router', 'remix', 'gatsby',
  'nuxt', 'astro', 'rollup', 'webpack', 'parcel'
];
let pluginArg = arg('--plugins', DEFAULT_PLUGINS.join(','));
if (pluginArg === 'kitchen-sink') pluginArg = KITCHEN_SINK_PLUGINS.join(',');
const PLUGINS = String(pluginArg).split(',').filter(Boolean);

const PLUGIN_CONFIGS = {
  vite:        { file: 'vite.config.ts',          body: 'export default { plugins: [] };\n' },
  vitest:      { file: 'vitest.config.ts',        body: 'export default { test: { globals: true } };\n' },
  jest:        { file: 'jest.config.js',          body: 'module.exports = { testEnvironment: "node" };\n' },
  '@storybook/react': { file: '.storybook/main.ts', body: 'export default { stories: ["../src/**/*.stories.tsx"] };\n' },
  '@nx/workspace': { file: 'project.json',        body: '{ "name": "PKG", "sourceRoot": "src" }\n' },
  '@angular/core': { file: 'angular.json',        body: '{ "version": 1, "projects": {} }\n' },
  next:        { file: 'next.config.js',          body: 'module.exports = {};\n' },
  playwright:  { file: 'playwright.config.ts',    body: 'export default { testDir: "./e2e" };\n' },
  cypress:     { file: 'cypress.config.ts',       body: 'export default { e2e: { baseUrl: "http://localhost" } };\n' },
  eslint:      { file: '.eslintrc.cjs',           body: 'module.exports = { root: false, rules: {} };\n' },
  tailwindcss: { file: 'tailwind.config.js',      body: 'module.exports = { content: ["./src/**/*.{ts,tsx}"] };\n' },
  react:       null,
  'react-router': null,
  remix:       { file: 'remix.config.js',         body: 'module.exports = { ignoredRouteFiles: ["**/.*"] };\n' },
  gatsby:      { file: 'gatsby-config.ts',        body: 'export default { plugins: [] };\n' },
  nuxt:        { file: 'nuxt.config.ts',          body: 'export default {};\n' },
  astro:       { file: 'astro.config.mjs',        body: 'export default {};\n' },
  rollup:      { file: 'rollup.config.js',        body: 'export default { input: "src/index.ts", output: { file: "dist/index.js" } };\n' },
  webpack:     { file: 'webpack.config.js',       body: 'module.exports = { entry: "./src/index.ts" };\n' },
  parcel:      { file: '.parcelrc',               body: '{ "extends": "@parcel/config-default" }\n' }
};

function mkdir(p) { fs.mkdirSync(p, { recursive: true }); }
function write(p, body) { mkdir(path.dirname(p)); fs.writeFileSync(p, body); }

// Pinned, well-known stable versions so the maintainer can run npm install
// safely against the public npm registry. Pinning avoids any risk of pulling
// a pre-release with surprising postinstall scripts.
const PLUGIN_VERSIONS = {
  vite: '5.4.10', vitest: '1.6.1', jest: '29.7.0',
  '@storybook/react': '8.4.0', '@nx/workspace': '20.0.0', '@angular/core': '18.2.0',
  next: '14.2.18', playwright: '1.48.0', cypress: '13.15.0',
  eslint: '9.13.0', tailwindcss: '3.4.14', react: '18.3.1', 'react-router': '6.27.0',
  remix: '1.19.3', gatsby: '5.13.7', nuxt: '3.13.2', astro: '4.16.0',
  rollup: '4.24.0', webpack: '5.95.0', parcel: '2.12.0',
  mocha: '10.7.3', ava: '6.1.3', typescript: '5.6.3',
};

function workspacePackage(i) {
  const deps = {};
  for (const p of PLUGINS) deps[p] = PLUGIN_VERSIONS[p] || '0.0.0';
  return {
    name: `@repro/pkg-${String(i).padStart(3, '0')}`,
    version: '0.0.0', private: true,
    main: 'src/index.ts', types: 'src/index.ts',
    dependencies: deps
  };
}

function fileBody(wsIdx, fileIdx) {
  const lines = [];
  for (let k = 1; k <= IMPORTS_PER_FILE; k++) {
    const target = (fileIdx + k) % FILES_PER_WS;
    if (target === fileIdx) continue;
    lines.push(`import { fn${target} } from "./mod-${target}";`);
  }
  for (let k = 0; k < CROSS_WS && N_WS > 1; k++) {
    const otherWs = (wsIdx + k + 1) % N_WS;
    lines.push(`import { fn0 as cross${k} } from "@repro/pkg-${String(otherWs).padStart(3, '0')}";`);
  }
  lines.push('');
  lines.push(`export function fn${fileIdx}(input: number): number {`);
  lines.push('  let acc = input;');
  lines.push(`  if (acc > 10) acc += 1; else acc -= 1;`);
  lines.push(`  for (let i = 0; i < 3; i++) acc *= 2;`);
  lines.push('  return acc;');
  lines.push('}');
  lines.push(`export const CONST_${fileIdx} = ${fileIdx};`);
  return lines.join('\n') + '\n';
}

function fileSubdir(fileIdx) {
  if (NEST_DEPTH <= 0) return 'src';
  const segs = ['src'];
  let n = fileIdx;
  for (let d = 0; d < NEST_DEPTH; d++) {
    segs.push(`g${n % 5}`);
    n = Math.floor(n / 5);
  }
  return segs.join('/');
}

function indexBody(wsIdx) {
  const lines = [];
  for (let i = 0; i < FILES_PER_WS; i++) lines.push(`export * from "./mod-${i}";`);
  return lines.join('\n') + '\n';
}

function writeConfigs(wsRoot) {
  if (!WRITE_CONFIGS) return;
  for (const p of PLUGINS) {
    const cfg = PLUGIN_CONFIGS[p];
    if (!cfg) continue;
    write(path.join(wsRoot, cfg.file), cfg.body.replace(/PKG/g, path.basename(wsRoot)));
  }
}

const WS_PARENT = String(arg('--workspace-parent', 'packages'));
const WORKSPACE_PATTERN = String(arg('--workspace-pattern', `${WS_PARENT}/*`));

console.log(`[gen] root=${ROOT} workspaces=${N_WS} files/ws=${FILES_PER_WS} plugins=${PLUGINS.length} pattern=${WORKSPACE_PATTERN} nest=${NEST_DEPTH}`);
mkdir(ROOT);
write(path.join(ROOT, 'package.json'), JSON.stringify({
  name: 'fallow-perf-repro', version: '0.0.0', private: true,
  workspaces: [WORKSPACE_PATTERN],
  devDependencies: { typescript: '5.0.0' }
}, null, 2) + '\n');
write(path.join(ROOT, 'tsconfig.json'), JSON.stringify({
  compilerOptions: { target: 'ES2022', module: 'ESNext', moduleResolution: 'Bundler', strict: false, baseUrl: '.', paths: {} }
}, null, 2) + '\n');

for (let w = 0; w < N_WS; w++) {
  const wsRoot = path.join(ROOT, WS_PARENT, `pkg-${String(w).padStart(3, '0')}`);
  write(path.join(wsRoot, 'package.json'), JSON.stringify(workspacePackage(w), null, 2) + '\n');
  write(path.join(wsRoot, 'src', 'index.ts'), indexBody(w));
  for (let f = 0; f < FILES_PER_WS; f++) {
    write(path.join(wsRoot, fileSubdir(f), `mod-${f}.ts`), fileBody(w, f));
  }
  writeConfigs(wsRoot);
}

if (COMMITS > 0) {
  console.log(`[gen] initialising git repo with ${COMMITS} synthetic commits...`);
  cp.execSync('git init -q', { cwd: ROOT, stdio: 'inherit' });
  cp.execSync('git config user.email perf@example.com', { cwd: ROOT });
  cp.execSync('git config user.name "Perf Reproducer"', { cwd: ROOT });
  cp.execSync('git add -A', { cwd: ROOT, stdio: 'inherit' });
  cp.execSync('git commit -q -m "initial"', { cwd: ROOT });
  for (let c = 1; c <= COMMITS; c++) {
    const w = c % N_WS;
    const f = c % FILES_PER_WS;
    const target = path.join(ROOT, WS_PARENT, `pkg-${String(w).padStart(3, '0')}`, fileSubdir(f), `mod-${f}.ts`);
    fs.appendFileSync(target, `// touch ${c}\n`);
    cp.execSync(`git -c user.email=perf${c % 5}@example.com -c user.name="Author${c % 5}" commit -q -am "touch ${c}"`, { cwd: ROOT });
  }
}

const totalFiles = N_WS * (FILES_PER_WS + 1);
console.log(`[gen] done. Wrote ~${totalFiles} TS files across ${N_WS} workspaces.`);
GEN_EOF
```

Step 3 — generate the synthetic project (~1 minute, ~120 MB on disk)

```sh
node gen-monorepo.cjs \
  --workspaces 64 \
  --files-per-ws 256 \
  --plugins 'next,gatsby,remix,vite,vitest,webpack,parcel,rollup' \
  --imports-per-file 5 \
  --cross-ws-imports 5 \
  --workspace-pattern 'apps/**' \
  --workspace-parent apps \
  --nest-depth 3 \
  --commits 200 \
  --root .
```

Expected output ends with:

```
[gen] done. Wrote ~16448 TS files across 64 workspaces.
```

Step 4 — reproduce the timings

```sh
echo '=== fallow dead-code ===' \
  && { time fallow dead-code --no-cache --performance --root . -q > /tmp/perf-dead-code.out 2>&1; } 2>&1 | grep real \
  && grep -E 'discover|workspaces|plugins|parse|entry|resolve|analyze|TOTAL' /tmp/perf-dead-code.out

echo && echo '=== fallow dupes (strict mode) ===' \
  && { time fallow dupes --no-cache --performance --root . -q > /tmp/perf-dupes.out 2>&1; } 2>&1 | grep real

echo && echo '=== fallow health (default) ===' \
  && { time fallow health --no-cache --performance --root . -q > /tmp/perf-health.out 2>&1; } 2>&1 | grep real \
  && grep -E 'discover|parse|complexity|file scores|git churn|hotspots|duplication|TOTAL' /tmp/perf-health.out

echo && echo '=== fallow (combined) ===' \
  && { time fallow --no-cache --performance --root . -q > /tmp/perf-bare.out 2>&1; } 2>&1 | grep real

echo && BASE=$(git rev-list --max-parents=0 HEAD) \
  && echo "=== fallow audit --base <initial> ===" \
  && { time fallow audit --base $BASE --no-cache --root . -q > /tmp/perf-audit.out 2>&1; } 2>&1 | grep real
```

Expected timings (Apple Silicon M3 Max, fallow 2.52.0, no node_modules):

| invocation | wall-clock |
| --- | --- |
| `fallow dead-code` | ~9-13 s |
| `fallow dupes` | ~1-2 s |
| `fallow health` | ~10-14 s |
| `fallow` (combined) | ~12-15 s |
| `fallow audit --base <initial>` | ~25-32 s |

Expected behavior

It would be really useful if `fallow audit` could fit inside a pre-commit hook on a large monorepo. For that, a cold run on a workload like this would ideally need to finish in ~1-3 seconds — anything beyond that and developers tend to bypass the hook with `--no-verify`.

I don't know fallow's internals well enough to predict what's actually achievable, but the per-stage breakdown above suggests several places where there might be some room. Some thoughts on each, in case any of them is helpful:

| stage | observed | thought |
| --- | --- | --- |
| analyze | 3.1 s | super-linear cost on barrel files (each workspace re-exports 256 modules from `index.ts`); possibly parallelisable per module |
| resolve imports | 2.9 s | might benefit from a persistent on-disk cache for resolved paths |
| plugins | 1.4 s | scaling looks like O(workspaces × matchers × files) — could possibly bucket files by workspace prefix |
| duplication | 1.1 s | could perhaps reuse the parse cache produced by dead-code instead of re-tokenising |
| workspaces | 1.0 s | the walker appears to recurse into directories already recognised as workspaces |
| git churn | 0.5 s | runs even though `--hotspots` isn't requested (and audit never requests it) |
| parse/extract (health) | 0.5 s | duplicates work the dead-code pipeline already did in the same process |
| discover files (health) | 0.5 s | same — file discovery isn't shared across pipelines |
| entry points | 0.2 s | per-workspace discovery appears to be sequential |

Fallow version

fallow 2.52.0

Operating system

macOS

Configuration

No `fallow.toml` or `.fallowrc.json` is required for this reproduction. The synthetic project ships only:

- `package.json` (with `workspaces: ['apps/**']`)
- `tsconfig.json` (basic ES2022 / Bundler resolution)
- per-workspace `package.json` listing 8 framework deps
- per-workspace plugin config files (`vite.config.ts`, `jest.config.js`, etc.)

No `node_modules` is installed for this reproduction. Installing it should not change the wall-clock numbers materially — the bottleneck is in the pipeline stages themselves, not in import resolution.
