health_score returns 0 for 5/11 penalty dimensions and is capped on 2/11 at large-monorepo scale #260

@OmerGronich

Description

What happened?

On a large multi-package TypeScript monorepo, fallow health --hotspots --score returns a health_score in the B grade band while only 4 of 11 penalty dimensions track reality. The other 7 are mathematically incapable of firing at this scale:

| Dimension | Cap | State |
| --- | --- | --- |
| dead_files | 15 | ✅ honest |
| dead_exports | 15 | ✅ honest |
| complexity | 20 | ✅ honest |
| duplication | 10 | ✅ honest |
| p90_complexity | 10 | ⚫ silent (p90_cyc well below the > 10 trigger) |
| maintainability | 15 | ⚫ silent (MI_avg well above the < 70 trigger) |
| hotspots | 10 | ⚫ silent (max ranked score reaches a fraction of the 50.0 filter) |
| unit_size | 10 | ⚫ silent (very_high_risk % below the ≥ 5 % floor) |
| coupling | 5 | ⚫ silent (p95_fan_in well below the > 30 trigger) |
| unused_deps | 10 | 🔴 saturated (actual count an order of magnitude over the cap) |
| circular_deps | 10 | 🔴 saturated (actual count well over an order of magnitude over) |
| total | 130 | score lands in the B band |

One pattern explains all 7 broken dimensions: scale-blind aggregations + low absolute caps.

  • The 5 silent dimensions aggregate per-function/per-file metrics with mean / p90 / fixed-percentage operators, then trigger on a fixed threshold tuned for small/medium projects. At scale, the long tail is mathematically swallowed by the bulk of trivial code (most TS files are tiny utility/barrel/model files; most functions are 1-CC getters and lambdas), so the aggregation never crosses the floor:
    • Tens of thousands of functions live above p90, but p90 itself sits well below > 10.
    • A meaningful absolute count of files have MI < 70, but they're a tiny fraction of the total, so the mean is near 100.
    • Thousands of functions exceed 60 LOC, but they're below the 5 % floor of the function-count denominator.
    • Thousands of files are ranked as hotspots, but the within-project max-norm formula at compute_hotspot_score ((churn/max_churn) × (density/max_density) × 100) is structurally bounded — see §"Related" below.
    • p95_fan_in lands in the single digits because the bottom 95 % of files are barely imported; the actually-coupled barrels live above p99.
  • The 2 saturated dimensions use min(count, 10) on per-repo counts. Reasonable for a single-package project; a no-op in any workspace where N packages multiply the count linearly. The formula treats n=11 and n=1000 identically.
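The dilution is plain arithmetic. A minimal sketch (hypothetical population, not fallow's code; sizes and thresholds chosen to mirror the pattern above) shows 2,000 CC-25 functions hiding below every scale-blind aggregator while a per-1k density sees them immediately:

```javascript
// Hypothetical illustration of scale-blind aggregation, not fallow's code.
// 98,000 trivial 1-CC functions plus 2,000 fat functions with CC 25.
const cc = [...Array(98_000).fill(1), ...Array(2_000).fill(25)];

const sorted = [...cc].sort((a, b) => a - b);
const pct = (q) => sorted[Math.min(sorted.length - 1, Math.floor(q * sorted.length))];

const mean = cc.reduce((s, x) => s + x, 0) / cc.length;
const p90 = pct(0.90);                                     // 1  -> the "> 10" trigger never fires
const p99 = pct(0.99);                                     // 25 -> a p99 trigger would fire
const fatPct = 100 * cc.filter((x) => x > 10).length / cc.length;    // 2 %, below the 5 % floor
const fatPer1k = 1000 * cc.filter((x) => x > 10).length / cc.length; // 20 per 1k functions

console.log({ mean: mean.toFixed(2), p90, p99, fatPct, fatPer1k });
// { mean: '1.48', p90: 1, p99: 25, fatPct: 2, fatPer1k: 20 }
```

Two thousand genuinely complex functions leave the mean at 1.48 and p90 at 1; only the tail-counting aggregators (p99, count-per-1k) register them.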

Net: ~38 % of the penalty budget (50/130 pts) is silently zero, ~15 % (20/130) is pinned at the cap regardless of magnitude. A codebase with thousands of fat functions, hundreds of cycles, and hundreds of unused deps reads as B / mostly healthy.

Per-dimension evidence

p90_complexity (vital_signs.rs:319)

  • clamp(p90_cyclomatic − 10, 0, 10). At large function-population sizes, the bulk are trivial; complex functions live above p99. A p99_cyclomatic (same trigger) or functions_with_cc_above_20 / 1k_functions would survive.

maintainability (vital_signs.rs:323-325)

  • min((70 − MI_avg).max(0) × 0.5, 15). Over 98 % of files have MI ≥ 70, dragging the mean above the trigger. The actionable signal is the small absolute count with MI < 70 — invisible to a mean. maintainability_p10 or count(MI < 70) would survive.

hotspots (vital_signs.rs:331-340 + scores.rs:4)

  • Penalty: min(hotspot_count / total_files × 200, 10) where hotspot_count = files with score ≥ HOTSPOT_SCORE_THRESHOLD (= 50.0).
  • Score: (weighted_commits / max_weighted) × (complexity_density / max_density) × 100.
  • Under max-norm, the theoretical maximum of 1.0 × 1.0 × 100 = 100 is reachable only if a single file is both max-churned and max-density. In practice the top-churned file has moderate density and vice-versa, so the product is structurally bounded well below 50.0. The top-ranked hotspot reaches less than half of the threshold → hotspot_count = 0 even though thousands of files are ranked. Either expose the threshold or count "top N % of the within-project ranking".
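The ceiling is easy to see numerically. A sketch with made-up churn/density values (the formula mirrors the quoted compute_hotspot_score; the data is hypothetical):

```javascript
// Hypothetical data: the max-churned file has moderate density and vice versa.
// Score formula as quoted: (churn/maxChurn) * (density/maxDensity) * 100.
const files = [
  { churn: 400, density: 0.9 },  // hottest by churn, middling density
  { churn: 60,  density: 3.0 },  // densest file, low churn
  { churn: 120, density: 1.4 },
  { churn: 30,  density: 0.2 },
];
const maxChurn = Math.max(...files.map((f) => f.churn));     // 400
const maxDensity = Math.max(...files.map((f) => f.density)); // 3.0

const scores = files.map(
  (f) => (f.churn / maxChurn) * (f.density / maxDensity) * 100,
);
const top = Math.max(...scores);
console.log(top.toFixed(1)); // 30.0 -> below HOTSPOT_SCORE_THRESHOLD = 50.0,
                             // so hotspot_count = 0 despite a clear ranking
```

Unless one file holds both maxima simultaneously, every product lands under the filter; the ranking is informative, the absolute threshold is not.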

unit_size (vital_signs.rs:359-365)

  • min((very_high_risk_pct − 5).max(0) × 0.5, 10), very_high_risk = % of functions > 60 LOC. A substantial absolute inventory of functions over 60 LOC stays invisible because it's a small fraction of a large function-count denominator. Lower the floor (~1 %) or switch to functions_over_60_loc / 1k_functions.

coupling (vital_signs.rs:368-373)

  • min((p95_fan_in − 30).max(0) × 0.25, 5). Fan-in is heavy-tailed. p95 is in the single digits because the bottom 95 % is barely imported — not because there are no hubs. p99_fan_in (same trigger) or the already-computed coupling_high_pct (vital_signs.rs:285) would work.
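The same percentile arithmetic applies to fan-in. A sketch with a synthetic heavy-tailed distribution (hypothetical counts, not real data): a handful of heavily-imported barrels among thousands of leaf files leaves p95 in the single digits while p99 clears the trigger:

```javascript
// 10,000 files: 9,850 leaf/util files with fan-in 0..3, 150 hub barrels.
const fanIn = [
  ...Array.from({ length: 9_850 }, (_, i) => i % 4), // barely-imported leaves
  ...Array(150).fill(120),                           // the actual hubs
].sort((a, b) => a - b);

const pct = (q) => fanIn[Math.floor(q * fanIn.length)];
const p95 = pct(0.95);  // 3   -> "(p95_fan_in - 30)" penalty is 0
const p99 = pct(0.99);  // 120 -> a p99 trigger with the same "- 30" would fire
console.log({ p95, p99 }); // { p95: 3, p99: 120 }
```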

unused_deps & circular_deps (saturated) — vital_signs.rs:343-356

  • min(count, 10) for both. unused_dep_count exceeds the cap by an order of magnitude; circular_dep_count by well over an order of magnitude. Counts grow ~linearly with workspace package count; the cap was reasonable for a single-package project but is a no-op in any monorepo. Recommended replacement: per-1k-files density.
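The difference between the cap and a density can be sketched directly (hypothetical counts; the density formula is the per-1k-files replacement proposed in the table below, count / 1k files × 0.5, cap 25):

```javascript
// Current aggregator vs. the proposed per-1k-files density (hypothetical counts).
const capPenalty = (count) => Math.min(count, 10);                 // current
const densityPenalty = (count, files) =>
  Math.min((count / (files / 1000)) * 0.5, 25);                    // proposed

// Current cap: 11 cycles and 450 cycles are indistinguishable.
console.log(capPenalty(11), capPenalty(450));   // 10 10

// Density: the same bad-code density scores the same at any repo size...
console.log(densityPenalty(5, 1_000));          // 2.5
console.log(densityPenalty(500, 100_000));      // 2.5
// ...while 450 cycles across 25k files is still visibly worse.
console.log(densityPenalty(450, 25_000));       // 9
```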

Recommended fix: scale-invariant aggregations as the new default

A metric should ask "what fraction of your code is bad?" — not "are you big enough to dilute the bad code below a threshold?"

| Dimension | Current scale-blind aggregator | Scale-invariant replacement |
| --- | --- | --- |
| complexity | avg_cyclomatic (mean over all functions) | count(cc ≥ critical) / 1k functions |
| p90_complexity | p90_cyclomatic > 10 | (subsumed by complexity tail metric — drop) |
| maintainability | mean(MI) < 70 | % of files with MI < 70 |
| hotspots | count(score ≥ 50) / total_files × 200 | top 1 % of within-project hotspot ranking / total_files × 200 |
| unit_size | % of functions > 60 LOC, trigger > 5 % | count(functions > 60 LOC) / 1k functions |
| coupling | p95_fan_in − 30 | coupling_high_pct (already computed) |
| unused_deps | min(count, 10) | count / 1k files × 0.5, cap 25 |
| circular_deps | min(count, 10) | count / 1k files × 0.5, cap 25 |

Every replacement is scale-invariant by construction — bigger codebases neither get a leniency dividend nor a size penalty. A 1K-file project and a 100K-file project with the same density of bad code score identically.

A small project (e.g. 1K files, single-digit unused deps, 1-2 cycles) sees a small improvement under the new densities, not a regression — density-based aggregators are simultaneously small-project-friendly and monorepo-honest.

Fallback ask (if changing defaults is too invasive)

If shipping these as new defaults moves every existing user's grade, the minimum useful change is to expose the scale-invariant primitives as new vital_signs fields alongside the existing scale-blind ones, so dashboards and CI gates can compute honest scores externally:

vital_signs.functions_above_critical_cc_per_k   // replaces avg_cyc + p90_cyc
vital_signs.functions_above_60_loc_per_k        // replaces unit_size very_high_risk
vital_signs.maintainability_pct_below_70        // replaces maintainability_avg
vital_signs.hotspots_top_pct_count              // replaces hotspot_count
vital_signs.unused_deps_per_k_files             // replaces saturated unused_dep_count
vital_signs.circular_deps_per_k_files           // replaces saturated circular_dep_count

This moves no existing grade and lets large monorepos compute honest scores externally. It is strictly worse than fixing the defaults (fallow's own health_score would still report B when the data says D), but it is the smallest useful change.

Configurability audit (none of this is tunable today)

HealthConfig: the only score-relevant knob is health.ignore (denominator filter). All seven broken-dimension constants are hardcoded:

| Constant | Location | Value |
| --- | --- | --- |
| HOTSPOT_SCORE_THRESHOLD | scores.rs:4 | 50.0 |
| MI_DENSITY_MIN_LINES | scores.rs:24 | 50.0 |
| Per-dimension caps + floors | vital_signs.rs:295-404 | inline |
| Aggregator choice (mean / p90 / p95) | vital_signs.rs:69-288 | inline |
| HALF_LIFE_DAYS (hotspot half-life) | crates/core/src/churn.rs | 90 |

HealthConfig.maxCyclomatic / maxCognitive / maxCrap only affect finding emission, not the score — confirmed in compute_health_score, which never reads them. CLI flags --since / --min-commits widen the hotspot window but don't affect HOTSPOT_SCORE_THRESHOLD or the max-norm. No .fallowrc.json or CLI combination can move this score from B to its honest grade — scale-blindness lives in source-level constants.

Related: upstream signal defects

Two broken dimensions have defects in the upstream signal, not just in how the score consumes them. Even with compute_health_score() fixed, these will remain silent until the upstream signal is also addressed. Happy to file as companion issues.

  1. Hotspot scoring algorithm has a structural ceiling well below the threshold. compute_hotspot_score returns (weighted_commits / max_weighted) × (complexity_density / max_density) × 100. To reach 100 (or even 50), one file must be both max-churned and max-density. In real codebases the top-churned file has moderate density and vice-versa, so the product is structurally bounded well below the HOTSPOT_SCORE_THRESHOLD = 50.0 filter at vital_signs.rs:131. On any sufficiently large repo, top-ranked hotspots reach only a fraction of 50 → hotspot_count is always 0. A percentile-based filter ("files in the top 1 % of the within-project hotspot ranking") would survive max-norm compression.

  2. MI per-file formula's small-file dampening pushes most files to MI ≥ 70. compute_maintainability_index is 100 − density × 30 × dampening − dead_ratio × 20 − min(ln1p(fan_out) × 4, 15) where dampening = min(lines / MI_DENSITY_MIN_LINES, 1.0). Files under 50 LOC (barrels, models, utility) get density damped toward 0, pinning their MI near 100 regardless of internal complexity. Result: well over 98 % of scored files end up with MI ≥ 70 on any TS-heavy codebase. Fixing the score-formula aggregator alone helps but per-file MI is still inflated.
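The dampening effect can be checked against the formula as quoted (a direct transcription of the expression above; the inputs are hypothetical, with dead_ratio and fan_out set to 0 to isolate the density term):

```javascript
// MI formula as quoted: 100 - density*30*dampening - dead_ratio*20
//   - min(ln1p(fan_out)*4, 15), with dampening = min(lines / 50, 1.0).
const MI_DENSITY_MIN_LINES = 50;
const mi = (density, lines, deadRatio = 0, fanOut = 0) => {
  const dampening = Math.min(lines / MI_DENSITY_MIN_LINES, 1.0);
  return 100 - density * 30 * dampening - deadRatio * 20
    - Math.min(Math.log1p(fanOut) * 4, 15);
};

// A 10-line file with pathological complexity density 2.0 stays "healthy":
console.log(mi(2.0, 10).toFixed(1));   // 88.0  (dampening 0.2 -> MI >= 70)
// The same density in a 100-line file drops well below the trigger:
console.log(mi(2.0, 100).toFixed(1));  // 40.0  (dampening 1.0)
```

Since most files in a TS-heavy codebase are under 50 LOC, the dampening term pins their MI near 100 and drags the mean with it, which is the mechanism behind the silent maintainability dimension.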

Why this matters

scores.rs:26-49 describes health_score as a comprehensible 0–100 summary suitable for dashboards and CI gates. With 5/11 dimensions silently 0 and 2/11 saturated, the score is structurally unable to communicate "really, really bad" for any sufficiently large project. The underlying data is excellent; the problem is in how the score formula aggregates it.

Reproduction

The bug is deterministic in the formula — given inputs in the shape produced by any large TS monorepo, compute_health_score() returns a B-band score with five 0.0 penalties and two saturated 10.0 penalties. No real codebase required.

Easiest: drop a unit test into fallow's own test suite

Following the existing pattern in vital_signs.rs:1135+ (health_score_perfect, etc.):

#[test]
fn health_score_silent_and_saturated_at_monorepo_scale() {
    // Inputs in the shape produced by any large multi-package TS monorepo.
    // Small perturbations don't change the qualitative result.
    let total_files: usize = 25_000;
    let vs = VitalSigns {
        // honest dimensions
        dead_file_pct:      Some(4.0),
        dead_export_pct:    Some(9.0),
        avg_cyclomatic:     2.3,
        duplication_pct:    Some(6.0),

        // silent dimensions — every value is "long-tail-hidden"
        p90_cyclomatic:     4,
        maintainability_avg:Some(91.0),  // mean dominated by small files
        hotspot_count:      Some(0),     // none cross HOTSPOT_SCORE_THRESHOLD = 50
        unit_size_profile:  Some(RiskProfile { very_high_risk: 2.3, ..Default::default() }),
        p95_fan_in:         Some(7),

        // saturated dimensions — counts grow with workspace package count
        unused_dep_count:   Some(180),
        circular_dep_count: Some(450),
        ..Default::default()
    };
    let score = compute_health_score(&vs, total_files);
    let p = &score.penalties;

    // 4 honest dimensions
    assert!(p.dead_files.unwrap()   > 0.0 && p.dead_files.unwrap()   < 5.0);
    assert!(p.dead_exports.unwrap() > 0.0 && p.dead_exports.unwrap() < 5.0);
    assert!(p.complexity            > 0.0 && p.complexity            < 10.0);
    assert!(p.duplication.unwrap()  > 0.0 && p.duplication.unwrap()  < 5.0);

    // 5 silent dimensions
    assert_eq!(p.p90_complexity,            0.0);
    assert_eq!(p.maintainability.unwrap(),  0.0);
    assert_eq!(p.hotspots.unwrap(),         0.0);
    assert_eq!(p.unit_size.unwrap(),        0.0);
    assert_eq!(p.coupling.unwrap(),         0.0);

    // 2 saturated dimensions
    assert_eq!(p.unused_deps.unwrap(),      10.0);
    assert_eq!(p.circular_deps.unwrap(),    10.0);

    assert_eq!(score.grade, "B");
}

Self-contained, runs in milliseconds. Same test with the recommended scale-invariant aggregators should drop the score by roughly one and a half letter grades (B → D).

End-to-end: synthetic monorepo generator

The script below produces a fully synthetic TS workspace whose vital_signs reproduces the broken-dimension pattern end-to-end. Defaults generate ~21K files in ~3.5 min (mostly git churn); --commits-per-fat-file=2 runs in under a minute with the same pattern. Smaller --packages / --files-per-pkg produce the partial pattern (3-4 silent dimensions).

node generate-monorepo.mjs ./repro                      # defaults: 80 pkgs × 250 files
cd ./repro && fallow health --hotspots --score --format json --quiet \
  | jq '(.health.health_score | {score, grade, penalties}), .health.vital_signs'

Expected at defaults: score in the C band (~65), 5 of 11 penalties at 0.0 (p90_complexity, maintainability, unit_size, coupling, plus dead_files / dead_exports since synthetic data has no deads), 2 saturated at 10.0 (unused_deps, circular_deps). Bumping --fat-fns-per-pkg past 5 silences hotspots and lifts the score into B.

generate-monorepo.mjs:
#!/usr/bin/env node
// Reproduces fallow's health_score scale-blindness pattern (5 silent + 2 saturated).
// Usage: node generate-monorepo.mjs <out-dir> [--packages=80] [--files-per-pkg=250]
//        [--fat-fns-per-pkg=5] [--cycles-per-pkg=6] [--unused-deps-per-pkg=3]
//        [--commits-per-fat-file=8]

import { mkdirSync, writeFileSync, existsSync, rmSync } from 'node:fs';
import { execSync } from 'node:child_process';
import { join } from 'node:path';

const args = Object.fromEntries(process.argv.slice(2).filter(a => a.startsWith('--'))
  .map(a => { const [k, v] = a.replace(/^--/, '').split('='); return [k, v ?? true]; }));
const outDir = process.argv.find((a, i) => i > 1 && !a.startsWith('--')) ?? './repro';
const PACKAGES            = Number(args.packages              ?? 80);
const FILES_PER_PKG       = Number(args['files-per-pkg']      ?? 250);
const FAT_FNS_PER_PKG     = Number(args['fat-fns-per-pkg']    ?? 5);
const CYCLES_PER_PKG      = Number(args['cycles-per-pkg']     ?? 6);
const UNUSED_DEPS_PER_PKG = Number(args['unused-deps-per-pkg']?? 3);
const COMMITS_PER_FAT     = Number(args['commits-per-fat-file'] ?? 8);

if (existsSync(outDir)) rmSync(outDir, { recursive: true, force: true });
mkdirSync(outDir, { recursive: true });

writeFileSync(join(outDir, 'package.json'), JSON.stringify({
  name: 'fallow-repro', private: true,
  workspaces: Array.from({ length: PACKAGES }, (_, i) => `packages/pkg-${i}`),
}, null, 2));
writeFileSync(join(outDir, 'tsconfig.json'), JSON.stringify({
  compilerOptions: { target: 'ES2022', module: 'ESNext', moduleResolution: 'bundler', strict: true, skipLibCheck: true },
}, null, 2));

// Trivial file = 1 trivial fn (1-CC). Drives p90_cyc mean, very_high_risk %, MI mean.
const trivial = (p, i) =>
  `// pkg-${p} v${i}\nexport function get_v${i}_${p}(): number { return ${i} + ${p}; }\nexport const v${i}_${p} = ${i * (p + 1)};\n`;

// Fat file = 1 nested-switch fn → high CC, > 60 LOC.
const fat = (p, idx) => {
  const branches = Array.from({ length: 12 }, (_, b) => `    case ${b}: { switch (mode) {
      case 'a': return ${b} * 2 + ${p}; case 'b': return ${b} + 1 - ${idx};
      case 'c': return ${b} - 1 * ${p}; case 'd': return ${b} ** 2 + ${idx};
      default: return ${b} + ${p}; } }`).join('\n');
  return `export function fatFn_${p}_${idx}(input: number, mode: 'a'|'b'|'c'|'d'): number {
  switch (input) {\n${branches}\n    default: return input + ${p};\n  }\n}\n`;
};

// Cycle pair = two intra-package files importing each other. One pair = one cycle.
const cycA = (p, i) => `import { b_${p}_${i} } from './cycle-${i}-b';\nexport const a_${p}_${i} = b_${p}_${i} + ${p};\n`;
const cycB = (p, i) => `import { a_${p}_${i} } from './cycle-${i}-a';\nexport const b_${p}_${i} = a_${p}_${i} + ${i};\n`;

const barrel = (p) => {
  const L = [];
  for (let i = 0; i < FILES_PER_PKG; i++)   L.push(`export * from './v${i}';`);
  for (let f = 0; f < FAT_FNS_PER_PKG; f++) L.push(`export * from './fat-${f}';`);
  for (let c = 0; c < CYCLES_PER_PKG; c++)  { L.push(`export * from './cycle-${c}-a';`); L.push(`export * from './cycle-${c}-b';`); }
  return L.join('\n') + '\n';
};

// Pool of public packages claimed as deps but never imported.
const POOL = ['lodash', 'rxjs', 'date-fns', 'uuid', 'chalk', 'yargs', 'minimist', 'zod', 'axios', 'commander'];

for (let p = 0; p < PACKAGES; p++) {
  const dir = join(outDir, 'packages', `pkg-${p}`);
  mkdirSync(join(dir, 'src'), { recursive: true });
  const devDeps = {};
  for (let u = 0; u < UNUSED_DEPS_PER_PKG; u++) devDeps[POOL[(p + u) % POOL.length]] = '*';
  writeFileSync(join(dir, 'package.json'), JSON.stringify({
    name: `pkg-${p}`, version: '0.0.0', main: './src/barrel.ts', types: './src/barrel.ts',
    devDependencies: devDeps,
  }, null, 2));
  for (let f = 0; f < FILES_PER_PKG;   f++) writeFileSync(join(dir, 'src', `v${f}.ts`),    trivial(p, f));
  for (let f = 0; f < FAT_FNS_PER_PKG; f++) writeFileSync(join(dir, 'src', `fat-${f}.ts`), fat(p, f));
  for (let c = 0; c < CYCLES_PER_PKG;  c++) {
    writeFileSync(join(dir, 'src', `cycle-${c}-a.ts`), cycA(p, c));
    writeFileSync(join(dir, 'src', `cycle-${c}-b.ts`), cycB(p, c));
  }
  writeFileSync(join(dir, 'src', 'barrel.ts'), barrel(p));
}

// Hotspots need git history: commit-burst on each fat file.
const sh = (cmd) => execSync(cmd, { cwd: outDir, stdio: ['ignore', 'ignore', 'inherit'] });
sh('git init -q -b main && git config user.email s@s && git config user.name s && git add . && git -c commit.gpgsign=false commit -q -m init');
let date = new Date('2024-01-01T00:00:00Z').getTime();
for (let p = 0; p < PACKAGES; p++) for (let f = 0; f < FAT_FNS_PER_PKG; f++) {
  const path = `packages/pkg-${p}/src/fat-${f}.ts`;
  for (let c = 0; c < COMMITS_PER_FAT; c++) {
    sh(`printf '\\n// tweak ${c}\\n' >> "${path}"`);
    sh(`git -c commit.gpgsign=false -c user.email=s@s -c user.name=s commit -q --allow-empty-message --date="${new Date(date).toISOString()}" -am tweak`);
    date += 6 * 60 * 60 * 1000 + Math.floor(Math.random() * 6 * 60 * 60 * 1000);
  }
}
console.log(`Done. cd ${outDir} && fallow health --hotspots --score --format json --quiet | jq '.health.health_score'`);

Knob → score-formula-input mapping:

| Knob | Drives |
| --- | --- |
| --packages | Workspace package count → unused_deps & circular_deps saturation |
| --files-per-pkg | Total file count → silences unit_size, maintainability, hotspots |
| --fat-fns-per-pkg | Fat-function tail (invisible to mean / p90 / fixed-percent) |
| --cycles-per-pkg | Intra-package cycle count → circular_dep_count |
| --unused-deps-per-pkg | unused_dep_count per package |
| --commits-per-fat-file | Hotspot churn distribution |

Optional: against a real codebase

cd <large-ts-monorepo>
git fetch --unshallow      # so hotspots have a real distribution (default --since 6m)
fallow health --hotspots --score --format json --quiet \
  | jq '.health.health_score.penalties, .health.vital_signs'

Look for: dimensions reporting 0.0 in .penalties paired with non-zero "bulk" inputs in .vital_signs (p90_cyclomatic > 0, maintainability_avg > 0, unit_size_profile.very_high_risk > 0, p95_fan_in > 0), plus unused_deps / circular_deps pegged at 10.

Expected behavior

The formula behaves exactly as written; that is the case for changing it. At large scale the calibration produces structurally false signal: not "wrong by a few points" but "5 of 11 dimensions cannot fire under any input distribution this codebase shape will produce", and "11 vs 1,000 unused deps score identically".

health_score should:

  1. Fire on a codebase containing thousands of fat functions, a measurable absolute count of files with MI < 70, and thousands of ranked hotspots.
  2. Differentiate small vs medium vs large vs catastrophic dep / cycle counts rather than collapsing them all to 10 pts.
  3. Produce different letter grades for repos with order-of-magnitude differences in bad-code volume.

Fallow version

fallow 2.62.0

Operating system

macOS

Configuration

default
