
Comparing changes

base repository: fallow-rs/fallow
base: v2.58.0
head repository: fallow-rs/fallow
compare: v2.59.0
  • 7 commits
  • 57 files changed
  • 1 contributor

Commits on May 1, 2026

  1. bf864c3
  2. chore: consolidate worktree rules and project communication into CLAUDE.md
    
    Add three sections to project CLAUDE.md, lifted from feedback memories that
    were duplicating the same rules across multiple files:
    
    - Project communication: full-scope messaging, honest comparisons, design
      spec purity (3 rules)
    - Repo layout: fallow-2/fallow split, .internal/ symlink target, vendored
      npm/fallow/skills/, gh repo flag, fallow vs fallow check (6 rules)
    - Worktree / parallel-agent rules: commit WIP early, verify authors, no
      push-through-dirty-worktree, combined.rs contention, fmt after
      cherry-pick, conflict markers, cargo cache, missing commits (7 rules)
    
    Memories archived under ~/.claude/projects/.../memory/_archive/.
    BartWaardenburg committed May 1, 2026
    5b22d28
  3. feat(dupes): persistent token cache, focused-mode shingle prefilter, audit base-snapshot skip
    
    - Add per-project token cache at .fallow/cache/dupes-tokens-v2/, threshold-gated
      by duplicates.minCorpusSizeForTokenCache (default 5000). Bitcode-encoded with
      TokenKind round-trip; auto-writes .gitignore on save.
    - Add k-token shingle prefilter for focused-mode runs (audit, --changed-since
      dupes), threshold-gated by duplicates.minCorpusSizeForShingleFilter (default
      1024). Drops unchanged files whose shingles do not overlap any focused file
      before suffix array construction.
    - Add audit base-snapshot fast path: when every changed file is either a
      non-behavioral doc or token-equivalent at the base ref, reuse current run's
      results as the base snapshot and skip the second worktree analysis. Surfaced
      via base_snapshot_skipped under --performance.
    - Bump CACHE_VERSION; persist TokenKind losslessly via bitcode derive.
    - Tighten is_fallow_cache_artifact to use opts.root/.fallow with canonical
      fallback so symlinked-tempdir tests stay correct.
    
    Resolves #243 perf direction.
    BartWaardenburg committed May 1, 2026
    3a6c407
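The focused-mode shingle prefilter described in this commit can be sketched roughly as follows. This is a minimal illustration, not fallow's implementation: the function names `shingles` and `prefilter`, the use of `u32` token IDs, and hashing with `DefaultHasher` are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash every window of `k` consecutive tokens (hypothetical helper).
/// Files shorter than `k` tokens produce no shingles.
fn shingles(tokens: &[u32], k: usize) -> HashSet<u64> {
    if k == 0 {
        // `windows(0)` panics, so treat k = 0 as "no shingles".
        return HashSet::new();
    }
    tokens
        .windows(k)
        .map(|window| {
            let mut hasher = DefaultHasher::new();
            window.hash(&mut hasher);
            hasher.finish()
        })
        .collect()
}

/// Return the indices of unchanged files that share at least one k-token
/// shingle with any focused file. Every other unchanged file can be dropped
/// before the expensive clone-detection pass.
fn prefilter(focused: &[Vec<u32>], unchanged: &[Vec<u32>], k: usize) -> Vec<usize> {
    let focus_set: HashSet<u64> = focused
        .iter()
        .flat_map(|tokens| shingles(tokens, k))
        .collect();

    unchanged
        .iter()
        .enumerate()
        .filter(|(_, tokens)| shingles(tokens, k).iter().any(|s| focus_set.contains(s)))
        .map(|(index, _)| index)
        .collect()
}
```

A hash collision can only cause an extra file to be kept, never a relevant file to be dropped, which is why a cheap hash is acceptable for a prefilter of this kind.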
  4. perf(dupes): wire --changed-since to focused fast path; coalesce IntervalIndex inserts
    
    `fallow dupes --changed-since` was running a full-corpus suffix-array scan and
    post-filtering instead of engaging the focused-mode fast path that already
    existed for `audit`. The standalone CLI hardcoded `changed_files: None` so
    `find_duplicates_touching_files` was never reached. Resolve `--changed-since`
    to a concrete file set up front and pass it through; the existing post-filter
    becomes a no-op safety net.
    
    `IntervalIndex::insert` claimed an "ascending start order per slot" invariant
    that neither caller satisfied: both `remove_token_subsets` and
    `remove_line_subsets` process groups in length-/token-DESC order, so per-slot
    inserts arrived in arbitrary offset order. The single-prev-merge logic
    fragmented intervals and produced false negatives in `is_covered`, keeping
    more groups than necessary and bloating the slot vecs. Replace with a
    coalesce-on-insert that merges every existing interval touching or
    overlapping the new range.
    
    Also harden `path_to_idx` indexing in `remove_line_subsets` to use `.get()`
    with a `tracing::error!` skip path, removing an unconditional panic site
    flagged in the issue (no repro available, but the indexing was unsafe).
    
    Add per-step `tracing::debug!` breakdown inside `build_groups` so future perf
    work has subsecond-level visibility into where time goes.
    
    Measured on MUI master (16k tokenized files, 3.2M tokens, 639k raw groups):
    
      fallow dupes --no-cache:
        before: 41.4s total, build_groups 31.5s
        after:  17.8s total, build_groups 8.1s
        token_subset_us: 24.5s -> 0.94s (-26x)
      fallow dupes --changed-since HEAD~5:
        before: 42.8s; after: 1.3s (-33x)
      fallow audit --base HEAD~5 duplication step:
        before: 0.95s; after: 0.63s
      fallow dupes --no-cache on next.js fixture:
        before: 2.66s; after: 2.20s
    
    Output equivalence: MUI full-corpus dropped from 329,165 to 329,163
    duplicated lines (12 groups removed). Inspection confirms the previous
    fragmentation was masking a small number of legitimate line-level subsets.
    The 3 extra groups that focused-mode now finds for `--changed-since` match
    what `audit` already produces and are real clones touching changed files
    that the old post-filter pass was hiding via cross-corpus subsumption.
    
    Refs #243.
    BartWaardenburg committed May 1, 2026
    b864bb9
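The coalesce-on-insert behaviour described for `IntervalIndex::insert` can be illustrated with a small standalone sketch. The `IntervalSet` type below is hypothetical and models a single slot with half-open `[start, end)` ranges; the real `IntervalIndex` layout is not shown in the commit, so treat the field and method shapes as assumptions.

```rust
/// One slot of sorted, non-overlapping, non-touching half-open intervals.
/// Hypothetical stand-in for a single `IntervalIndex` slot.
#[derive(Debug, Default)]
struct IntervalSet {
    intervals: Vec<(usize, usize)>,
}

impl IntervalSet {
    /// Insert `[start, end)` in any order, merging every stored interval
    /// that overlaps or touches the new range.
    fn insert(&mut self, start: usize, end: usize) {
        if start >= end {
            return;
        }
        // First stored interval whose end reaches the new start.
        let lo = self.intervals.partition_point(|&(_, e)| e < start);
        // First stored interval that starts strictly after the new end.
        let hi = self.intervals.partition_point(|&(s, _)| s <= end);

        let mut merged = (start, end);
        if lo < hi {
            merged.0 = merged.0.min(self.intervals[lo].0);
            merged.1 = merged.1.max(self.intervals[hi - 1].1);
        }
        // Replace the whole touched run with the single merged interval.
        self.intervals.drain(lo..hi);
        self.intervals.insert(lo, merged);
    }

    /// True when `[start, end)` lies entirely inside one stored interval.
    fn is_covered(&self, start: usize, end: usize) -> bool {
        if start >= end {
            return true;
        }
        let i = self.intervals.partition_point(|&(_, e)| e < end);
        self.intervals.get(i).is_some_and(|&(s, _)| s <= start)
    }
}
```

Because every insert leaves the slot fully coalesced, `is_covered` only has to inspect one candidate interval, and out-of-order inserts such as `(10, 20)`, `(0, 5)`, `(5, 10)` collapse to a single `(0, 20)` entry rather than fragmenting.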
  5. refactor(cli): share rayon global pool config across CLI and NAPI entry points
    
    Extract `rayon::ThreadPoolBuilder::build_global()` into a single
    `rayon_pool::configure_global_pool(threads)` helper that pins worker
    stack size to 16 MiB (deep visitor and graph traversals overflow Rust's
    default 2 MiB worker stack on large real-world projects). Apply it from
    both `main.rs::validate_inputs` and `programmatic.rs::AnalysisOptions`,
    so embedded NAPI consumers get the same thread count and stack size as
    the CLI rather than inheriting Rayon's defaults.
    
    Includes a stack-probe regression test that recurses 5,000 frames in a
    worker to assert the pinned stack size holds.
    BartWaardenburg committed May 1, 2026
    58ba403
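A stack-probe test of the kind mentioned in this commit might look roughly like the sketch below. To stay dependency-free it uses `std::thread::Builder::stack_size` instead of Rayon; with Rayon the corresponding knob is `ThreadPoolBuilder::stack_size`. The names `probe` and `run_probe` and the 512-byte per-frame buffer are assumptions for illustration, not the project's actual test.

```rust
use std::thread;

/// Stack size pinned for worker threads, matching the 16 MiB in the commit.
const WORKER_STACK_BYTES: usize = 16 * 1024 * 1024;

/// Recurse `depth` frames. Each frame materialises a 512-byte buffer so the
/// recursion consumes a meaningful amount of stack. Returns `depth`.
fn probe(depth: usize) -> usize {
    let pad = std::hint::black_box([1u8; 512]);
    if depth == 0 {
        0
    } else {
        // The addition happens after the recursive call returns, so this
        // is not a tail call and every frame stays live.
        usize::from(pad[depth % 512]) + probe(depth - 1)
    }
}

/// Run the probe on a thread with the pinned stack size (hypothetical helper).
fn run_probe(depth: usize) -> usize {
    thread::Builder::new()
        .stack_size(WORKER_STACK_BYTES)
        .spawn(move || probe(depth))
        .expect("failed to spawn probe thread")
        .join()
        .expect("probe thread panicked")
}
```

Calling `run_probe(5_000)` should return `5_000` when the stack is large enough. A genuine stack overflow aborts the process rather than unwinding, so a regression shows up as a crashed test run, not a failed assertion.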
  6. feat(dupes): default-ignore generated framework output (.next, .nuxt, .turbo, ...)
    
    fallow dupes now skips **/.next/**, **/.nuxt/**, **/.svelte-kit/**, **/.turbo/**, **/.parcel-cache/**, **/.vite/**, **/.cache/**, **/out/**, and **/storybook-static/** before tokenization. Authored-looking lib/, legacy/, and nested build/ directories stay in scope. Defaults merge with duplicates.ignore; set duplicates.ignoreDefaults: false to opt out.
    
    Human and markdown output show a one-line skipped-file note; --explain-skipped expands to per-pattern counts. JSON, SARIF, CodeClimate, and compact output stay unchanged.
    
    fallow init now scaffolds a commented-out [duplicates] block with common ignore additions (lib/, legacy/, __generated__/, generated/), and the JSON variant now parses as JSONC end-to-end.
    BartWaardenburg committed May 1, 2026
    c0f780e
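The way the default ignore patterns combine with `duplicates.ignore` and `duplicates.ignoreDefaults` could be modelled as below. The pattern list is copied from the commit message; the function name `effective_ignores`, its signature, and the de-duplication of repeated patterns are assumptions for this sketch, not fallow's actual code.

```rust
/// Default ignore globs listed in the commit message.
const DEFAULT_IGNORES: &[&str] = &[
    "**/.next/**",
    "**/.nuxt/**",
    "**/.svelte-kit/**",
    "**/.turbo/**",
    "**/.parcel-cache/**",
    "**/.vite/**",
    "**/.cache/**",
    "**/out/**",
    "**/storybook-static/**",
];

/// Merge user-supplied ignore globs with the defaults (hypothetical helper).
/// `ignore_defaults` mirrors `duplicates.ignoreDefaults`; when it is `false`
/// only the user's patterns apply. Dropping duplicates is an assumption.
fn effective_ignores(user_ignore: &[&str], ignore_defaults: bool) -> Vec<String> {
    let mut patterns: Vec<String> = Vec::new();
    if ignore_defaults {
        patterns.extend(DEFAULT_IGNORES.iter().map(|p| p.to_string()));
    }
    for pattern in user_ignore {
        if !patterns.iter().any(|existing| existing.as_str() == *pattern) {
            patterns.push(pattern.to_string());
        }
    }
    patterns
}
```

With defaults enabled, adding `**/lib/**` yields the nine defaults plus the user's pattern; with `ignore_defaults` set to `false`, only `**/lib/**` remains.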
  7. d106438