Improve ignored file checking performance with in-memory cache #4814

@eamodio

Summary

Improves performance of ignored file checking by replacing shell-based git check-ignore execution with an in-memory cache, significantly reducing overhead when determining whether files are ignored by Git.

Impact

  • Dramatically reduces overhead for ignored file checks (from process spawn to cache lookup)
  • Improves responsiveness when working with large file sets
  • Reduces CPU usage and eliminates repeated Git process spawns
  • Benefits all features that check ignore status: file diffs, working directory changes, staging operations, etc.

Technical Details

Before: Each ignored-file check spawned a git check-ignore process via the shell
After: Checks are served from an in-memory cache with an invalidation strategy

Implementation

  • Cache stores gitignore patterns per repository
  • Invalidates cache on .gitignore file changes
  • Falls back to the Git command on a cache miss or after invalidation
  • Handles nested gitignore files correctly
  • Supports repository-level .git/info/exclude patterns
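
The cache, invalidation, and fallback behavior listed above could be sketched roughly as follows. This is a minimal illustration, not the GitLens implementation: the class name `IgnoreCache`, the `IgnoreChecker` type, and the injected `fallback` function are all hypothetical, and the sketch caches per-file results rather than parsed gitignore patterns to stay short. It is also synchronous for simplicity, where a real extension would more likely be asynchronous.

```typescript
// Hypothetical names throughout; this is a sketch, not the GitLens API.
// The fallback stands in for whatever runs `git check-ignore`.
type IgnoreChecker = (repoPath: string, filePath: string) => boolean;

class IgnoreCache {
  // repoPath -> (filePath -> is the file ignored?)
  private readonly results = new Map<string, Map<string, boolean>>();
  private readonly fallback: IgnoreChecker;

  constructor(fallback: IgnoreChecker) {
    this.fallback = fallback;
  }

  isIgnored(repoPath: string, filePath: string): boolean {
    let repoCache = this.results.get(repoPath);
    if (repoCache === undefined) {
      repoCache = new Map<string, boolean>();
      this.results.set(repoPath, repoCache);
    }

    const cached = repoCache.get(filePath);
    if (cached !== undefined) return cached; // cache hit: no process spawn

    // Cache miss (or first lookup after invalidation): ask Git, then remember.
    const ignored = this.fallback(repoPath, filePath);
    repoCache.set(filePath, ignored);
    return ignored;
  }

  // Call when a .gitignore or .git/info/exclude file in the repository changes.
  invalidate(repoPath: string): void {
    this.results.delete(repoPath);
  }
}
```

In this sketch a file watcher on the ignore files would call `invalidate(repoPath)`, so the next lookup for that repository goes back to Git. Dropping the whole per-repository map is coarse but makes it harder to miss a nested .gitignore change.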

Validation

Functional Testing

  1. Open repository with complex .gitignore (100+ patterns)
  2. Create 50 new files (some matching ignore patterns, some not)
  3. Verify GitLens correctly identifies ignored vs. non-ignored files
  4. Modify .gitignore → verify cache invalidates and new patterns apply
  5. Test nested .gitignore files in subdirectories
  6. Test .git/info/exclude patterns

Performance Testing

  1. Benchmark time to check 1000 files for ignore status:
    • Before: ~X seconds (shell spawn overhead)
    • After: <100ms (cache lookup)
  2. Monitor CPU usage during large file operations
  3. Test with repository containing 100,000+ files
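
A small harness along these lines could drive the 1000-file benchmark in step 1. It is only a sketch: `benchmarkIgnoreChecks`, `makePaths`, and the synthetic path layout are hypothetical, and the `isIgnored` callback stands in for either the old shell-based check or the new cache lookup.

```typescript
// Hypothetical benchmark helpers; none of these names come from GitLens.
interface BenchmarkResult {
  files: number;
  ignored: number;
  elapsedMs: number;
}

// Build n synthetic paths; every tenth one looks like a log file.
function makePaths(n: number): string[] {
  const paths: string[] = [];
  for (let i = 0; i < n; i++) {
    paths.push(i % 10 === 0 ? `out/file${i}.log` : `src/file${i}.ts`);
  }
  return paths;
}

// Time how long it takes to check every path with the supplied checker.
function benchmarkIgnoreChecks(
  paths: string[],
  isIgnored: (path: string) => boolean,
): BenchmarkResult {
  const start = Date.now();
  let ignored = 0;
  for (const path of paths) {
    if (isIgnored(path)) ignored++;
  }
  return { files: paths.length, ignored, elapsedMs: Date.now() - start };
}
```

Running the same path list through both implementations and comparing `elapsedMs` gives the before and after numbers. The `ignored` count doubles as a quick parity check between the two.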

Edge Cases

  1. Empty .gitignore → verify cache works
  2. No .gitignore present → verify graceful handling
  3. .gitignore with invalid syntax → verify error handling
  4. Concurrent ignore checks → verify cache thread-safety
  5. Repository with multiple worktrees → verify cache per-worktree
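
For cases 4 and 5, one possible approach is to share a single in-flight check between concurrent callers and to include the worktree path in the key. The sketch below is an assumption about how that might look, not a description of the actual change; `InFlightDeduper` and `makeKey` are hypothetical names.

```typescript
// Hypothetical helper; illustrative only, not a GitLens API.
class InFlightDeduper<T> {
  private readonly pending = new Map<string, Promise<T>>();

  // Run `work` for `key`, or join the call that is already running for it.
  run(key: string, work: () => Promise<T>): Promise<T> {
    const existing = this.pending.get(key);
    if (existing !== undefined) return existing;

    const promise = work();
    this.pending.set(key, promise);

    // Clear the entry once settled, but only if it still belongs to this call.
    const clear = (): void => {
      if (this.pending.get(key) === promise) this.pending.delete(key);
    };
    promise.then(clear, clear);
    return promise;
  }
}

// Keying by worktree path keeps each worktree's results separate.
function makeKey(worktreePath: string, filePath: string): string {
  return `${worktreePath}\0${filePath}`;
}
```

Two callers asking about the same file in the same worktree receive the same promise, so only one Git call is made. The same file in a different worktree gets its own entry.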

Risk

Low. This is a performance optimization with a clear fallback path. Potential concerns:

  • Cache invalidation correctness (missed .gitignore changes)
  • Memory usage with large sets of gitignore patterns
  • Race conditions with concurrent file operations
  • Behavior parity with git check-ignore

Follow Ups

  • Add metrics/telemetry to measure performance improvement
  • Explore further optimizations (pattern compilation, bloom filters)
  • Consider exposing cache statistics for debugging
  • Investigate pre-warming cache on repository open

Metadata

Labels

  • area-git: Issues or features related to using Git
  • needs-verification: Request for verification
  • pending-release: Resolved but not yet released to the stable edition
