Skip to content

Go snaggle each file in DIR#95

Merged
MusicalNinjaDad merged 19 commits into
mainfrom
go_snaggle
Nov 5, 2025
Merged

Go snaggle each file in DIR#95
MusicalNinjaDad merged 19 commits into
mainfrom
go_snaggle

Conversation

@MusicalNinjaDad

@MusicalNinjaDad MusicalNinjaDad commented Nov 5, 2025

Copy link
Copy Markdown
Owner

Summary by Sourcery

Make Snaggle concurrently process files with per-file locking, improve link error reporting, and add tests for link conflict scenarios

New Features:

  • Add a fileLocks mechanism to serialize concurrent link creation
  • Parallelize file processing in Snaggle using errgroup
  • Introduce TestLinkDifferentFile to validate link collision handling

Enhancements:

  • Wrap all link errors in fs.PathError to include operation and source path
  • Refactor Snaggle to accept a root directory and fileLocks, simplifying signature
  • Improve recursive directory scanning logic and error handling in Snaggle

Tests:

  • Enhance test logging to dump full error chains on failure
  • Add assertions in TestLinkDifferentFile for LinkError and SnaggleError details

Chores:

  • Update internal SameFile comment to mention lock waiting
  • Fix minor whitespace in copy.go

@sourcery-ai

sourcery-ai Bot commented Nov 5, 2025

Copy link
Copy Markdown

Reviewer's Guide

This PR refactors the core linking logic in snaggle.go to be safe under concurrent execution, improves error handling by wrapping all filesystem operations in typed errors, and overhauls the directory traversal to use errgroup for parallel file processing. It also introduces a locking mechanism to prevent duplicate link races and extends tests to cover link collision scenarios with richer error diagnostics.

Sequence diagram for concurrent-safe link creation

sequenceDiagram
    participant Snaggle
    participant errgroup
    participant snaggle
    participant link_
    participant fileLocks
    participant os
    Snaggle->>errgroup: Go(snaggle(path, ...))
    errgroup->>snaggle: run
    snaggle->>link_: link(path, targetDir, locks)
    link_->>fileLocks: lock(target)
    fileLocks->>fileLocks: acquire lock for target
    link_->>os: Link(sourcePath, target)
    link_->>fileLocks: unlock(target)
    fileLocks->>fileLocks: release lock for target
Loading

Class diagram for new fileLocks locking mechanism

classDiagram
    class fileLocks {
        +sync.RWMutex m
        +map<string, *sync.RWMutex> locks
        +lock(path string)
        +unlock(path string)
        +newFileLock()
    }
    fileLocks -- "*" sync.RWMutex : uses
    fileLocks -- "*" map : manages
    class sync.RWMutex
    class map
Loading

File-Level Changes

Change Details Files
Refactor link function for robust error handling
  • extended signature to accept fileLocks and track original path
  • introduced op variable to label each filesystem operation
  • wrapped all filesystem errors in fs.PathError with op and original path
  • unified logging and return flow for success and error cases
snaggle.go
Add fileLocks for concurrency-safe linking
  • defined fileLocks struct with per-path mutex map and global lock
  • implemented newFileLock(), lock(), unlock() for managing file-level locks
  • integrated locks into link() to serialize concurrent link operations
snaggle.go
Parallelize file processing in Snaggle using errgroup
  • introduced errgroup.Group in Snaggle() to schedule per-file tasks
  • created snagit wrapper to filter non-ELF errors and dispatch tasks
  • rewrote directory traversal to use WalkDir/ReadDir with errgroup.Go for both recursive and non-recursive modes
snaggle.go
Enhance snaggle helper for error wrapping and lock propagation
  • updated snaggle() signature to take root and locks
  • wrapped linkerrs.Wait() errors into SnaggleError for context
  • propagated locks parameter through all link() calls
  • simplified return flow to nil on success
snaggle.go
Extend tests for detailed error diagnostics and link collision cases
  • improved Test to log the full error chain when assertions fail
  • added TestLinkDifferentFile to validate fs.PathError and os.LinkError wrapping on collisions
snaggle_test.go
Update internal comparison doc comment
  • revised SameFile() comment to reflect lock-waiting behavior
internal/comparisons.go

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Remove the debug println calls in the fileLocks methods (or replace them with a proper logger) to avoid cluttering stdout.
  • Protect map reads in fileLocks.unlock under the same mutex as the writes (e.g. use fl.m.RLock) to avoid potential concurrent map access races.
  • Invoking errgroup.Wait() inside the WalkDir callback serializes the tasks; schedule all goroutines during the walk and call Wait() once after WalkDir returns to maximize concurrency.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Remove the debug `println` calls in the `fileLocks` methods (or replace them with a proper logger) to avoid cluttering stdout.
- Protect map reads in `fileLocks.unlock` under the same mutex as the writes (e.g. use `fl.m.RLock`) to avoid potential concurrent map access races.
- Invoking `errgroup.Wait()` inside the `WalkDir` callback serializes the tasks; schedule all goroutines during the walk and call `Wait()` once after `WalkDir` returns to maximize concurrency.

## Individual Comments

### Comment 1
<location> `snaggle.go:275` </location>
<code_context>
+
+// We need to lock a file while it is being copied. Othwerwise a second goroutine may attempt to
+// create the same link and fail because FileExists && !SameFile
+type fileLocks struct {
+	m     sync.RWMutex             // to avoid concurrent updates to fileLocks
+	locks map[string]*sync.RWMutex //keys: paths of locked files
</code_context>

<issue_to_address>
**issue (complexity):** Consider replacing the custom fileLocks and nested errgroup waits with a sync.Map of mutexes and a single errgroup.Wait to simplify concurrency and locking.

```suggestion
fileLocks and the current WalkDir+errgroup pattern add a lot of boilerplate and nested waits. You can simplify both by:

1) Replacing `fileLocks` with a `sync.Map` of `*sync.Mutex`  
2) Separating directory traversal from `errgroup.Wait()` (only wait once, after all `Go()` calls)

— file_locks.go —
```go
// global, no init needed
var fileLocks sync.Map // map[string]*sync.Mutex

func lock(path string) {
    muIface, _ := fileLocks.LoadOrStore(path, &sync.Mutex{})
    mu := muIface.(*sync.Mutex)
    mu.Lock()
}

func unlock(path string) {
    if muIface, ok := fileLocks.Load(path); ok {
        muIface.(*sync.Mutex).Unlock()
    }
}
```

— snaggle.go (Snaggle entry) —
```go
func Snaggle(rootPath, root string, opts ...Option) error {
    var (
        options  options
        g        errgroup.Group
    )
    for _, o := range opts { o(&options) }

    walkFn := func(path string, d fs.DirEntry, err error) error {
        if err != nil || d.IsDir() {
            return err // propagate or skip
        }
        g.Go(func() error { return snaggleFile(path, root, options) })
        return nil
    }

    info, err := os.Stat(rootPath)
    if err != nil {
        return err
    }
    if info.IsDir() {
        if options.recursive {
            if err := filepath.WalkDir(rootPath, walkFn); err != nil {
                return err
            }
        } else {
            entries, err := os.ReadDir(rootPath)
            if err != nil {
                return err
            }
            for _, e := range entries {
                if e.IsDir() {
                    continue
                }
                path := filepath.Join(rootPath, e.Name())
                g.Go(func() error { return snaggleFile(path, root, options) })
            }
        }
    } else {
        g.Go(func() error { return snaggleFile(rootPath, root, options) })
    }

    return g.Wait()
}
```

— link.go (inside `snaggleFile`) —
```go
func link(src, dstDir string) error {
    dst := filepath.Join(dstDir, filepath.Base(src))
    lock(dst)
    defer unlock(dst)

    // existing logic: EvalSymlinks, MkdirAll, os.Link or fallback Copy...
}
```

These changes keep per-file locking, remove manual `map+RWMutex`, eliminate debug `println`s, and only call `Wait()` once, making the code easier to follow.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@codecov

codecov Bot commented Nov 5, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.25000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.67%. Comparing base (9e37153) to head (64ff06a).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
snaggle.go 91.02% 6 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
+ Coverage   72.94%   74.67%   +1.72%     
==========================================
  Files          12       12              
  Lines         658      687      +29     
==========================================
+ Hits          480      513      +33     
+ Misses        123      120       -3     
+ Partials       55       54       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@MusicalNinjaDad MusicalNinjaDad merged commit 7ca3e57 into main Nov 5, 2025
12 checks passed
@MusicalNinjaDad MusicalNinjaDad deleted the go_snaggle branch November 5, 2025 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant