Go snaggle each file in DIR#95
Conversation
Reviewer's GuideThis PR refactors the core linking logic in Sequence diagram for concurrent-safe link creationsequenceDiagram
participant Snaggle
participant errgroup
participant snaggle
participant link_
participant fileLocks
participant os
Snaggle->>errgroup: Go(snaggle(path, ...))
errgroup->>snaggle: run
snaggle->>link_: link(path, targetDir, locks)
link_->>fileLocks: lock(target)
fileLocks->>fileLocks: acquire lock for target
link_->>os: Link(sourcePath, target)
link_->>fileLocks: unlock(target)
fileLocks->>fileLocks: release lock for target
Class diagram for new fileLocks locking mechanismclassDiagram
class fileLocks {
+sync.RWMutex m
+map<string, *sync.RWMutex> locks
+lock(path string)
+unlock(path string)
+newFileLock()
}
fileLocks -- "*" sync.RWMutex : uses
fileLocks -- "*" map : manages
class sync.RWMutex
class map
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- Remove the debug
printlncalls in thefileLocksmethods (or replace them with a proper logger) to avoid cluttering stdout. - Protect map reads in
fileLocks.unlockunder the same mutex as the writes (e.g. usefl.m.RLock) to avoid potential concurrent map access races. - Invoking
errgroup.Wait()inside theWalkDircallback serializes the tasks; schedule all goroutines during the walk and callWait()once afterWalkDirreturns to maximize concurrency.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Remove the debug `println` calls in the `fileLocks` methods (or replace them with a proper logger) to avoid cluttering stdout.
- Protect map reads in `fileLocks.unlock` under the same mutex as the writes (e.g. use `fl.m.RLock`) to avoid potential concurrent map access races.
- Invoking `errgroup.Wait()` inside the `WalkDir` callback serializes the tasks; schedule all goroutines during the walk and call `Wait()` once after `WalkDir` returns to maximize concurrency.
## Individual Comments
### Comment 1
<location> `snaggle.go:275` </location>
<code_context>
+
+// We need to lock a file while it is being copied. Othwerwise a second goroutine may attempt to
+// create the same link and fail because FileExists && !SameFile
+type fileLocks struct {
+ m sync.RWMutex // to avoid concurrent updates to fileLocks
+ locks map[string]*sync.RWMutex //keys: paths of locked files
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the custom fileLocks and nested errgroup waits with a sync.Map of mutexes and a single errgroup.Wait to simplify concurrency and locking.
```suggestion
fileLocks and the current WalkDir+errgroup pattern add a lot of boilerplate and nested waits. You can simplify both by:
1) Replacing `fileLocks` with a `sync.Map` of `*sync.Mutex`
2) Separating directory traversal from `errgroup.Wait()` (only wait once, after all `Go()` calls)
— file_locks.go —
```go
// global, no init needed
var fileLocks sync.Map // map[string]*sync.Mutex
func lock(path string) {
muIface, _ := fileLocks.LoadOrStore(path, &sync.Mutex{})
mu := muIface.(*sync.Mutex)
mu.Lock()
}
func unlock(path string) {
if muIface, ok := fileLocks.Load(path); ok {
muIface.(*sync.Mutex).Unlock()
}
}
```
— snaggle.go (Snaggle entry) —
```go
func Snaggle(rootPath, root string, opts ...Option) error {
var (
options options
g errgroup.Group
)
for _, o := range opts { o(&options) }
walkFn := func(path string, d fs.DirEntry, err error) error {
if err != nil || d.IsDir() {
return err // propagate or skip
}
g.Go(func() error { return snaggleFile(path, root, options) })
return nil
}
info, err := os.Stat(rootPath)
if err != nil {
return err
}
if info.IsDir() {
if options.recursive {
if err := filepath.WalkDir(rootPath, walkFn); err != nil {
return err
}
} else {
entries, err := os.ReadDir(rootPath)
if err != nil {
return err
}
for _, e := range entries {
if e.IsDir() {
continue
}
path := filepath.Join(rootPath, e.Name())
g.Go(func() error { return snaggleFile(path, root, options) })
}
}
} else {
g.Go(func() error { return snaggleFile(rootPath, root, options) })
}
return g.Wait()
}
```
— link.go (inside `snaggleFile`) —
```go
func link(src, dstDir string) error {
dst := filepath.Join(dstDir, filepath.Base(src))
lock(dst)
defer unlock(dst)
// existing logic: EvalSymlinks, MkdirAll, os.Link or fallback Copy...
}
```
These changes keep per-file locking, remove manual `map+RWMutex`, eliminate debug `println`s, and only call `Wait()` once, making the code easier to follow.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #95 +/- ##
==========================================
+ Coverage 72.94% 74.67% +1.72%
==========================================
Files 12 12
Lines 658 687 +29
==========================================
+ Hits 480 513 +33
+ Misses 123 120 -3
+ Partials 55 54 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary by Sourcery
Make Snaggle concurrently process files with per-file locking, improve link error reporting, and add tests for link conflict scenarios
New Features:
Enhancements:
Tests:
Chores: