Skip to content

fix(wecom): fix context leak in Start() and data race in processedMsgs#918

Merged
yinwm merged 1 commit intosipeed:mainfrom
alexhoshina:fix/wecom-resource-leaks
Mar 1, 2026
Merged

fix(wecom): fix context leak in Start() and data race in processedMsgs#918
yinwm merged 1 commit intosipeed:mainfrom
alexhoshina:fix/wecom-resource-leaks

Conversation

@alexhoshina
Copy link
Collaborator

📝 Description

Cancel the constructor-created context before overwriting in Start() to prevent the original cancel function from becoming unreachable.

Move len(processedMsgs) check inside the write lock to eliminate a data race, and re-insert the current msgID after map reset to prevent duplicate processing of the in-flight message.

Applies to both WeComBotChannel and WeComAppChannel.

Cancel the constructor-created context before overwriting in Start()
to prevent the original cancel function from becoming unreachable.

Move len(processedMsgs) check inside the write lock to eliminate a
data race, and re-insert the current msgID after map reset to prevent
duplicate processing of the in-flight message.

Applies to both WeComBotChannel and WeComAppChannel.
Copilot AI review requested due to automatic review settings February 28, 2026 14:09
@alexhoshina alexhoshina review requested due to automatic review settings February 28, 2026 14:10
Copy link

@nikolasdehor nikolasdehor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix(wecom): fix context leak in Start() and data race in processedMsgs

Excellent fix addressing two real concurrency bugs. Both changes apply to WeComBotChannel and WeComAppChannel symmetrically.

Bug 1: Context Leak in Start()

The constructor creates a context (context.WithCancel) that gets overwritten in Start() without canceling the original. This leaks the original context's goroutine and resources.

Fix is correct: Cancel the constructor-created context before overwriting:

if c.cancel != nil {
    c.cancel()
}
c.ctx, c.cancel = context.WithCancel(ctx)

The nil check is important for safety, though in practice the constructor always sets cancel.

Bug 2: Data Race in processedMsgs

The original code had a subtle race:

c.processedMsgs[msgID] = true
c.msgMu.Unlock()          // unlocks here

if len(c.processedMsgs) > 1000 {  // reads map WITHOUT lock
    c.msgMu.Lock()
    c.processedMsgs = make(map[string]bool)  // could race with another goroutine's read
    c.msgMu.Unlock()
}

Fix is correct: Moving the len() check and map reset inside the existing write lock eliminates the TOCTOU race. Re-inserting the current msgID after map reset is a nice touch -- without it, the in-flight message could be processed again by a concurrent goroutine between the reset and the next lock acquisition.

Quality

  • Both fixes are applied identically to app.go and bot.go, maintaining symmetry.
  • The changes are minimal and surgical -- no unnecessary refactoring.
  • The logic is easy to verify by inspection.

Minor Suggestion

Consider using sync.Map for processedMsgs if the access pattern is mostly reads (checking for duplicates) with occasional writes. But that is an optimization, not a correctness issue -- the current fix is correct.

LGTM. Well-analyzed concurrency fixes.

Copy link
Collaborator

@xiaket xiaket left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xiaket xiaket added type: bug Something isn't working domain: channel labels Mar 1, 2026
Copy link
Collaborator

@yinwm yinwm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yinwm yinwm merged commit 33f67e8 into sipeed:main Mar 1, 2026
6 checks passed
vvr3ddy added a commit to vvr3ddy/picoclaw that referenced this pull request Mar 1, 2026
Changes from upstream:
- fix(channel): config cleanup and regex precompile (sipeed#911, sipeed#916)
- fix(github_copilot): improve error handling (sipeed#919)
- fix(wecom): context leaks and data race fixes (sipeed#914, sipeed#918)
- fix(tools): HTTP client caching and response body cleanup (sipeed#940)
- feat(tui): Add configurable Launcher and Gateway process management (sipeed#909)
- feat(migrate): Update migration system with openclaw support (sipeed#910)
- fix(skills): Use registry-backed search for skills discovery (sipeed#929)
- build: Add armv6 support to goreleaser (sipeed#905)
- docs: Sync READMEs and channel documentation
hyperwd pushed a commit to hyperwd/picoclaw that referenced this pull request Mar 5, 2026
fix(wecom): fix context leak in Start() and data race in processedMsgs
Pluckypan pushed a commit to Pluckypan/picoclaw that referenced this pull request Mar 6, 2026
fix(wecom): fix context leak in Start() and data race in processedMsgs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: channel type: bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants