Skip to content

fix(tools): close resp.Body on retry cancel and cache http.Client instances#940

Merged
xiaket merged 6 commits intosipeed:mainfrom
ItsT0ng:fix/http-client-leaks-and-caching
Mar 1, 2026
Merged

fix(tools): close resp.Body on retry cancel and cache http.Client instances#940
xiaket merged 6 commits intosipeed:mainfrom
ItsT0ng:fix/http-client-leaks-and-caching

Conversation

@ItsT0ng
Copy link
Contributor

@ItsT0ng ItsT0ng commented Mar 1, 2026

📝 Description

Summary

  • Fix response body leak: DoRequestWithRetry incorrectly called req.Body.Close() (request body) instead of resp.Body.Close() (response body) when context was canceled during retry sleep — leaking the open HTTP response.
  • Cache http.Client on web tool providers: Move client creation from per-call in Search()/Execute() to once at construction time for BraveSearchProvider, TavilySearchProvider, DuckDuckGoSearchProvider, PerplexitySearchProvider, and WebFetchTool. Constructors with proxy config now return errors for fail-fast validation.
  • Cache http.Client on channel adapters: Add a shared *http.Client field to WeComBotChannel, WeComAppChannel, and LINEChannel, eliminating 6 per-request client allocations across sendWebhookReply, uploadMedia, sendImageMessage, sendTextMessage, fetchBotInfo, and callAPI.
  • Preserve original timeouts: LINE uses two cached clients (infoClient at 10s, apiClient at 30s) matching the original per-call values. WeCom uses max(30s, config.ReplyTimeout) so the per-request context deadline is always the effective limit — never silently truncated by the client timeout.

Changes

File What
pkg/utils/http_retry.go Close resp.Body (not req.Body) on context cancel
pkg/utils/http_retry_test.go Add TestDoRequestWithRetry_ContextCancel with body-close tracking
pkg/tools/web.go Add client field to 5 structs; create once in constructors
pkg/agent/loop.go Handle error returns from updated constructors
pkg/tools/web_test.go Update call sites for (*Tool, error) signatures
pkg/channels/wecom/bot.go Cache client with max(30s, ReplyTimeout), use in sendWebhookReply
pkg/channels/wecom/app.go Cache client with max(30s, ReplyTimeout), use in 3 send/upload methods
pkg/channels/line/line.go Two cached clients: infoClient (10s) + apiClient (30s)

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

📚 Technical Context (Skip for Docs)

  • Reference URL:
  • Reasoning:

🧪 Test Environment

  • Hardware: PC
  • OS: Ubuntu 22.04
  • Model/Provider: N/A
  • Channels:

📸 Evidence (Optional)

Click to view Logs/Screenshots

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

…tances

Fix resp.Body leak in DoRequestWithRetry where req.Body (request) was
incorrectly closed instead of resp.Body (response) on context cancel.
Cache http.Client on web search/fetch provider structs and channel
adapters (WeCom, LINE) to avoid per-call allocation overhead.
Copilot AI review requested due to automatic review settings March 1, 2026 03:59
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an HTTP response-body leak in retry logic and refactors web tool providers and several channel adapters to reuse http.Client instances instead of allocating new clients per request.

Changes:

  • Fix DoRequestWithRetry to close resp.Body when retry sleep is interrupted by context cancellation.
  • Cache *http.Client instances in web search/fetch providers and in WeCom/LINE channel adapters.
  • Update constructors/call sites to handle new error-returning constructors for web tools.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pkg/utils/http_retry.go Close resp.Body on retry-sleep cancellation to prevent leaks
pkg/tools/web.go Cache clients in providers; constructors now return errors for fail-fast proxy validation
pkg/tools/web_test.go Update tests to handle new constructor signatures
pkg/agent/loop.go Handle (tool, err) returns and log constructor failures
pkg/channels/wecom/bot.go Add shared client and reuse it for webhook replies
pkg/channels/wecom/app.go Add shared client and reuse it for upload/send calls
pkg/channels/line/line.go Add shared client and reuse it for bot info + API calls
Comments suppressed due to low confidence (2)

pkg/channels/wecom/app.go:316

  • uploadMedia no longer creates a dedicated client with a 30s timeout; it now uses the shared client (60s) and does not apply any per-request context timeout. This changes timeout behavior (and could potentially hang longer than before if ctx has no deadline). Consider adding context.WithTimeout here (matching the previous 30s) or otherwise enforcing a bounded timeout per upload request.
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, apiURL, body)
	if err != nil {
		return "", fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("Content-Type", writer.FormDataContentType())

	resp, err := c.client.Do(req)
	if err != nil {
		return "", channels.ClassifyNetError(err)
	}
	defer resp.Body.Close()

pkg/channels/wecom/app.go:141

  • This shared http.Client uses a fixed 60s Timeout. Requests like sendImageMessage/sendTextMessage use per-request context timeouts derived from cfg.ReplyTimeout; a shorter client.Timeout will override longer ReplyTimeout values. Consider setting client.Timeout=0 and relying on per-request context deadlines (or at least making client.Timeout >= ReplyTimeout).
	ctx, cancel := context.WithCancel(context.Background())
	return &WeComAppChannel{
		BaseChannel:   base,
		config:        cfg,
		client:        &http.Client{Timeout: 60 * time.Second},
		ctx:           ctx,
		cancel:        cancel,
		processedMsgs: make(map[string]bool),
	}, nil

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Split LINE single 60s client into infoClient (10s) for bot info lookups
and apiClient (30s) for messaging API calls. Lower WeCom cached client
base timeout from 60s to 30s (matching uploadMedia), and ensure it is
always >= the configured ReplyTimeout so the per-request context
deadline remains the effective limit.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Collaborator

@xiaket xiaket left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits.

ItsT0ng added 2 commits March 1, 2026 16:40
…constructors

Address PR review feedback from xiaket:
- Define searchTimeout, perplexityTimeout, fetchTimeout, defaultMaxChars,
  and maxRedirects as package-level consts instead of magic numbers.
- Remove misleading "No proxy" comment in NewWebFetchTool.
- Deduplicate NewWebFetchTool by delegating to NewWebFetchToolWithProxy.
Verify that resp.Body is properly closed when the context is canceled
during retry sleep, covering the C8 resp.Body leak fix.
Copilot AI review requested due to automatic review settings March 1, 2026 05:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +103 to +113
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()

req, err := http.NewRequestWithContext(ctx, http.MethodGet, server.URL, nil)
require.NoError(t, err)

resp, err := DoRequestWithRetry(client, req)
require.Error(t, err, "expected error from context cancellation")
assert.Nil(t, resp, "expected nil response when context is canceled")
assert.True(t, bodyClosed, "expected resp.Body to be closed on context cancellation")
}
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestDoRequestWithRetry_ContextCancel can be flaky: the context timeout (100ms) may expire before the initial request completes, so client.Do can fail immediately (resp=nil) and the test won’t actually exercise the intended “cancel during retry sleep” path. Consider synchronizing cancellation (e.g., block the handler until the first response is sent, then cancel) or using a longer timeout that reliably allows the first request to finish but still cancels before the retry delay elapses.

Copilot uses AI. Check for mistakes.
ItsT0ng added 2 commits March 1, 2026 16:53
Synchronize cancellation using an onRoundTrip callback from the
transport wrapper instead of a timing-based context timeout. This
ensures the first client.Do completes before cancel fires, so
cancellation always hits during sleepWithCtx.
Copilot AI review requested due to automatic review settings March 1, 2026 05:55
@xiaket xiaket merged commit 44a52c0 into sipeed:main Mar 1, 2026
1 check passed
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return &WebFetchTool{
maxChars: maxChars,
}
// createHTTPClient cannot fail with an empty proxy string.
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in NewWebFetchTool is misleading: passing an empty proxy string to createHTTPClient enables proxying via http.ProxyFromEnvironment (it’s not “no proxy”). Please remove the comment or reword it to reflect the actual behavior (e.g., “use proxy from environment; this call can’t fail when proxy is empty”).

Suggested change
// createHTTPClient cannot fail with an empty proxy string.
// Use proxy configuration from environment; createHTTPClient cannot fail when proxy string is empty.

Copilot uses AI. Check for mistakes.
vvr3ddy added a commit to vvr3ddy/picoclaw that referenced this pull request Mar 1, 2026
Changes from upstream:
- fix(channel): config cleanup and regex precompile (sipeed#911, sipeed#916)
- fix(github_copilot): improve error handling (sipeed#919)
- fix(wecom): context leaks and data race fixes (sipeed#914, sipeed#918)
- fix(tools): HTTP client caching and response body cleanup (sipeed#940)
- feat(tui): Add configurable Launcher and Gateway process management (sipeed#909)
- feat(migrate): Update migration system with openclaw support (sipeed#910)
- fix(skills): Use registry-backed search for skills discovery (sipeed#929)
- build: Add armv6 support to goreleaser (sipeed#905)
- docs: Sync READMEs and channel documentation
@Orgmar
Copy link
Contributor

Orgmar commented Mar 2, 2026

@ItsT0ng Nice catch on the response body leak. Closing req.Body instead of resp.Body during retry cancellation is the kind of subtle bug that silently eats connections over time. Caching the http.Client instances is a good optimization too.

We're building a PicoClaw Dev Group on Discord for contributors to connect and collaborate. If you'd like to join, send an email to support@sipeed.com with the subject [Join PicoClaw Dev Group] ItsT0ng and we'll send you the invite link.

liangzhang-keepmoving pushed a commit to liangzhang-keepmoving/picoclaw that referenced this pull request Mar 2, 2026
…tances (sipeed#940)

* fix(tools): close resp.Body on retry cancel and cache http.Client instances

Fix resp.Body leak in DoRequestWithRetry where req.Body (request) was
incorrectly closed instead of resp.Body (response) on context cancel.
Cache http.Client on web search/fetch provider structs and channel
adapters (WeCom, LINE) to avoid per-call allocation overhead.

* fix(channels): preserve original http client timeouts for LINE and WeCom

Split LINE single 60s client into infoClient (10s) for bot info lookups
and apiClient (30s) for messaging API calls. Lower WeCom cached client
base timeout from 60s to 30s (matching uploadMedia), and ensure it is
always >= the configured ReplyTimeout so the per-request context
deadline remains the effective limit.

* refactor(tools): extract timeout consts and deduplicate WebFetchTool constructors

Address PR review feedback from xiaket:
- Define searchTimeout, perplexityTimeout, fetchTimeout, defaultMaxChars,
  and maxRedirects as package-level consts instead of magic numbers.
- Remove misleading "No proxy" comment in NewWebFetchTool.
- Deduplicate NewWebFetchTool by delegating to NewWebFetchToolWithProxy.

* test(utils): add context cancellation test for DoRequestWithRetry

Verify that resp.Body is properly closed when the context is canceled
during retry sleep, covering the C8 resp.Body leak fix.

* fix(utils): close resp in test to satisfy bodyclose linter

* fix(utils): eliminate flakiness in context cancellation retry test

Synchronize cancellation using an onRoundTrip callback from the
transport wrapper instead of a timing-based context timeout. This
ensures the first client.Do completes before cancel fires, so
cancellation always hits during sleepWithCtx.
hyperwd pushed a commit to hyperwd/picoclaw that referenced this pull request Mar 5, 2026
…tances (sipeed#940)

* fix(tools): close resp.Body on retry cancel and cache http.Client instances

Fix resp.Body leak in DoRequestWithRetry where req.Body (request) was
incorrectly closed instead of resp.Body (response) on context cancel.
Cache http.Client on web search/fetch provider structs and channel
adapters (WeCom, LINE) to avoid per-call allocation overhead.

* fix(channels): preserve original http client timeouts for LINE and WeCom

Split LINE single 60s client into infoClient (10s) for bot info lookups
and apiClient (30s) for messaging API calls. Lower WeCom cached client
base timeout from 60s to 30s (matching uploadMedia), and ensure it is
always >= the configured ReplyTimeout so the per-request context
deadline remains the effective limit.

* refactor(tools): extract timeout consts and deduplicate WebFetchTool constructors

Address PR review feedback from xiaket:
- Define searchTimeout, perplexityTimeout, fetchTimeout, defaultMaxChars,
  and maxRedirects as package-level consts instead of magic numbers.
- Remove misleading "No proxy" comment in NewWebFetchTool.
- Deduplicate NewWebFetchTool by delegating to NewWebFetchToolWithProxy.

* test(utils): add context cancellation test for DoRequestWithRetry

Verify that resp.Body is properly closed when the context is canceled
during retry sleep, covering the C8 resp.Body leak fix.

* fix(utils): close resp in test to satisfy bodyclose linter

* fix(utils): eliminate flakiness in context cancellation retry test

Synchronize cancellation using an onRoundTrip callback from the
transport wrapper instead of a timing-based context timeout. This
ensures the first client.Do completes before cancel fires, so
cancellation always hits during sleepWithCtx.
Pluckypan pushed a commit to Pluckypan/picoclaw that referenced this pull request Mar 6, 2026
…tances (sipeed#940)

* fix(tools): close resp.Body on retry cancel and cache http.Client instances

Fix resp.Body leak in DoRequestWithRetry where req.Body (request) was
incorrectly closed instead of resp.Body (response) on context cancel.
Cache http.Client on web search/fetch provider structs and channel
adapters (WeCom, LINE) to avoid per-call allocation overhead.

* fix(channels): preserve original http client timeouts for LINE and WeCom

Split LINE single 60s client into infoClient (10s) for bot info lookups
and apiClient (30s) for messaging API calls. Lower WeCom cached client
base timeout from 60s to 30s (matching uploadMedia), and ensure it is
always >= the configured ReplyTimeout so the per-request context
deadline remains the effective limit.

* refactor(tools): extract timeout consts and deduplicate WebFetchTool constructors

Address PR review feedback from xiaket:
- Define searchTimeout, perplexityTimeout, fetchTimeout, defaultMaxChars,
  and maxRedirects as package-level consts instead of magic numbers.
- Remove misleading "No proxy" comment in NewWebFetchTool.
- Deduplicate NewWebFetchTool by delegating to NewWebFetchToolWithProxy.

* test(utils): add context cancellation test for DoRequestWithRetry

Verify that resp.Body is properly closed when the context is canceled
during retry sleep, covering the C8 resp.Body leak fix.

* fix(utils): close resp in test to satisfy bodyclose linter

* fix(utils): eliminate flakiness in context cancellation retry test

Synchronize cancellation using an onRoundTrip callback from the
transport wrapper instead of a timing-based context timeout. This
ensures the first client.Do completes before cancel fires, so
cancellation always hits during sleepWithCtx.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants