Skip to content

refactor: Go SDK usage improvements from module review#2967

Merged
lpcox merged 1 commit intomainfrom
fix/go-sdk-improvements-2911
Apr 1, 2026
Merged

refactor: Go SDK usage improvements from module review#2967
lpcox merged 1 commit intomainfrom
fix/go-sdk-improvements-2911

Conversation

@lpcox
Copy link
Copy Markdown
Collaborator

@lpcox lpcox commented Apr 1, 2026

Summary

Addresses the quick-win and best-practice findings from the Go Fan module review of modelcontextprotocol/go-sdk (#2911).

Changes

1. Extract pagination helper (connection.go)

Introduces a generic paginateAll[T]() helper that replaces three identical cursor-loop pagination implementations in listTools, listResources, and listPrompts. Eliminates ~45 lines of duplicated boilerplate while preserving identical logging behavior.

2. Eliminate resourceContents intermediate type (tool_result.go)

The local resourceContents struct mirrored sdk.ResourceContents field-for-field for JSON unmarshaling. Since the SDK type works directly for this purpose, the local type is removed and sdk.ResourceContents is used inline. The field-by-field copy in the resource content conversion is also eliminated.

3. Explicit server options (mcptest/server.go)

Passes &sdk.ServerOptions{} instead of nil to sdk.NewServer() — more defensive and documents intent, guarding against future SDK changes.

4. Cache eviction for filteredServerCache (routed.go)

The routed-mode server cache previously grew unboundedly (one entry per backend×session pair, never evicted). Now includes:

  • TTL-based expiry matching the SDK SessionTimeout (30 minutes)
  • Lazy eviction on each getOrCreate() call
  • Extracted routedSessionTimeout variable shared between cache TTL and SDK options

5. Transport ownership documentation (http_transport.go)

Adds lifecycle/ownership documentation to the transportConnector type, clarifying that the SDK session owns the returned transport after Connect().

Testing

  • All Go unit and integration tests pass
  • make agent-finished passes (format, build, lint, all tests)

Closes #2911

Address findings from Go Fan report on modelcontextprotocol/go-sdk:

1. Extract generic paginateAll() helper in connection.go to deduplicate
   identical cursor-loop pagination across listTools, listResources,
   and listPrompts (~45 lines of boilerplate removed).

2. Eliminate resourceContents intermediate type in tool_result.go by
   using sdk.ResourceContents directly for JSON unmarshaling, removing
   field-by-field copy in the resource content conversion.

3. Pass explicit &sdk.ServerOptions{} instead of nil in mcptest/server.go
   to guard against future SDK changes that might not accept nil options.

4. Add TTL-based eviction to filteredServerCache in routed.go to prevent
   unbounded memory growth. Cache entries now expire after the session
   timeout (30min), evicted lazily on each getOrCreate call.

5. Add transport ownership documentation to transportConnector type
   clarifying that the SDK session owns the transport after Connect().

Closes #2911

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 1, 2026 00:32
@lpcox lpcox merged commit 1631b99 into main Apr 1, 2026
24 checks passed
@lpcox lpcox deleted the fix/go-sdk-improvements-2911 branch April 1, 2026 00:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors several Go SDK integration points to reduce duplication and improve operational safety, based on findings from the modelcontextprotocol/go-sdk module review.

Changes:

  • Introduces a generic pagination helper in the MCP connection layer to remove duplicated cursor-loop logic.
  • Removes a redundant local resourceContents type in favor of unmarshaling directly into sdk.ResourceContents.
  • Adds TTL-based eviction to the routed-mode (backend, session) filtered server cache and makes server options explicit in the MCP test server.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/mcp/connection.go Extracts generic pagination helper and refactors listTools/listResources/listPrompts to use it.
internal/mcp/tool_result.go Simplifies embedded resource unmarshaling by using sdk.ResourceContents directly.
internal/mcp/http_transport.go Documents transport lifecycle/ownership expectations for SDK transports created by connectors.
internal/server/routed.go Adds TTL tracking + lazy eviction for routed filtered server cache; aligns TTL with SessionTimeout.
internal/testutil/mcptest/server.go Passes explicit &sdk.ServerOptions{} to sdk.NewServer() in tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +555 to +570
all := make([]T, len(first.Items), max(len(first.Items), 1))
copy(all, first.Items)
logConn.Printf("list%s: received page of %d %s from serverID=%s", itemKind, len(first.Items), itemKind, serverID)

cursor := first.NextCursor
for cursor != "" {
result, err := c.getSDKSession().ListTools(c.ctx, &sdk.ListToolsParams{Cursor: cursor})
page, err := fetch(cursor)
if err != nil {
return nil, err
}
allTools = append(allTools, result.Tools...)
logConn.Printf("listTools: received page of %d tools (total so far: %d) from serverID=%s", len(result.Tools), len(allTools), c.serverID)
cursor = result.NextCursor
all = append(all, page.Items...)
logConn.Printf("list%s: received page of %d %s (total so far: %d) from serverID=%s", itemKind, len(page.Items), itemKind, len(all), serverID)
cursor = page.NextCursor
}
logConn.Printf("listTools: received %d tools total from serverID=%s", len(allTools), c.serverID)
return marshalToResponse(&sdk.ListToolsResult{Tools: allTools})
logConn.Printf("list%s: received %d %s total from serverID=%s", itemKind, len(all), itemKind, serverID)
return all, nil
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paginateAll() changes the log prefix from the prior per-method prefixes (e.g., "listTools:"/"listResources:") to "list%s:" with a lowercase plural kind (e.g., "listtools:"). This contradicts the PR description of preserving identical logging behavior and can break log filtering/alerts keyed on the existing prefixes. Consider passing an explicit log prefix ("listTools"/"listResources"/"listPrompts") into paginateAll, or logging through a callback supplied by the caller so the emitted messages remain unchanged.

Copilot uses AI. Check for mistakes.
Comment on lines +65 to +79
for k, entry := range c.servers {
if now.Sub(entry.lastUsed) > c.ttl {
logRouted.Printf("[CACHE] Evicting expired server: key=%s (idle %s)", k, now.Sub(entry.lastUsed).Round(time.Second))
delete(c.servers, k)
}
}

if entry, ok := c.servers[key]; ok {
entry.lastUsed = now
return entry.server
}

logRouted.Printf("[CACHE] Creating new filtered server: backend=%s, session=%s", backendID, sessionID)
server := creator()
c.servers[key] = &filteredServerEntry{server: server, lastUsed: now}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filteredServerCache logs raw session identifiers ("session=%s" and the full "key=%s" which includes sessionID). In this codebase session IDs are API keys and should be truncated before logging to avoid secret leakage (see internal/auth/header.go TruncateSessionID and existing usage in internal/server/session.go). Update these log lines to use auth.TruncateSessionID(sessionID) (and avoid embedding the full sessionID inside the logged key).

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +70
c.mu.Lock()
defer c.mu.Unlock()

// Lazy eviction of expired entries
for k, entry := range c.servers {
if now.Sub(entry.lastUsed) > c.ttl {
logRouted.Printf("[CACHE] Evicting expired server: key=%s (idle %s)", k, now.Sub(entry.lastUsed).Round(time.Second))
delete(c.servers, k)
}
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOrCreate() now scans and potentially logs/evicts every cache entry on every call (the for k, entry := range c.servers loop). Since this path runs per request, the O(n) eviction can become a CPU and lock-contention hotspot as the cache grows. Consider reducing eviction frequency (e.g., only evict when now passes a nextEvictAt, evict a bounded number of entries per call, or run periodic cleanup in a background goroutine).

Copilot uses AI. Check for mistakes.
servers map[string]*sdk.Server
servers map[string]*filteredServerEntry
ttl time.Duration
mu sync.RWMutex
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filteredServerCache declares mu sync.RWMutex but getOrCreate() always takes the write lock (Lock) and never uses RLock/RUnlock. If there’s no read-only fast path, consider switching this to sync.Mutex to better reflect usage and avoid confusion about intended concurrency behavior.

Suggested change
mu sync.RWMutex
mu sync.Mutex

Copilot uses AI. Check for mistakes.
Comment on lines +55 to 80
// getOrCreate returns a cached server or creates a new one.
// Expired entries are lazily evicted on each call.
func (c *filteredServerCache) getOrCreate(backendID, sessionID string, creator func() *sdk.Server) *sdk.Server {
key := fmt.Sprintf("%s/%s", backendID, sessionID)
now := time.Now()

server, _ := syncutil.GetOrCreate(&c.mu, c.servers, key, func() (*sdk.Server, error) {
logRouted.Printf("[CACHE] Creating new filtered server: backend=%s, session=%s", backendID, sessionID)
return creator(), nil
})
c.mu.Lock()
defer c.mu.Unlock()

// Lazy eviction of expired entries
for k, entry := range c.servers {
if now.Sub(entry.lastUsed) > c.ttl {
logRouted.Printf("[CACHE] Evicting expired server: key=%s (idle %s)", k, now.Sub(entry.lastUsed).Round(time.Second))
delete(c.servers, k)
}
}

if entry, ok := c.servers[key]; ok {
entry.lastUsed = now
return entry.server
}

logRouted.Printf("[CACHE] Creating new filtered server: backend=%s, session=%s", backendID, sessionID)
server := creator()
c.servers[key] = &filteredServerEntry{server: server, lastUsed: now}
return server
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new TTL-based eviction behavior in filteredServerCache (lastUsed tracking + lazy eviction) isn’t covered by tests. Since internal/server has routed_test.go, it would be good to add a focused unit test for getOrCreate() verifying that entries older than ttl are evicted and recreated, while recently-used entries are retained (using a controllable clock or very short ttl).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[go-fan] Go Module Review: modelcontextprotocol/go-sdk

2 participants