Skip to content

fix: better session management for github_copilot_provider#642

Merged
yinwm merged 3 commits intosipeed:mainfrom
Lixeer:main
Feb 24, 2026
Merged

fix: better session management for github_copilot_provider#642
yinwm merged 3 commits intosipeed:mainfrom
Lixeer:main

Conversation

@Lixeer
Copy link
Contributor

@Lixeer Lixeer commented Feb 22, 2026

📝 Description

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

📚 Technical Context (Skip for Docs)

🧪 Test Environment

  • Hardware: Intel/AMD PC platform
  • OS: Debian,Window
  • Model/Provider: gpt-4.1
  • Channels:

📸 Evidence (Optional)

Click to view Logs/Screenshots

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

Comment on lines +64 to +65
p.mu.Lock()
defer p.mu.Unlock()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it maybe be better if you lock only if the client is not nil?
Just to be sure to locking only if there's a real reason.

Copy link

@nikolasdehor nikolasdehor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fix. The old code called defer client.Stop() immediately after client.Start(), which disconnected the gRPC client before any messages could be sent. The session was effectively dead on arrival.

The fix correctly:

  1. Stores the client alongside the session for lifecycle management
  2. Adds a Close() method called during gateway shutdown
  3. Adds mutex protection around Chat() and Close() for concurrency safety
  4. Handles error from CreateSession (previously ignored)
  5. Handles error from json.Marshal (previously ignored)
  6. Uses SendAndWait instead of Send for synchronous responses

The SessionProvider interface is a clean extension of LLMProvider that only providers needing explicit cleanup implement. The interface{} to any cleanup in the test file is a nice bonus.

One minor note: p.session.SendAndWait is called while holding p.mu, which is correct for single-session safety. If the session itself is not goroutine-safe, this is necessary.

LGTM.

Copy link
Collaborator

@yinwm yinwm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

Thank you for this PR! You correctly identified and fixed the critical bug with defer client.Stop(). The change from Send() to SendAndWait() also fixes the original API misuse where the response was being discarded.

However, there are a few issues that should be addressed before merging:


🔴 Critical Issues

1. Lock granularity blocks concurrent I/O operations

File: pkg/providers/github_copilot_provider.go:95

The current implementation holds the mutex during the entire SendAndWait call:

p.mu.Lock()
defer p.mu.Unlock()

resp, err := p.session.SendAndWait(ctx, copilot.MessageOptions{...})

SendAndWait can block for up to 60 seconds (default timeout). Holding the lock during I/O will cause all concurrent requests to block each other, effectively serializing what could be parallel operations.

According to the SDK documentation:

"All methods are safe for concurrent use."

The session itself is thread-safe, so no additional locking is needed for concurrent calls.

Suggested fix:

p.mu.Lock()
session := p.session
p.mu.Unlock()

if session == nil {
    return nil, fmt.Errorf("provider closed")
}

resp, err := session.SendAndWait(ctx, copilot.MessageOptions{
    Prompt: string(fullcontent),
})

This way, the lock only protects reading the shared pointer, not the I/O operation.


2. Empty response is silently treated as success

File: pkg/providers/github_copilot_provider.go:107

var content string
if resp != nil && resp.Data.Content != nil {
    content = *resp.Data.Content
}

If copilot returns an empty response, the code returns LLMResponse{Content: ""}. Callers cannot distinguish between "empty content" and "something went wrong".

Suggested fix:

if resp == nil {
    return nil, fmt.Errorf("empty response from copilot")
}
if resp.Data.Content == nil {
    return nil, fmt.Errorf("no content in copilot response")
}
content := *resp.Data.Content

3. Shutdown order may cause race conditions

File: cmd/picoclaw/cmd_gateway.go:215

cancel()
if cp, ok := provider.(providers.SessionProvider); ok {
    cp.Close()
}

Calling cancel() first will interrupt any in-flight SendAndWait calls with "context canceled" error. A better user experience would be to close the provider gracefully first.

Suggested fix:

if cp, ok := provider.(providers.SessionProvider); ok {
    cp.Close()
}
cancel()

🟡 Suggestions

4. Interface naming

File: pkg/providers/types.go:33

SessionProvider suggests "a provider that has a session", but the actual semantics is "a provider that needs explicit cleanup". Consider naming it CloseableProvider or StatefulProvider for clarity.

5. Test coverage

This PR touches critical concurrency and resource management code but adds no tests. Consider adding:

  • TestNewGitHubCopilotProvider_CreateSessionError — verify client is cleaned up on error
  • TestClose — verify behavior after Close()
  • TestChat_ConcurrentSafety — verify concurrent calls work correctly

6. Unrelated changes in registry_test.go

The map[string]interface{}map[string]any changes are good style improvements but unrelated to session management. Consider splitting into a separate PR to keep the focus clear.


Summary

The core fix is correct and important. Addressing the lock granularity and empty response handling issues will make this a solid improvement.

Address all feedback from PR review:
- Lock granularity
- Empty response handling
- Shutdown race condition
- Interface naming
@Lixeer
Copy link
Contributor Author

Lixeer commented Feb 24, 2026

@yinwm
Thank you very much for your detailed review! I have fixed the code following your suggestions:

  1. ✅ Optimized lock granularity - now only protects reading the shared pointer
  2. ✅ Added error handling for empty responses
  3. ✅ Fixed shutdown order - close provider first, then call cancel
  4. ✅ Renamed SessionProvider to StatefulProvider

I have tested the changes locally and everything works fine.

Regarding unit tests, I apologize that I haven't added them yet due to my limited experience with testing. This is an area I need to improve. If you could provide some guidance or examples, I'd be happy to learn and add the tests. Thank you for your understanding!

Copy link
Collaborator

@yinwm yinwm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! All previously raised issues have been addressed:

  • ✅ Lock granularity optimized
  • ✅ Empty response handling added
  • ✅ Shutdown order fixed
  • ✅ Interface naming improved

The code quality is solid and ready to merge.

@yinwm yinwm merged commit eb138a3 into sipeed:main Feb 24, 2026
2 checks passed
hyperwd pushed a commit to hyperwd/picoclaw that referenced this pull request Mar 5, 2026
fix: better session management for `github_copilot_provider`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants