fix: better session management for github_copilot_provider#642
fix: better session management for github_copilot_provider#642yinwm merged 3 commits intosipeed:mainfrom
github_copilot_provider#642Conversation
| p.mu.Lock() | ||
| defer p.mu.Unlock() |
There was a problem hiding this comment.
Would it maybe be better if you lock only if the client is not nil?
Just to be sure to locking only if there's a real reason.
nikolasdehor
left a comment
There was a problem hiding this comment.
Good fix. The old code called defer client.Stop() immediately after client.Start(), which disconnected the gRPC client before any messages could be sent. The session was effectively dead on arrival.
The fix correctly:
- Stores the client alongside the session for lifecycle management
- Adds a
Close()method called during gateway shutdown - Adds mutex protection around
Chat()andClose()for concurrency safety - Handles error from
CreateSession(previously ignored) - Handles error from
json.Marshal(previously ignored) - Uses
SendAndWaitinstead ofSendfor synchronous responses
The SessionProvider interface is a clean extension of LLMProvider that only providers needing explicit cleanup implement. The interface{} to any cleanup in the test file is a nice bonus.
One minor note: p.session.SendAndWait is called while holding p.mu, which is correct for single-session safety. If the session itself is not goroutine-safe, this is necessary.
LGTM.
yinwm
left a comment
There was a problem hiding this comment.
Overview
Thank you for this PR! You correctly identified and fixed the critical bug with defer client.Stop(). The change from Send() to SendAndWait() also fixes the original API misuse where the response was being discarded.
However, there are a few issues that should be addressed before merging:
🔴 Critical Issues
1. Lock granularity blocks concurrent I/O operations
File: pkg/providers/github_copilot_provider.go:95
The current implementation holds the mutex during the entire SendAndWait call:
p.mu.Lock()
defer p.mu.Unlock()
resp, err := p.session.SendAndWait(ctx, copilot.MessageOptions{...})SendAndWait can block for up to 60 seconds (default timeout). Holding the lock during I/O will cause all concurrent requests to block each other, effectively serializing what could be parallel operations.
According to the SDK documentation:
"All methods are safe for concurrent use."
The session itself is thread-safe, so no additional locking is needed for concurrent calls.
Suggested fix:
p.mu.Lock()
session := p.session
p.mu.Unlock()
if session == nil {
return nil, fmt.Errorf("provider closed")
}
resp, err := session.SendAndWait(ctx, copilot.MessageOptions{
Prompt: string(fullcontent),
})This way, the lock only protects reading the shared pointer, not the I/O operation.
2. Empty response is silently treated as success
File: pkg/providers/github_copilot_provider.go:107
var content string
if resp != nil && resp.Data.Content != nil {
content = *resp.Data.Content
}If copilot returns an empty response, the code returns LLMResponse{Content: ""}. Callers cannot distinguish between "empty content" and "something went wrong".
Suggested fix:
if resp == nil {
return nil, fmt.Errorf("empty response from copilot")
}
if resp.Data.Content == nil {
return nil, fmt.Errorf("no content in copilot response")
}
content := *resp.Data.Content3. Shutdown order may cause race conditions
File: cmd/picoclaw/cmd_gateway.go:215
cancel()
if cp, ok := provider.(providers.SessionProvider); ok {
cp.Close()
}Calling cancel() first will interrupt any in-flight SendAndWait calls with "context canceled" error. A better user experience would be to close the provider gracefully first.
Suggested fix:
if cp, ok := provider.(providers.SessionProvider); ok {
cp.Close()
}
cancel()🟡 Suggestions
4. Interface naming
File: pkg/providers/types.go:33
SessionProvider suggests "a provider that has a session", but the actual semantics is "a provider that needs explicit cleanup". Consider naming it CloseableProvider or StatefulProvider for clarity.
5. Test coverage
This PR touches critical concurrency and resource management code but adds no tests. Consider adding:
TestNewGitHubCopilotProvider_CreateSessionError— verify client is cleaned up on errorTestClose— verify behavior after Close()TestChat_ConcurrentSafety— verify concurrent calls work correctly
6. Unrelated changes in registry_test.go
The map[string]interface{} → map[string]any changes are good style improvements but unrelated to session management. Consider splitting into a separate PR to keep the focus clear.
Summary
The core fix is correct and important. Addressing the lock granularity and empty response handling issues will make this a solid improvement.
Address all feedback from PR review: - Lock granularity - Empty response handling - Shutdown race condition - Interface naming
|
@yinwm
I have tested the changes locally and everything works fine. Regarding unit tests, I apologize that I haven't added them yet due to my limited experience with testing. This is an area I need to improve. If you could provide some guidance or examples, I'd be happy to learn and add the tests. Thank you for your understanding! |
fix: better session management for `github_copilot_provider`
📝 Description
🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
📚 Technical Context (Skip for Docs)
🧪 Test Environment
📸 Evidence (Optional)
Click to view Logs/Screenshots
☑️ Checklist