Pull request overview
This pull request introduces streaming support for transcription and translation operations, allowing clients to receive segments in real-time via server-sent events (SSE). It also adds configurable audio segmentation options and refactors the API for improved flexibility.
Changes:
- Adds streaming support with SSE for transcription/translation in both server and client
- Introduces audio segmenter configuration options (min-silence-size, max-segment-size)
- Refactors manager constructor and option APIs for consistency
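For readers unfamiliar with the wire format, the server side of segment streaming over SSE can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Segment` fields, the `segment` event name, and the `writeEvent`, `streamSegments`, and `render` helpers are all hypothetical, and the real handler presumably takes its event names from the schema constants.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Segment is a hypothetical stand-in for the project's segment type.
type Segment struct {
	ID   int    `json:"id"`
	Text string `json:"text"`
}

// writeEvent writes one SSE frame ("event: <name>", "data: <json>", blank
// line) and flushes it so the client sees it immediately.
func writeEvent(w http.ResponseWriter, name string, v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return err
	}
	if _, err := fmt.Fprintf(w, "event: %s\ndata: %s\n\n", name, data); err != nil {
		return err
	}
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	return nil
}

// streamSegments is a hypothetical handler that emits each segment as its
// own event. The event name "segment" is an assumption for this sketch.
func streamSegments(segments []Segment) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		w.WriteHeader(http.StatusOK)
		for _, s := range segments {
			if err := writeEvent(w, "segment", s); err != nil {
				return // client went away or encoding failed
			}
		}
	}
}

// render runs the handler against an in-memory recorder and returns the body.
func render(segments []Segment) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/transcribe", nil)
	streamSegments(segments)(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Print(render([]Segment{{ID: 0, Text: " hello"}, {ID: 1, Text: " world"}}))
}
```

Flushing after every frame is what makes delivery real-time; without it the response may sit in a buffer until the handler returns.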
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| pkg/whisper/whisper_test.go | Updated tests to use instance methods instead of global Close() function |
| pkg/whisper/whisper.go | Removed global Close() convenience function |
| pkg/whisper/task_test.go | Updated test assertion to match new WriteText behavior with leading space |
| pkg/whisper/task.go | Added OTEL tracing and debug logging for segment processing |
| pkg/whisper/opt.go | Added OptTracer option for OpenTelemetry support |
| pkg/whisper/integration_test.go | Updated tests to use instance-based Close() calls |
| pkg/schema/segment.go | Added SegmentWriter interface for real-time segment streaming |
| pkg/schema/request.go | Removed Stream field from TranslateRequest |
| pkg/opt.go | Refactored option functions, added segmenter and whisper options |
| pkg/manager_test.go | Updated test signatures to match new manager constructor |
| pkg/manager.go | Refactored constructor, added streaming callback support |
| pkg/httphandler/translate.go | Implemented SSE streaming for translation endpoint |
| pkg/httphandler/transcribe.go | Implemented SSE streaming for transcription endpoint with streamSegmentWriter |
| pkg/httphandler/model_test.go | Updated test to use new constructor signature |
| pkg/httphandler/model.go | Standardized SSE event types using schema constants |
| pkg/httpclient/translate.go | Added client-side streaming support with segment callbacks |
| pkg/httpclient/transcribe.go | Added client-side streaming support with segment callbacks |
| pkg/httpclient/opts.go | Removed WithStream, added WithSegmentCallback and WithProgressCallback |
| pkg/httpclient/model.go | Updated to use schema constants for SSE event types |
| cmd/gowhisper/transcribe.go | Added real-time segment output with streaming callback |
| cmd/gowhisper/server.go | Added segmenter configuration options to CLI |
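The client-side rows above describe segment callbacks driven by an SSE stream. A rough sketch of that pattern, assuming a `segment` event name and a JSON payload, might look like the following. The `readSegments` and `collectText` names and the `Segment` shape are hypothetical and are not taken from `pkg/httpclient`.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Segment is a hypothetical stand-in for the project's segment type.
type Segment struct {
	ID   int    `json:"id"`
	Text string `json:"text"`
}

// readSegments scans an SSE stream and calls fn once per "segment" event.
// Other event types and comment lines are ignored. An event that is not
// terminated by a blank line before EOF is dropped, as in the SSE spec.
func readSegments(r io.Reader, fn func(Segment) error) error {
	scanner := bufio.NewScanner(r)
	var event string
	var data []string
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case line == "":
			// A blank line terminates the current event.
			if event == "segment" && len(data) > 0 {
				var seg Segment
				if err := json.Unmarshal([]byte(strings.Join(data, "\n")), &seg); err != nil {
					return err
				}
				if err := fn(seg); err != nil {
					return err
				}
			}
			event, data = "", nil
		case strings.HasPrefix(line, "event:"):
			event = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
		case strings.HasPrefix(line, "data:"):
			payload := strings.TrimPrefix(line, "data:")
			data = append(data, strings.TrimPrefix(payload, " "))
		}
	}
	return scanner.Err()
}

// collectText concatenates the text of every segment in the stream.
func collectText(stream string) (string, error) {
	var sb strings.Builder
	err := readSegments(strings.NewReader(stream), func(s Segment) error {
		sb.WriteString(s.Text)
		return nil
	})
	return sb.String(), err
}

func main() {
	stream := "event: segment\ndata: {\"id\":0,\"text\":\" hello\"}\n\n" +
		"event: progress\ndata: {}\n\n" +
		"event: segment\ndata: {\"id\":1,\"text\":\" world\"}\n\n"
	text, err := collectText(stream)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", text)
}
```

Returning the callback's error from the read loop lets a caller abort the stream early. Note that `bufio.Scanner` caps line length at 64 KiB by default, so very large payloads would need a bigger buffer.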
Comments suppressed due to low confidence (1)
pkg/manager_test.go:283
- The test name "TestManager_Transcribe_ElevenLabs_StreamUnsupported" is misleading because the Stream field has been removed from TranslateRequest, and the test no longer validates stream parameter rejection. The test should either be removed or renamed to reflect what it actually tests (e.g., a basic transcription call).
```go
func TestManager_Transcribe_ElevenLabs_StreamUnsupported(t *testing.T) {
	tmpDir := t.TempDir()
	manager, err := pkg.New(tmpDir, pkg.OptElevenLabsKey("test-key"))
	if err != nil {
		t.Fatal(err)
	}
	defer manager.Close()

	// Stream should be rejected by ElevenLabs
	req := &schema.TranscribeRequest{
		TranslateRequest: schema.TranslateRequest{
			Model: "scribe_v2",
		},
	}
	_, err = manager.Transcribe(context.Background(), nil, bytes.NewReader([]byte{}), req)
	if err == nil {
		t.Error("expected error for stream parameter in ElevenLabs transcription")
	}
}
```
Pull request overview
Copilot reviewed 34 out of 34 changed files in this pull request and generated 6 comments.
This pull request introduces streaming support for transcription and translation, allowing real-time segment delivery over server-sent events (SSE). It also adds configurable audio segmentation options and refactors several APIs for improved flexibility and clarity.
Streaming support and segmenter configuration:
- […] (pkg/httpclient/transcribe.go, pkg/httpclient/translate.go, pkg/httphandler/transcribe.go, pkg/httphandler/translate.go, cmd/gowhisper/transcribe.go).
- […] (min-silence-size, max-segment-size) to allow fine-tuning of audio segmentation during processing (cmd/gowhisper/server.go, pkg/manager.go).
- […] WithSegmenterOpt and WithSegmentCallback functions (pkg/manager.go, pkg/httpclient/opts.go, cmd/gowhisper/server.go).
Server-sent events and schema improvements:
- […] (pkg/httphandler/model.go, pkg/httpclient/model.go, pkg/httphandler/transcribe.go).
- […] (pkg/httphandler/transcribe.go, pkg/httphandler/translate.go).
Other improvements and refactoring:
- […] (cmd/gowhisper/transcribe.go).
- […] WithStream option (pkg/httpclient/opts.go).
- […] (pkg/httphandler/model_test.go, pkg/manager.go).
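As a rough illustration of how segmenter settings such as min-silence-size and max-segment-size could be carried through an options API, here is a functional-options sketch. Everything in it is an assumption for illustration: the `segmenterConfig` type, the `optMinSilence` and `optMaxSegment` helpers, the choice of `time.Duration` as the unit, and the placeholder defaults do not come from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// segmenterConfig holds the two tunables named in the PR description.
// Field names and the use of time.Duration are assumptions of this sketch.
type segmenterConfig struct {
	minSilence time.Duration // shortest silence treated as a split point
	maxSegment time.Duration // upper bound on the length of one segment
}

// segmenterOpt mutates the config and may reject invalid values.
type segmenterOpt func(*segmenterConfig) error

// optMinSilence is a hypothetical option setting the minimum silence length.
func optMinSilence(d time.Duration) segmenterOpt {
	return func(c *segmenterConfig) error {
		if d < 0 {
			return errors.New("min silence must not be negative")
		}
		c.minSilence = d
		return nil
	}
}

// optMaxSegment is a hypothetical option setting the maximum segment length.
func optMaxSegment(d time.Duration) segmenterOpt {
	return func(c *segmenterConfig) error {
		if d <= 0 {
			return errors.New("max segment must be positive")
		}
		c.maxSegment = d
		return nil
	}
}

// newSegmenterConfig applies options over placeholder defaults and then
// checks that the two values are mutually consistent.
func newSegmenterConfig(opts ...segmenterOpt) (*segmenterConfig, error) {
	cfg := &segmenterConfig{
		minSilence: 2 * time.Second,  // placeholder default, not from the PR
		maxSegment: 30 * time.Second, // placeholder default, not from the PR
	}
	for _, opt := range opts {
		if err := opt(cfg); err != nil {
			return nil, err
		}
	}
	if cfg.minSilence > cfg.maxSegment {
		return nil, errors.New("min silence exceeds max segment")
	}
	return cfg, nil
}

func main() {
	cfg, err := newSegmenterConfig(
		optMinSilence(500*time.Millisecond),
		optMaxSegment(20*time.Second),
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.minSilence, cfg.maxSegment)
}
```

Validating inside each option and again after all options are applied keeps a bad CLI flag from reaching the audio pipeline, and the cross-field check can only run once both values are known.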