
Djt/0119/realtime #89

Merged
djthorpe merged 9 commits into main from djt/0119/realtime
Jan 21, 2026


@djthorpe
Member

This pull request introduces streaming support for transcription and translation, allowing real-time segment delivery over server-sent events (SSE). It also adds configurable audio segmentation options and refactors several APIs for improved flexibility and clarity.

Streaming support and segmenter configuration:

  • Added streaming transcription and translation: Clients can now receive segments in real time using callbacks and SSE, both in the HTTP server and client libraries (pkg/httpclient/transcribe.go, pkg/httpclient/translate.go, pkg/httphandler/transcribe.go, pkg/httphandler/translate.go, cmd/gowhisper/transcribe.go).
  • Introduced audio segmenter options (min-silence-size, max-segment-size) to allow fine-tuning of audio segmentation during processing (cmd/gowhisper/server.go, pkg/manager.go).
  • Refactored manager and option APIs to support new segmenter and streaming options, including new WithSegmenterOpt and WithSegmentCallback functions (pkg/manager.go, pkg/httpclient/opts.go, cmd/gowhisper/server.go).

Server-sent events and schema improvements:

  • Standardized SSE event types for progress, done, and error events in both model download and transcription endpoints, using constants from the schema package (pkg/httphandler/model.go, pkg/httpclient/model.go, pkg/httphandler/transcribe.go).
  • Implemented server-side streaming writers for segments, emitting events as segments are produced (pkg/httphandler/transcribe.go, pkg/httphandler/translate.go).

Other improvements and refactoring:

  • Improved command-line output for streaming transcription, including proper VTT header handling and real-time output (cmd/gowhisper/transcribe.go).
  • Cleaned up option handling and removed the unused WithStream option (pkg/httpclient/opts.go).
  • Updated tests and function signatures to match new manager constructor and APIs (pkg/httphandler/model_test.go, pkg/manager.go).

Copilot AI review requested due to automatic review settings January 19, 2026 16:47
@djthorpe djthorpe self-assigned this Jan 19, 2026

Copilot AI left a comment


Pull request overview

This pull request introduces streaming support for transcription and translation operations, allowing clients to receive segments in real-time via server-sent events (SSE). It also adds configurable audio segmentation options and refactors the API for improved flexibility.

Changes:

  • Adds streaming support with SSE for transcription/translation in both server and client
  • Introduces audio segmenter configuration options (min-silence-size, max-segment-size)
  • Refactors manager constructor and option APIs for consistency

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| pkg/whisper/whisper_test.go | Updated tests to use instance methods instead of global Close() function |
| pkg/whisper/whisper.go | Removed global Close() convenience function |
| pkg/whisper/task_test.go | Updated test assertion to match new WriteText behavior with leading space |
| pkg/whisper/task.go | Added OTEL tracing and debug logging for segment processing |
| pkg/whisper/opt.go | Added OptTracer option for OpenTelemetry support |
| pkg/whisper/integration_test.go | Updated tests to use instance-based Close() calls |
| pkg/schema/segment.go | Added SegmentWriter interface for real-time segment streaming |
| pkg/schema/request.go | Removed Stream field from TranslateRequest |
| pkg/opt.go | Refactored option functions, added segmenter and whisper options |
| pkg/manager_test.go | Updated test signatures to match new manager constructor |
| pkg/manager.go | Refactored constructor, added streaming callback support |
| pkg/httphandler/translate.go | Implemented SSE streaming for translation endpoint |
| pkg/httphandler/transcribe.go | Implemented SSE streaming for transcription endpoint with streamSegmentWriter |
| pkg/httphandler/model_test.go | Updated test to use new constructor signature |
| pkg/httphandler/model.go | Standardized SSE event types using schema constants |
| pkg/httpclient/translate.go | Added client-side streaming support with segment callbacks |
| pkg/httpclient/transcribe.go | Added client-side streaming support with segment callbacks |
| pkg/httpclient/opts.go | Removed WithStream, added WithSegmentCallback and WithProgressCallback |
| pkg/httpclient/model.go | Updated to use schema constants for SSE event types |
| cmd/gowhisper/transcribe.go | Added real-time segment output with streaming callback |
| cmd/gowhisper/server.go | Added segmenter configuration options to CLI |

Comments suppressed due to low confidence (1)

pkg/manager_test.go:283

  • The test name "TestManager_Transcribe_ElevenLabs_StreamUnsupported" is misleading because the Stream field has been removed from TranslateRequest, and the test no longer validates stream parameter rejection. The test should either be removed or renamed to reflect what it actually tests (e.g., a basic transcription call).
func TestManager_Transcribe_ElevenLabs_StreamUnsupported(t *testing.T) {
	tmpDir := t.TempDir()
	manager, err := pkg.New(tmpDir, pkg.OptElevenLabsKey("test-key"))
	if err != nil {
		t.Fatal(err)
	}
	defer manager.Close()

	// Stream should be rejected by ElevenLabs
	req := &schema.TranscribeRequest{
		TranslateRequest: schema.TranslateRequest{
			Model: "scribe_v2",
		},
	}
	_, err = manager.Transcribe(context.Background(), nil, bytes.NewReader([]byte{}), req)
	if err == nil {
		t.Error("expected error for stream parameter in ElevenLabs transcription")
	}
}


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@djthorpe djthorpe requested a review from Copilot January 21, 2026 08:16

Copilot AI left a comment


Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 6 comments.



@djthorpe djthorpe merged commit 6abfcff into main Jan 21, 2026
@djthorpe djthorpe deleted the djt/0119/realtime branch January 21, 2026 08:23

2 participants