
[bot] OpenAI: Streaming audio transcription (audio.transcriptions.create(stream=True)) not instrumented #374

@braintrust-bot

Description


Summary

The OpenAI audio.transcriptions.create(stream=True) streaming mode is not instrumented. When called with stream=True and models like gpt-4o-transcribe or gpt-4o-mini-transcribe, the OpenAI SDK returns an SSE stream of transcript.text.delta events followed by a transcript.text.done event. The current TranscriptionWrapper routes through BaseWrapper.create(), which has no streaming support — it calls the API, immediately tries to process the response as a single object, and closes the span. No transcript delta events are accumulated, no time_to_first_token is measured, and the caller may receive a broken or prematurely consumed stream.
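For illustration, here is a minimal simulation of the event sequence described above. The event type strings come from this issue; the `delta` and `text` field names are assumptions inferred from the event names, not verified against the SDK:

```python
# Simulated shape of the SSE events yielded by
# audio.transcriptions.create(stream=True). Field names are assumptions.
from dataclasses import dataclass

@dataclass
class Event:
    type: str
    delta: str = ""
    text: str = ""

stream = [
    Event(type="transcript.text.delta", delta="Hello"),
    Event(type="transcript.text.delta", delta=" world"),
    Event(type="transcript.text.done", text="Hello world"),
]

# Instrumentation would need to accumulate the deltas across the stream:
parts = [e.delta for e in stream if e.type == "transcript.text.delta"]
final_text = "".join(parts)
assert final_text == "Hello world"
```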

By contrast, ChatCompletionWrapper.create() and ResponseWrapper.create() both have explicit if stream: branches with dedicated stream proxy classes (_TracedStream / _AsyncTracedStream) that yield chunks to the caller while accumulating them for span logging. The transcription path lacks this entirely.

Non-streaming transcription (stream=False, the default) works correctly — #174 added the TranscriptionWrapper and it handles that case fine.

What is missing

| OpenAI Audio Transcription Mode | Instrumented? |
| --- | --- |
| audio.transcriptions.create() (non-streaming) | Yes |
| audio.transcriptions.create(stream=True) (streaming) | No |
| audio.translations.create() | Yes |
| audio.speech.create() | Yes |

How the gap manifests

_make_base_wrapper_callback (line 329 of tracing.py) checks kwargs.get("stream", False) but only uses it to decide whether to skip with_raw_response. It still routes to BaseWrapper.create() (line 346), which:

  1. Calls the API — gets back an SSE streaming iterator
  2. Calls _try_to_dict(raw_response) on the iterator — produces garbage or an empty dict
  3. Calls self.process_output(log_response, span) — logs meaningless output
  4. Closes the span immediately — no stream consumption tracked
  5. Returns the (possibly partially consumed) iterator to the caller

What should happen

The TranscriptionWrapper (or a new streaming-aware subclass) should:

  • Detect stream=True
  • Return a traced stream proxy (like _TracedStream) that yields transcript.text.delta events to the caller
  • Accumulate delta text across chunks
  • Measure time_to_first_token on the first delta
  • Log the final accumulated transcript text as span output when the stream is exhausted
  • Close the span only after stream consumption

This matches the existing patterns in ChatCompletionWrapper.create() (line 408) and ResponseWrapper.create() (line 890).

Upstream API details

OpenAI's streaming transcription is available in openai==2.33.0 (the pinned latest in this repo).

Braintrust docs status

not_found — The OpenAI integration page does not mention streaming audio transcription.

Upstream sources

Local files inspected

  • py/src/braintrust/integrations/openai/tracing.py:
    • _make_base_wrapper_callback (line 329) — routes stream=True through BaseWrapper.create() with no streaming handling
    • BaseWrapper.create() (line 1210) — no if stream: branch, processes response as single object
    • TranscriptionWrapper (line 1411) — inherits from BaseWrapper, no streaming override
    • ChatCompletionWrapper.create() (line 408) — has proper if stream: branch with _TracedStream (the pattern to follow)
    • ResponseWrapper.create() (line 890) — has proper if stream: branch (another pattern to follow)
  • py/src/braintrust/integrations/openai/patchers.py — _WrapTranscriptions patcher targets Transcriptions.create and AsyncTranscriptions.create
  • py/src/braintrust/integrations/openai/test_openai.py — no test cases for audio.transcriptions.create(stream=True)

Relationship to existing issues
