Part of epic #3602.
Scope
Implement GonkaProvider covering the non-tools subset of LlmProvider: chat, chat_stream, embed, embed_batch. Tool-calling and structured output land in #10.
Design
GonkaProvider owns:
- inner: OpenAiProvider, used only for body construction (debug_request_json) and for the supports_* capability flags (e.g., supports_embeddings).
- signer: Arc<RequestSigner>
- pool: Arc<EndpointPool>
- client: reqwest::Client from crate::http::llm_client()
- timeout: Duration
For each call:
- Build body via inner.debug_request_json(messages, &[], stream).
- body_bytes = serde_json::to_vec(&body)?.
- Loop: pick endpoint, sign with fresh timestamp_ns, POST with raw bytes via RequestBuilder::body(body_bytes.clone()); on retry, re-sign with a fresh timestamp.
- Decode response via shared openai/wire.rs types (the extraction is part of this PR).
- Streaming: reuse crate::sse::openai_sse_to_stream(response) after the signed POST.
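
The loop above can be sketched in a few lines. This is a synchronous, std-only illustration rather than the planned implementation: SignedAttempt, send_with_retry, the round-robin endpoint choice, and the clock / sign / send closures are hypothetical stand-ins for EndpointPool, RequestSigner, and the reqwest call. The point it shows is that the body is serialized once and reused byte-for-byte, while the timestamp and signature are regenerated on every attempt.

```rust
/// One signed attempt, as the transport would see it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct SignedAttempt {
    pub endpoint: String,
    pub timestamp_ns: u128,
    pub signature: Vec<u8>,
    pub body: Vec<u8>,
}

/// Tries endpoints in round-robin order and re-signs with a fresh timestamp
/// before every attempt. The caller serializes the body once; the same bytes
/// are signed and sent each time, so the signature always covers what is sent.
pub fn send_with_retry<C, S, T>(
    endpoints: &[&str],
    body: &[u8],
    max_attempts: usize,
    mut clock: C,
    sign: S,
    mut send: T,
) -> Result<Vec<u8>, String>
where
    C: FnMut() -> u128,                                  // stand-in for a nanosecond clock
    S: Fn(&[u8], u128) -> Vec<u8>,                       // stand-in for the request signer
    T: FnMut(&SignedAttempt) -> Result<Vec<u8>, String>, // stand-in for the HTTP POST
{
    if endpoints.is_empty() {
        return Err("no endpoints configured".to_string());
    }
    let mut last_err = String::from("no attempts made");
    for attempt in 0..max_attempts {
        // Assumption: plain round-robin; the real pool may choose differently.
        let endpoint = endpoints[attempt % endpoints.len()];
        // A fresh timestamp per attempt, never reused across retries.
        let timestamp_ns = clock();
        let signed = SignedAttempt {
            endpoint: endpoint.to_string(),
            timestamp_ns,
            signature: sign(body, timestamp_ns),
            body: body.to_vec(),
        };
        match send(&signed) {
            Ok(response) => return Ok(response),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

In the real provider the clock would presumably be something like SystemTime::now().duration_since(UNIX_EPOCH) converted to nanoseconds, and send would be the awaited POST; both are injected here only so the retry behaviour can be exercised deterministically.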
Files to create / modify
- crates/zeph-llm/src/openai/wire.rs (new): extract OpenAiChatResponse, OpenAiUsage, ChatChoice, ChatMessage, EmbeddingResponse, EmbeddingData from openai/mod.rs (currently pub(crate)); make pub(crate) and re-import in both modules.
- crates/zeph-llm/src/gonka/mod.rs: GonkaProvider struct, partial LlmProvider impl (chat / chat_stream / embed / embed_batch / name / model_identifier / supports_*).
- crates/zeph-llm/src/gonka/tests.rs: wiremock test:
  - Records inbound request, verifies Authorization is base64 of 64 bytes, X-Timestamp is decimal nanoseconds, X-Requester-Address matches the signer.
  - Returns canned OpenAI-shaped response; asserts decoded ChatResponse::Text matches.
  - Streaming test: returns SSE; asserts streamed deltas decoded correctly.
- insta snapshot of the signed-request body for drift detection.
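
The header checks in the wiremock test could be backed by two small predicates, sketched here with std only. The helper names (decode_base64, is_signature_header, is_decimal_nanos) are hypothetical, and the sketch assumes the standard base64 alphabet with `=` padding, which the issue does not specify; a real test would more likely use an existing base64 crate.

```rust
/// Decodes standard, padded base64; returns None on any malformed input.
/// Assumption: standard alphabet with '=' padding (not the URL-safe variant).
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let bytes = input.as_bytes();
    if bytes.len() % 4 != 0 {
        return None;
    }
    let trimmed = input.trim_end_matches('=');
    if bytes.len() - trimmed.len() > 2 {
        return None;
    }
    let mut out = Vec::with_capacity(bytes.len() / 4 * 3);
    let mut acc: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `acc`
    for &c in trimmed.as_bytes() {
        // Any character outside the alphabet (including a stray '=') fails.
        let value = ALPHABET.iter().position(|&a| a == c)? as u32;
        acc = (acc << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// True if the header value is base64 that decodes to exactly 64 bytes.
pub fn is_signature_header(value: &str) -> bool {
    matches!(decode_base64(value), Some(bytes) if bytes.len() == 64)
}

/// True if the header value is a non-empty run of ASCII digits that fits u128.
pub fn is_decimal_nanos(value: &str) -> bool {
    !value.is_empty()
        && value.bytes().all(|b| b.is_ascii_digit())
        && value.parse::<u128>().is_ok()
}
```

Under that padding assumption, 64 bytes always encode to 88 characters ending in `==`, so a length check alone would catch most regressions; decoding additionally rejects characters outside the alphabet.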
Await discipline (per .claude/rules/rust-code.md)
- Every external .await wrapped in tokio::time::timeout(self.timeout, ...).
- Tracing spans: llm.gonka.request, llm.gonka.sign, llm.gonka.endpoint.next.
- debug! before / after each await.
- No locks held across await; signer is shared via Arc and signing takes &self.
Acceptance
- cargo nextest run -p zeph-llm -E 'test(gonka)' green.
- cargo clippy --workspace --features full -- -D warnings green.
- cargo +nightly fmt --check green.
- cargo insta test --workspace --features full --check --lib --bins green.
- All pub items have /// doc comments with # Examples.
- CHANGELOG.md [Unreleased] updated.
Depends on
#3607, #3608, #3609, #3610.
Size
L (~8h)