feat(transformer): add cross-provider search citation and annotation support #1643
Preserve unified web_search tools when Anthropic-style requests are routed through the OpenAI Responses outbound transformer, including the documented filters schema and request coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep citation and web search metadata intact across OpenAI Responses, Anthropic, and Gemini transforms so streamed and non-streamed outputs stay semantically consistent end to end. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restore the current branch signatures in the Responses, Anthropic, and Gemini citation paths after rebase conflicts mixed in incompatible scope-aware variants. Update orchestrator citation tests to initialize the channel limiter manager required by the current middleware stack. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
#1313
Code Review
This pull request introduces support for web search citations and grounding metadata across multiple LLM providers (Anthropic, Gemini, and OpenAI). It updates the unified LLM schema to include citation annotations (StartIndex, EndIndex) and adds logic to preserve these annotations during streaming and non-streaming transformations. Additionally, it includes comprehensive integration tests to validate citation round-trips and grounding metadata handling. I have kept the review comment regarding indentation, as it points to a readability issue in the aggregator logic.
        if contentBlocks[index].Type == "text" {
            contentBlocks[index].Citations = append(contentBlocks[index].Citations, *event.Delta.Citation)
        }
    }

    if event.Delta.Thinking != nil {
        if contentBlocks[index].Type == "thinking" {
            if contentBlocks[index].Thinking == nil {
                contentBlocks[index].Thinking = lo.ToPtr("")
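For readers following the aggregator discussion, here is a minimal sketch of how streamed citation and thinking deltas might be folded into content blocks with flatter guards. The types `Citation`, `ContentBlock`, and `Delta` and the function `applyDelta` are hypothetical stand-ins written for this illustration; they are not the PR's actual definitions, and only the field names visible in the excerpt above are reused.

```go
package main

// Citation is a hypothetical, simplified citation payload.
type Citation struct {
	URL   string
	Title string
}

// ContentBlock is a hypothetical stand-in for one aggregated content block.
type ContentBlock struct {
	Type      string // "text" or "thinking" in this sketch
	Text      string
	Thinking  *string
	Citations []Citation
}

// Delta is a hypothetical stand-in for one streamed delta event.
type Delta struct {
	Text     *string
	Thinking *string
	Citation *Citation
}

// applyDelta folds one streamed delta into the block at index.
// Each branch combines the nil check and the block-type check in a single
// condition, so the citation and thinking paths never nest more than one level.
func applyDelta(blocks []ContentBlock, index int, d Delta) {
	if index < 0 || index >= len(blocks) {
		return // assumption: deltas for unknown blocks are ignored
	}
	b := &blocks[index]

	if d.Text != nil && b.Type == "text" {
		b.Text += *d.Text
	}
	if d.Citation != nil && b.Type == "text" {
		// Citations only attach to text blocks, as in the excerpt.
		b.Citations = append(b.Citations, *d.Citation)
	}
	if d.Thinking != nil && b.Type == "thinking" {
		if b.Thinking == nil {
			empty := ""
			b.Thinking = &empty
		}
		*b.Thinking += *d.Thinking
	}
}
```

Calling `applyDelta` once per stream event leaves each text block carrying its citations in arrival order, which is the property the non-streamed output also needs to hold.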
Greptile Summary
This PR adds end-to-end preservation of search citations and annotations across all four transformer layers (OpenAI Responses, OpenAI Chat, Anthropic, and Gemini), enabling citation round-trips when a request passes through multiple provider adapters.
Confidence Score: 5/5
Safe to merge: the core citation round-trip logic is well-tested across all four transformer layers with no correctness regressions found. The streaming metadata fallback path in […]
No files require special attention beyond the minor comment wording in […]
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client
participant ResponsesOutbound as Responses Outbound (OAI→LLM)
participant LLMResponse as llm.Response
participant AnthropicInbound as Anthropic Inbound (LLM→Anthropic)
participant GeminiInbound as Gemini Inbound (LLM→Gemini)
Note over ResponsesOutbound: web_search_call OutputItemDone
ResponsesOutbound->>LLMResponse: TransformerMetadata["openai_responses_web_search_calls"]
Note over ResponsesOutbound: message OutputItemDone (with annotations)
ResponsesOutbound->>LLMResponse: Choices[0].Delta.Annotations + TransformerMetadata emitted
Note over ResponsesOutbound: response.completed (fallback)
ResponsesOutbound->>LLMResponse: TransformerMetadata emitted if not yet sent
LLMResponse->>AnthropicInbound: message.Annotations + TransformerMetadata
AnthropicInbound->>AnthropicInbound: "citationFromLLMAnnotation()<br/>(url_citation to web_search_result_location if OAI web search metadata present)"
AnthropicInbound->>AnthropicInbound: "mergeAnthropicResponseContentBlocks()<br/>(restore native provider blocks from metadata)"
AnthropicInbound-->>Client: Anthropic Message with citations
LLMResponse->>GeminiInbound: message.Annotations
GeminiInbound->>GeminiInbound: citationMetadataFromLLMAnnotations()
GeminiInbound-->>Client: Gemini GenerateContentResponse with CitationMetadata
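The Gemini leg of the diagram can be illustrated with a small sketch that maps unified URL annotations to Gemini-style citation entries. `Annotation`, `CitationSource`, and `toCitationSources` are hypothetical names for this example and do not reproduce `citationMetadataFromLLMAnnotations()`. Dropping annotations with an invalid span is an assumption of this sketch, as is passing the text length in whatever unit the indices use.

```go
package main

// Annotation is a hypothetical unified annotation carrying a cited span.
type Annotation struct {
	Type       string // e.g. "url_citation"
	URL        string
	Title      string
	StartIndex int
	EndIndex   int
}

// CitationSource is a hypothetical Gemini-style citation entry.
type CitationSource struct {
	StartIndex int
	EndIndex   int
	URI        string
}

// toCitationSources converts URL annotations into citation sources.
// Assumption: annotations of other types, with an empty URL, or with a span
// outside [0, textLen] are skipped rather than repaired.
func toCitationSources(anns []Annotation, textLen int) []CitationSource {
	var out []CitationSource
	for _, a := range anns {
		if a.Type != "url_citation" || a.URL == "" {
			continue
		}
		if a.StartIndex < 0 || a.EndIndex < a.StartIndex || a.EndIndex > textLen {
			continue
		}
		out = append(out, CitationSource{
			StartIndex: a.StartIndex,
			EndIndex:   a.EndIndex,
			URI:        a.URL,
		})
	}
	return out
}
```

Keeping the start and end indices unchanged is what lets a citation survive a round trip: the same span can be mapped back to a unified annotation on the way in.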
Reviews (3): Last reviewed commit: "fix: preserve responses web search calls..."
Refine Anthropic stream stop handling and align test/docs formatting with CI and review feedback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep web_search_call metadata on an existing outbound chunk and restore it into the final responses stream output so annotation-free search responses round-trip correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
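A rough sketch of the emit-once behaviour described above: search-call metadata is recorded when it arrives, attached to the first suitable outbound chunk, and attached at completion only if nothing carried it earlier. `Chunk`, `metadataCarrier`, and the chunk kinds are hypothetical names invented for this illustration; only the metadata key `openai_responses_web_search_calls` comes from the PR.

```go
package main

// Chunk is a hypothetical stand-in for one outbound stream chunk.
type Chunk struct {
	Kind     string // "text_delta", "message_done", or "completed" in this sketch
	Metadata map[string]any
}

// metadataCarrier remembers pending metadata and whether it was already sent.
type metadataCarrier struct {
	pending map[string]any
	sent    bool
}

// record stores a metadata value until a chunk is available to carry it.
func (c *metadataCarrier) record(key string, value any) {
	if c.pending == nil {
		c.pending = map[string]any{}
	}
	c.pending[key] = value
}

// attach puts the pending metadata on ch at most once per stream.
// Assumption: only "message_done" and "completed" chunks may carry metadata,
// so "completed" acts as the fallback when no message chunk was emitted.
func (c *metadataCarrier) attach(ch *Chunk) {
	if c.sent || len(c.pending) == 0 {
		return
	}
	if ch.Kind != "message_done" && ch.Kind != "completed" {
		return
	}
	ch.Metadata = c.pending
	c.sent = true
}
```

With this shape, a search response that produces no annotated message still delivers its `openai_responses_web_search_calls` entry on the completion chunk, and a response that does produce one does not deliver it twice.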
This is awesome.
Summary
Test plan
go -C "/Users/bytedance/Dev/axonhub/.claude/worktrees/openai-responses-web-search/llm" test ./transformer/openai/responses -run 'Integration|Stream' -count=1
go -C "/Users/bytedance/Dev/axonhub/.claude/worktrees/openai-responses-web-search/llm" test ./transformer/openai ./transformer/anthropic ./transformer/gemini -run 'Integration|Stream' -count=1
go -C "/Users/bytedance/Dev/axonhub/.claude/worktrees/openai-responses-web-search" test ./internal/server/orchestrator ./internal/server/api -count=1
🤖 Generated with Claude Code