Skip to content

[grpc] Refactor grpc/regular/responses#16509

Merged
slin1237 merged 1 commit intomainfrom
chang/resp-refactor-3
Jan 5, 2026
Merged

[grpc] Refactor grpc/regular/responses#16509
slin1237 merged 1 commit intomainfrom
chang/resp-refactor-3

Conversation

@CatherineSue
Copy link
Copy Markdown
Collaborator

@CatherineSue CatherineSue commented Jan 5, 2026

Motivation

The current state of regular/responses module has a misleading structure.

Current State (2592 lines, 6 files)

regular/responses/
├── mod.rs          (20 lines)   # Module exports
├── context.rs      (70 lines)   # ResponsesContext
├── types.rs        (18 lines)   # Just BackgroundTaskInfo ← merge into context.rs
├── conversions.rs  (428 lines)  # responses_to_chat, chat_to_responses
├── handlers.rs     (812 lines)  # Entry points + non-MCP execution + load_conversation_history
└── tool_loop.rs    (1244 lines) # ALL MCP tool loop logic (too big!)

Modifications

Proposed Structure (mirroring harmony pattern)

regular/responses/
├── mod.rs            (~30 lines)  # Module exports
├── context.rs        (~90 lines)  # ResponsesContext + BackgroundTaskInfo (merged)
├── conversions.rs    (428 lines)  # Unchanged
├── common.rs         (~250 lines) # NEW: Shared helpers
│                                  #   - ToolLoopState (from tool_loop.rs)
│                                  #   - prepare_chat_tools_and_choice
│                                  #   - extract_all_tool_calls_from_chat
│                                  #   - convert_mcp_tools_to_chat_tools
│                                  #   - build_mcp_list_tools_item
│                                  #   - build_mcp_call_item
│                                  #   - generate_mcp_id
│                                  #   - load_conversation_history (from handlers.rs)
├── handlers.rs       (~100 lines) # THIN: Entry points only
│                                  #   - route_responses (dispatches to sync/streaming)
│                                  #   - route_responses_sync
│                                  #   - route_responses_streaming
├── non_streaming.rs  (~400 lines) # NEW: Non-streaming execution
│                                  #   - route_responses_internal (from handlers.rs)
│                                  #   - execute_tool_loop (from tool_loop.rs)
│                                  #   - execute_without_mcp (from handlers.rs)
└── streaming.rs      (~700 lines) # NEW: Streaming execution
                                   #   - execute_tool_loop_streaming (from tool_loop.rs)
                                   #   - execute_tool_loop_streaming_internal
                                   #   - convert_and_accumulate_stream
                                   #   - ChatResponseAccumulator
                                   #   - convert_chat_stream_to_responses_stream (from handlers.rs)
                                   #   - process_and_transform_sse_stream (from handlers.rs)
                                   #   - StreamingResponseAccumulator (from handlers.rs)

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments (/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.
  4. After green CI and required approvals, ask Merge Oncalls to merge.

Current State (2592 lines, 6 files)

regular/responses/
├── mod.rs          (20 lines)   # Module exports
├── context.rs      (70 lines)   # ResponsesContext
├── types.rs        (18 lines)   # Just BackgroundTaskInfo ← merge into context.rs
├── conversions.rs  (428 lines)  # responses_to_chat, chat_to_responses
├── handlers.rs     (812 lines)  # Entry points + non-MCP execution + load_conversation_history
└── tool_loop.rs    (1244 lines) # ALL MCP tool loop logic (too big!)

Proposed Structure (mirroring harmony pattern)

regular/responses/
├── mod.rs            (~30 lines)  # Module exports
├── context.rs        (~90 lines)  # ResponsesContext + BackgroundTaskInfo (merged)
├── conversions.rs    (428 lines)  # Unchanged
├── common.rs         (~250 lines) # NEW: Shared helpers
│                                  #   - ToolLoopState (from tool_loop.rs)
│                                  #   - prepare_chat_tools_and_choice
│                                  #   - extract_all_tool_calls_from_chat
│                                  #   - convert_mcp_tools_to_chat_tools
│                                  #   - build_mcp_list_tools_item
│                                  #   - build_mcp_call_item
│                                  #   - generate_mcp_id
│                                  #   - load_conversation_history (from handlers.rs)
├── handlers.rs       (~100 lines) # THIN: Entry points only
│                                  #   - route_responses (dispatches to sync/streaming)
│                                  #   - route_responses_sync
│                                  #   - route_responses_streaming
├── non_streaming.rs  (~400 lines) # NEW: Non-streaming execution
│                                  #   - route_responses_internal (from handlers.rs)
│                                  #   - execute_tool_loop (from tool_loop.rs)
│                                  #   - execute_without_mcp (from handlers.rs)
└── streaming.rs      (~700 lines) # NEW: Streaming execution
                                   #   - execute_tool_loop_streaming (from tool_loop.rs)
                                   #   - execute_tool_loop_streaming_internal
                                   #   - convert_and_accumulate_stream
                                   #   - ChatResponseAccumulator
                                   #   - convert_chat_stream_to_responses_stream (from handlers.rs)
                                   #   - process_and_transform_sse_stream (from handlers.rs)
                                   #   - StreamingResponseAccumulator (from handlers.rs)
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@slin1237 slin1237 merged commit 5a2b1ed into main Jan 5, 2026
72 checks passed
@slin1237 slin1237 deleted the chang/resp-refactor-3 branch January 5, 2026 18:21
jamesjxliu pushed a commit to jamesjxliu/sglang that referenced this pull request Jan 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants