Skip to content

[model-gateway] Tighten visibility across data_connector and grpc module#16516

Merged
slin1237 merged 8 commits intomainfrom
chang/cleanup
Jan 5, 2026
Merged

[model-gateway] Tighten visibility across data_connector and grpc module#16516
slin1237 merged 8 commits intomainfrom
chang/cleanup

Conversation

@CatherineSue
Copy link
Copy Markdown
Collaborator

@CatherineSue CatherineSue commented Jan 5, 2026

Motivation

This PR tightens the visibility of structs and functions across the source files in sgl-model-gateway, mainly data_connector and grpc module.

As a result, some dead code are appearing (as clippy masks dead code when it is pub), the unused methods are removed.

Modifications

  • Tighten visibility across the repo.

  • Add allow(dead_code) to the code that are unused.

  • Removed dead code in grpc/context.rs
      1. StreamingState struct - Removed entirely. The streaming implementation uses local variables within functions instead of this struct.
      2. Removed fields from ResponseState:
        - streaming: StreamingState
        - collected: Option<Vec<ProtoGenerateComplete>>
        - collected_embeddings: Option<Vec<ProtoEmbedComplete>>
        - harmony_parser: Option<HarmonyParserAdapter>
        - harmony_parser_per_index: Option<HashMap<...>>
      3. Removed unused accessor methods from impl RequestContext:
        - request()
        - responses_request()
        - embedding_request()
        - embedding_request_arc()
        - classify_request()
        - classify_request_arc()
      4. Removed unused imports: HashMap, Value, ProtoGenerateComplete

  • Remove unused methods in impl ProtoRequest
      - Removed as_generate() and as_embed() methods since they were truly unused (code uses pattern matching instead)

  • Remove unused methods in impl HarmonyResponsesContext

  • Fix inconsistent patterns in Embedding and Classify
      1. embedding/response_processing.rs:
        - Removed .clone() and Ok(Some(Json(...)))
        - Now returns Ok(None) like Chat/Generate
      2. classify/response_processing.rs:
        - Same change - returns Ok(None)
      3. pipeline.rs execute_embeddings:
        - Changed from error case to extract response from FinalResponse::Embedding(response)
        - Returns axum::Json(response).into_response()
      4. pipeline.rs execute_classify:
        - Same pattern - extracts and returns the response
      5. context.rs FinalResponse:
        - Removed #[allow(dead_code)] from Embedding and Classify variants

  All four endpoint types (Chat, Generate, Embedding, Classify) now follow the same pipeline pattern:
  - Stage stores response in FinalResponse::Variant(response) and returns Ok(None)
  - Pipeline extracts response from FinalResponse and converts to HTTP response

  • Fix inconsistencies in model_id during metrics recording in pipeline.rs
    • Fix execute_generate to use hardcoded UNKNOWN_MODEL_ID constant
    • Remove unnecessary model_for_metrics clone

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments (/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.
  4. After green CI and required approvals, ask Merge Oncalls to merge.

  1. StreamingState struct - Removed entirely. The streaming implementation uses local variables within functions instead of this struct.
  2. Removed fields from ResponseState:
    - streaming: StreamingState
    - collected: Option<Vec<ProtoGenerateComplete>>
    - collected_embeddings: Option<Vec<ProtoEmbedComplete>>
    - harmony_parser: Option<HarmonyParserAdapter>
    - harmony_parser_per_index: Option<HashMap<...>>
  3. Removed unused accessor methods from impl RequestContext:
    - request()
    - responses_request()
    - embedding_request()
    - embedding_request_arc()
    - classify_request()
    - classify_request_arc()
  4. Removed unused imports: HashMap, Value, ProtoGenerateComplete

  The code is now cleaner with only the actually-used items remaining.
  - Kept the request_id() method in impl ProtoRequest - it was actually being used in dispatch_metadata.rs:32
  - Removed as_generate() and as_embed() methods since they were truly unused (code uses pattern matching instead)
  1. embedding/response_processing.rs:
    - Removed .clone() and Ok(Some(Json(...)))
    - Now returns Ok(None) like Chat/Generate
  2. classify/response_processing.rs:
    - Same change - returns Ok(None)
  3. pipeline.rs execute_embeddings:
    - Changed from error case to extract response from FinalResponse::Embedding(response)
    - Returns axum::Json(response).into_response()
  4. pipeline.rs execute_classify:
    - Same pattern - extracts and returns the response
  5. context.rs FinalResponse:
    - Removed #[allow(dead_code)] from Embedding and Classify variants

  All four endpoint types (Chat, Generate, Embedding, Classify) now follow the same pipeline pattern:
  - Stage stores response in FinalResponse::Variant(response) and returns Ok(None)
  - Pipeline extracts response from FinalResponse and converts to HTTP response
  1. Removed model_for_metrics clone
  2. Use model_id.clone() when passing to context (keeps original for metrics)
  3. Use UNKNOWN_MODEL_ID constant instead of hardcoded "unknown"
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@CatherineSue CatherineSue changed the title [model-gateway] Tighten visibility across model-gateway [model-gateway] Tighten visibility across data_connector and grpc module Jan 5, 2026
@slin1237 slin1237 added the run-ci label Jan 5, 2026
@CatherineSue CatherineSue changed the title [model-gateway] Tighten visibility across data_connector and grpc module [model-gateway] Tighten visibility across data_connector and grpc module Jan 5, 2026
@slin1237 slin1237 merged commit 1751c75 into main Jan 5, 2026
72 checks passed
@slin1237 slin1237 deleted the chang/cleanup branch January 5, 2026 20:31
jamesjxliu pushed a commit to jamesjxliu/sglang that referenced this pull request Jan 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants