Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Code Review
This pull request introduces the "Auto Reasoning Effort" feature, which normalizes model names with suffixes like -xhigh or -max into a base model name and an associated reasoning_effort value. The changes include a new UI setting, backend schema updates, and a middleware component in the orchestration pipeline, along with specific handling for DeepSeek models in the Anthropic transformer. Feedback suggests ensuring the middleware respects explicitly provided reasoning_effort values as documented, removing unused fields and parameters in the new middleware, and cleaning up redundant logic in the model splitting function.
|
|
||
| originalModel := llmRequest.Model | ||
| llmRequest.Model = baseModel | ||
| llmRequest.ReasoningEffort = reasoningEffort |
There was a problem hiding this comment.
The implementation currently overrides llmRequest.ReasoningEffort unconditionally. However, the documentation in internal/server/biz/system.go (line 260) states that the suffix should only be applied if the request did not already specify reasoning_effort explicitly. To align with the intended behavior, you should check if ReasoningEffort is empty before assigning the value from the suffix.
if llmRequest.ReasoningEffort == "" {
llmRequest.ReasoningEffort = reasoningEffort
}| type autoReasoningEffortMiddleware struct { | ||
| pipeline.DummyMiddleware | ||
|
|
||
| inbound *PersistentInboundTransformer |
| func applyAutoReasoningEffort(inbound *PersistentInboundTransformer, systemService *biz.SystemService) pipeline.Middleware { | ||
| return &autoReasoningEffortMiddleware{ | ||
| inbound: inbound, | ||
| systemService: systemService, | ||
| } |
There was a problem hiding this comment.
The inbound parameter is unused in the middleware implementation. You should remove it from the function signature and the struct initialization.
func applyAutoReasoningEffort(systemService *biz.SystemService) pipeline.Middleware {
return &autoReasoningEffortMiddleware{
systemService: systemService,
}
}| if base == "" { | ||
| return "", "", false | ||
| } |
| // Add inbound middlewares (executed after inbound.TransformRequest) | ||
| middlewares = append(middlewares, | ||
| enforceQuota(inbound, processor.QuotaService), | ||
| applyAutoReasoningEffort(inbound, processor.SystemService), |
| name: "suffix reasoning effort overrides explicit request value", | ||
| settings: biz.SystemModelSettings{ | ||
| AutoReasoningEffort: true, | ||
| }, | ||
| request: &llm.Request{ | ||
| Model: "gpt-5.4-xhigh", | ||
| ReasoningEffort: "high", | ||
| }, | ||
| wantModel: "gpt-5.4", | ||
| wantEffort: "xhigh", | ||
| }, |
There was a problem hiding this comment.
If the middleware is updated to respect explicit ReasoningEffort values (as suggested in auto_reasoning_effort.go), this test case should be updated to reflect that the explicit value high should be preserved instead of being overridden by xhigh.
| name: "suffix reasoning effort overrides explicit request value", | |
| settings: biz.SystemModelSettings{ | |
| AutoReasoningEffort: true, | |
| }, | |
| request: &llm.Request{ | |
| Model: "gpt-5.4-xhigh", | |
| ReasoningEffort: "high", | |
| }, | |
| wantModel: "gpt-5.4", | |
| wantEffort: "xhigh", | |
| }, | |
| { | |
| name: "explicit reasoning effort is preserved over suffix", | |
| settings: biz.SystemModelSettings{ | |
| AutoReasoningEffort: true, | |
| }, | |
| request: &llm.Request{ | |
| Model: "gpt-5.4-xhigh", | |
| ReasoningEffort: "high", | |
| }, | |
| wantModel: "gpt-5.4", | |
| wantEffort: "high", | |
| }, |
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Uh oh!
There was an error while loading. Please reload this page.