server : fix proxy client timeout in router mode#22003
Conversation
|
Hi @xris99, thanks for your contribution! Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:
Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below. |
| cli->set_write_timeout(timeout_read, 0); // reversed for cli (client) vs srv (server) | ||
| cli->set_read_timeout(timeout_write, 0); |
There was a problem hiding this comment.
this is supposed to be reversed, right? ref the comment that you deleted
There was a problem hiding this comment.
The comment is an incorrect reasoning and the timeouts are swapped wrongly. From the httplib client's perspective, read = reading the response, write = sending the request body, which maps directly to the server-side meanings. In consequence the timeouts must not be swapped.
There was a problem hiding this comment.
IIRC the timeout_read/write is read from CLI arg which is intended to be set to server. have you acknowledge where this params come from?
There was a problem hiding this comment.
Hmm,... this actually does not make sense at all and could be significantly simplified.
timeout_read and timeout_write are always identical. The --timeout CLI arg sets both to the same value (arg.cpp l2960–2961), defaults are also 600 for both. They cannot be set individually. Swapping them has zero effect; never matteres. I can revert and insert the swap again, but it has zero effect. Just tried to clean up a bit
There was a problem hiding this comment.
you should revert the comment too (in case someone else trying to do the same change here)
There was a problem hiding this comment.
You are right. reverted completely incl. the comment. Sorry for that. Updated with last commit. single line change now for the router timeout. (the actual topic)
43c0d1f to
fd19ccb
Compare
|
You are right, I concede the point on the swap. |
fd19ccb to
d664234
Compare
ngxson
left a comment
There was a problem hiding this comment.
@ServeurpersoCom can you merge this? thanks!
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
…org#18760) (ggml-org#22003) Fixes: ggml-org#18760 Co-authored-by: Christian <christian@example.com>
Problem
In router mode (
--models-dir/--models-preset), the internal HTTP proxy client inserver_http_proxyhad two bugs that causedhttp client error: Failed to read connectionerrors:--timeoutparameter entirely, causing failures when child model workers take longer than 5 seconds to accept the connection (common for large CPU-bound models).set_write_timeoutwas fedtimeout_readandset_read_timeoutwas fedtimeout_write. The original comment acknowledged this with "reversed for cli (client) vs srv (server)" but the reasoning was wrong — the names mean the same thing from both sides.This made router mode unusable for large models (70B+) on CPU where a single request can take 30 minutes or more.
Fix
Three-line change in
tools/server/server-models.cpp:timeout_read(same as the read timeout) instead of hardcoded 5 sset_read_timeout←timeout_readset_write_timeout←timeout_writeThis matches exactly how
server-http.cpp(lines 115–116) sets timeouts for the non-router path, making--timeoutconsistent across both modes.Testing
Starting server in router mode with a large CPU model and
--timeout 36000no longer producesFailed to read connectionafter the default httplib timeout.Fixes #18760