codex-rs/app-server: graceful websocket restart on Ctrl-C#12517
codex-rs/app-server: graceful websocket restart on Ctrl-C#12517
Conversation
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 11886fcd12
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Co-authored-by: Codex <noreply@openai.com>
bolinfest
left a comment
There was a problem hiding this comment.
I have to come back to this later, but here are my comments thus far.
codex-rs/app-server/tests/suite/v2/connection_handling_websocket.rs
Outdated
Show resolved
Hide resolved
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
owenlin0
left a comment
There was a problem hiding this comment.
Once we receive a ctrl-c I believe we'd want to prevent 1) existing clients from triggering new turn/start and 2) new clients from connecting?
ideally if they do they get a JSON-RPC error indicating that the server is shutting down.
from codex: "After the first Ctrl-C, this loop keeps processing IncomingMessage and ConnectionOpened, and the websocket acceptor is only canceled once running_turn_count reaches zero. In practice that means existing clients can still start another turn/start, and brand new websocket clients can still connect, which can bump the running-turn count back up and postpone the restart indefinitely. That defeats the core guarantee of a graceful drain: once shutdown is requested, the server should stop taking new work and wait only for already-running turns to finish."
codex-rs/app-server/tests/suite/v2/connection_handling_websocket.rs
Outdated
Show resolved
Hide resolved
|
That's intended and IMO better behavior. Turns can take a very long time, it would be bad for the server to be "down" for new turns that whole time. At the same time, given that these tend to be used by a single user, it's likely that there will be a "gap" in turns eventually where a restart is safe. An even better solution is to have another instance ready to accept new turns while existing ones complete on the old instance, but that's complicated enough I'd like to ship this first. |
Co-authored-by: Codex <noreply@openai.com>
Summary
Verification
cargo check -p codex-app-server --quietcargo test -p codex-app-server --test all suite::v2::connection_handling_websocket