fix: ensure concurrent server always responds to shutdown signal#3291
Conversation
| Ok(()) | ||
| ); | ||
| } | ||
|
|
There was a problem hiding this comment.
I was slightly horrified to see how complicated the test case above is. I've tested the code more directly (i.e. without the actor actually running, just polling next_request manually) as a result. A better solution might be to try and abstract away some of the setup of the above test, and then run essentially the same test just with fewer in-flight requests to test the correct code path.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files📢 Thoughts on this report? Let us know! |
Robot Results
|
| ); | ||
| } | ||
|
|
||
| #[tokio::test] |
There was a problem hiding this comment.
Good to have these two new tests actually detecting the issue. However, it would be could to put them in a test module that doesn't require cfg(feature = "test-helpers").
f092624 to
8d1b650
Compare
Proposed changes
This fixes an issue that was causing
tedge-mapperto fail to shutdown properly in certain cases, namely when the HTTP actor was busy when the shutdown request was received.The root cause is in these lines of code
thin-edge.io/crates/core/tedge_actors/src/servers/message_boxes.rs
Lines 51 to 65 in 82cde40
When we receive a
RuntimeRequest::Shutdown,self.requests.recv()returnsNone. Since this doesn't matchSome(request), this branch then becomes inactive and we wait forself.running_requests_handlers.next(). If any request handlers are running, when this returns, the code will loop, runningself.requests.recv()again, meaning we've completely ignored the shutdown request and will never receive another one.A similar thing happens if the running requests do not finish in a timely manner after the shutdown request is received. We have a unit test for this, however this only tests the case where the actor is at capacity (i.e. we have a task waiting that cannot be scheduled due to the concurrency limit). Given in this case the actor stops running immediately upon receiving the shutdown request, I have replicated this behaviour for the case where the actor is not at capacity.
Types of changes
Paste Link to the issue
Checklist
cargo fmtas mentioned in CODING_GUIDELINEScargo clippyas mentioned in CODING_GUIDELINESFurther comments