recv_reset resets closed streams with queued EOS frames #247
carllerche merged 7 commits into hyperium:master
Conversation
carllerche
left a comment
Thanks for digging into this. The idea of the fix looks good to me.
As discussed on Slack, I believe that instead of transitioning to Closed(Scheduled), we would want to transition to some other state. This is because, ideally, we would not send out a RST_STREAM.
tests/stream_states.rs (outdated)

```rust
.idle_ms(1)
// Send the RST_STREAM frame which causes the client to panic.
.send_frame(frames::reset(1).cancel())
.recv_frame(frames::reset(1).cancel())
```
I believe that, ideally, we would avoid sending a RST_STREAM. While the peer should be able to tolerate this, it is not necessary as a RST_STREAM in either direction results in the stream transitioning to closed.
Yeah, I think I misunderstood a sentence in the spec saying that we should allow this as saying that we should do this. Had I known that earlier, this PR would have been finished much quicker. My bad!
Corrected this behaviour in 861b109.
carllerche
left a comment
LGTM, just one request to remove commented out code.
@seanmonstar thoughts?
src/proto/streams/state.rs (outdated)

```rust
}

// /// Returns true if the stream was closed by sending an EOS.
// pub fn is_eos(&self) -> bool {
```
Oops, I meant to mention this in the first review, would it be possible to remove this commented out code?
Oh, yeah, missed that when committing. Sorry.
```diff
 pub fn recv_reset(&mut self, frame: frame::Reset, stream: &mut Stream) {
     // Notify the stream
-    stream.state.recv_reset(frame.reason());
+    stream.state.recv_reset(frame.reason(), stream.is_pending_send);
```
I'd love to eventually make the state private. It reads weird that we need to pass in extra information about the stream itself. (Not your fault, I'm just pining for a better future some day :D).
As far as I can tell, there doesn't appear to be any harm in forcing the transition for all Closed(Cause::EndStream) states on receipt of RST_STREAM, regardless of whether or not the stream has a pending send; but @carllerche suggested that we only transition to Closed(Cause::Proto) if the stream is queued.
I prefer to stay on the conservative side and only make changes that we know are needed.
If there are no queued send frames, then the stream state is actually closed and the received reset should be ignored (I think this is already done)
I don't think I was clear. I get why we're using the queued information, that's fine! We don't want to trigger a reset error for a stream the user has already closed locally.
I was more concerned that it's weird that we pass information about the stream as an argument into a method on its own state. I'd want to one day change the methods on the state to be methods on the stream instead.
In other words, ignore me. I'm just sleep deprived. Move along, nothing to see here!
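The refactor being wished for here might look something like the following hypothetical sketch (none of these exact types or fields are h2's real API): the stream exposes recv_reset itself and supplies its own pending-send flag, so the state machine could eventually become private:

```rust
// Hypothetical sketch of the suggested refactor: the state machine stays
// behind Stream, and the stream supplies its own facts (is_pending_send)
// rather than the caller threading them through as arguments.
struct State {
    reset_reason: Option<u32>,
}

impl State {
    // Same shape as the patched streams::State::recv_reset.
    fn recv_reset(&mut self, reason: u32, queued: bool) {
        if queued {
            self.reset_reason = Some(reason);
        }
    }
}

struct Stream {
    state: State, // private in the imagined future API
    is_pending_send: bool,
}

impl Stream {
    // Callers no longer pass stream-level information explicitly.
    fn recv_reset(&mut self, reason: u32) {
        let queued = self.is_pending_send;
        self.state.recv_reset(reason, queued);
    }
}

fn main() {
    let mut s = Stream {
        state: State { reset_reason: None },
        is_pending_send: true,
    };
    s.recv_reset(8);
    assert_eq!(s.state.reset_reason, Some(8));
}
```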
```rust
// Idling for a moment here is necessary to ensure that the client
// enqueues its TRAILERS frame *before* we send the RST_STREAM frame
// which causes the panic.
.idle_ms(1)
```
Is this racy at all? Maybe we need a way to tell the mocks to wait on a future, to allow for synchronization?
It could potentially be racy; changing the mocks to allow waiting on a future seems like a better approach long-term.
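An explicit signal would make the ordering deterministic regardless of scheduling. A minimal sketch using std channels, assuming nothing about h2's actual mock types (the names here are illustrative only):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical sketch, not h2's mock API: instead of the fixed
// .idle_ms(1) delay, the mock blocks on an explicit signal that the
// client has enqueued its TRAILERS frame before it sends RST_STREAM.
fn run_mock_handshake() -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel::<&'static str>();

    let client = thread::spawn(move || {
        // ... the client enqueues its TRAILERS frame here ...
        tx.send("trailers_enqueued").unwrap();
    });

    let mut events = Vec::new();
    // Mock side: wait for the signal rather than sleeping and hoping.
    events.push(rx.recv().unwrap());
    // Only now is it safe to send the RST_STREAM frame.
    events.push("rst_stream_sent");

    client.join().unwrap();
    events
}

fn main() {
    // The ordering is deterministic regardless of thread scheduling.
    assert_eq!(run_mock_handshake(), ["trailers_enqueued", "rst_stream_sent"]);
}
```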
```rust
// If the stream has a queued EOS frame, transition to peer
// reset.
trace!("recv_reset: reason={:?}; queued=true", reason);
self.inner = Closed(Cause::Proto(reason));
```
So this fixes it because, while the stream is still queued, its state has changed, so the queue later on skips this frame, right?
Fixes #246. Fixes linkerd/linkerd2#607.

An issue currently exists where h2 will panic when a stream transitions from HalfClosed(remote) to Closed(Cause::EndStream) by enqueuing an EOS frame, and then receives a RST_STREAM frame before the queued EOS frame is sent.

In this case, the expect in prioritize::pop_frame (https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L693-L694) is called on a frame which does not have a scheduled reset, because the stream is in the Closed(Cause::EndStream) state. However, the guard which skips this code when the stream is peer reset (https://github.com/carllerche/h2/blob/f8baeb7211985834acaf7ecc70548aafd5d9ef6d/src/proto/streams/prioritize.rs#L584-L586) doesn't cause us to skip in this case, because the stream is in the Closed(Cause::EndStream) state.

I've modified streams::State::recv_reset to take a boolean parameter indicating whether the stream receiving a reset currently has a frame queued to send. If the stream is in the Closed(Cause::EndStream) state, rather than doing nothing, we check whether there is a queued frame. If so, the stream transitions to Closed(Cause::Proto) with the reason for the received reset set as the reason.

This way, when prioritize::pop_frame sees this stream, it will be treated like any other reset stream, draining the queue and transitioning to reset. We no longer hit the expect call that was causing panics in this case.

I've also added a unit test that reproduces the crash against master.
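The transition described above can be sketched as a standalone toy state machine. This is not h2's real State type (which carries more states and data); it only models the recv_reset behavior after the fix, with hypothetical simplified variants:

```rust
// Toy model of the fix (not h2's real types): recv_reset takes a
// `queued` flag reporting whether the stream still has a frame to send.
#[derive(Debug, Clone, PartialEq)]
enum Cause {
    EndStream,  // closed locally by enqueuing an EOS frame
    Proto(u32), // reset, carrying the RST_STREAM reason code
}

#[derive(Debug, Clone, PartialEq)]
enum State {
    HalfClosedRemote,
    Closed(Cause),
}

impl State {
    fn recv_reset(&mut self, reason: u32, queued: bool) {
        match self {
            // The EOS frame is still queued: transition so that the
            // prioritizer later drains the queue like any other reset
            // stream instead of hitting the `expect` and panicking.
            State::Closed(Cause::EndStream) if queued => {
                *self = State::Closed(Cause::Proto(reason));
            }
            // Fully closed with nothing queued: ignore the reset.
            State::Closed(_) => {}
            // Any open state: a peer reset closes the stream.
            _ => *self = State::Closed(Cause::Proto(reason)),
        }
    }
}

fn main() {
    // The panic scenario: EOS enqueued, then RST_STREAM(CANCEL = 0x8).
    let mut s = State::Closed(Cause::EndStream);
    s.recv_reset(8, true);
    assert_eq!(s, State::Closed(Cause::Proto(8)));

    // No queued frame: the stream is actually closed, reset ignored.
    let mut t = State::Closed(Cause::EndStream);
    t.recv_reset(8, false);
    assert_eq!(t, State::Closed(Cause::EndStream));

    // An open stream is reset as usual.
    let mut u = State::HalfClosedRemote;
    u.recv_reset(8, false);
    assert_eq!(u, State::Closed(Cause::Proto(8)));
}
```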