Skip to content

Conversation

@imsodin
Copy link
Member

@imsodin imsodin commented Nov 23, 2020

Doing some testing for untrusted feature I noticed that errors processing cluster-config didn't result in the connection being dropped on the remote device. We are currently just closing the underlying connection in that case, not sending a BEP close-message, thus the connection first needs to time out on the remote to be noticed as closed.

@calmh
Copy link
Member

calmh commented Nov 23, 2020

Change is probably fine, but reasoning behind is suspect. The other side is waiting in a read and closing the connection should make that read return immediately, no timeout required.

@imsodin
Copy link
Member Author

imsodin commented Nov 23, 2020

The errors occur after reading the messages from the wire, i.e. the other side isn't waiting anymore thus doesn't see the closed connection. In the specific case I was looking at the error occurred on sending an index and the sender didn't notice the connection closing for quite a while.

@calmh
Copy link
Member

calmh commented Nov 24, 2020

I don't understand what you mean "after reading messages from the wire". Messages are read in readerLoop -- any read error there results in the connection being closed. If the underlying (TCP) connection is closed on the other side, read should return EOF immediately here (because we get a FIN or RST packet), resulting in closed connection on this side too.

@imsodin
Copy link
Member Author

imsodin commented Nov 24, 2020

Ah yes, now I see: I wrote "read" in the PR title. The error in question here occurs on "dispatch". The reader loop reads the message, and if an error occurs there it will happen just as you say, but it then passes the message on to the dispatcher loop. And if an error occurs there, we close the connection, but whoever sent us the offending message is already finished sending it at that point.

@calmh
Copy link
Member

calmh commented Nov 24, 2020

Right, but they should also be waiting for the next message from us in their readerLoop, and when we close the connection they should see it there?

@imsodin
Copy link
Member Author

imsodin commented Nov 24, 2020

Right, and there's the actual (or an additional) problem: The protocol connection doesn't know about the underlying connection, i.e. it doesn't/can't close it. Thus an error occurring in the protocol layer doesn't close the actual connection. That could be fixed by calling Close in model.Closed, but that's ugly, as it means closing it twice in most cases. That's a noop for the protocol connection, but according to io.Closer unspecified behaviour in general, so maybe not a good thing to do on the underlying connection. I'll have to think on how to fix this cleanly.

I believe the changed proposed here is valuable regardless, if only to generate a more meaningful error message on the remote than just "connection closed" (or similar).

@calmh
Copy link
Member

calmh commented Nov 24, 2020

Oh! So the actual TCP connection isn't closed? That's real ugly and should/must indeed be fixed, and breaks my mental model of what was going on. I'll look at this in an entirely new light then, and indeed sending the Close message isn't bad. Perhaps we should look at doing that consistently.

@imsodin
Copy link
Member Author

imsodin commented Nov 24, 2020

It is when the model closes the connection, but not when the protocol closes it (which only happens on errors).
I think I made an attempt a long time ago, where instead of creating a composite connection from the protocol and underlying connection, the underlying connection is nested inside protocol (abstracted of course), thus the closing can happen there. However it either created problems of its own or became just too large a change, in any case I didn't pull through (obviously). Conceptually I still think it's the right thing to do: protocol sits above tls and thus should control it (and model controls protocol).

@calmh calmh merged commit bbb22c8 into syncthing:main Nov 27, 2020
@imsodin imsodin deleted the protocol/sendReadErr branch November 27, 2020 10:34
@calmh calmh added this to the v1.12.1 milestone Nov 27, 2020
@st-review st-review added the frozen-due-to-age Issues closed and untouched for a long time, together with being locked for discussion label Nov 28, 2021
@syncthing syncthing locked and limited conversation to collaborators Nov 28, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

frozen-due-to-age Issues closed and untouched for a long time, together with being locked for discussion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants