Skip to content

Conversation

@Mellvik
Copy link
Owner

@Mellvik Mellvik commented May 27, 2024

Fix for a minor problem in ktcp (tcp_output.c) which caused lost SYN packets to hang/timeout a connection attempt.
A SYN packet has sequence # (relative) zero, which is also the initialized value of the variable containing the last ACKed value, so tcp_retrans_expire() would immediately kick the SYN out of the retrans list.

SYN is to be treated like FIN as far as ACKs go, expecting an ACK for the starting sequence number +1. Which means the fix is very simple - adding TF_SYN to the statement handling FINs:

if (n->tcphdr[0].flags & (TF_SYN | TF_FIN)) datalen++;

@ghaerr
Copy link

ghaerr commented May 28, 2024

Hi @Mellvik,

Interesting problem you've solved here. I've been away from the ktcp codebase for a bit, can you explain more exactly what is happening? This SYNs issue is only seen with a new connection, right? Is it that the SYN packets are being placed in the retransmit list, but then they are being dropped because of a happenstance match of the ACK and SYN starting sequence numbers matching when they should not?

Without diving back into this at the moment, I'm not quite realizing the issue of why the FIN is treated specially by the retransmit code, and why a SYN should also be treated the same. I am guessing its because a special timeout approach is needed for ACK, rather than the standard ACK process during data transfer.

Another question is whether the SYN/connect starting sequence number should be something other than 0 to solve this, or whether that would not in fact make a difference. (I recall that it was contemplated to set the starting sequence number to a random number, but did not in order to keep debugging simpler some time ago).

Thank you!

@Mellvik
Copy link
Owner Author

Mellvik commented May 28, 2024

Howdy @ghaerr -
And yes, id does indeed always take a while to reacquaint with ‘old’ code (aka memory refresh).

Actually this is very simple in that it’s about doing what the spec says (which unsurprisingly also makes sense). Since SYN and FIN are ‘empty’ packets, the sender is to expect an ACK with a +1 sequence number.

This SYNs issue is only seen with a new connection, right? Is it that the SYN packets are being placed in the retransmit list, but then they are being dropped because of a happenstance match of the ACK and SYN starting sequence numbers matching when they should not?

This is spot on. tcp_retrans_expire finds a sequence number of zero which unsurprisingly is what the last-ack variable is set to, and the packet is ejected from the retransmit queue.

The SYN being a minimal packet combined with an incorrectly configured NIC caused outgoing telnet to hang and I got curious ...

-M

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants