Skip to content

gnrc_tcp: syn_rcvd pkt loss fix#10946

Merged
PeterKietzmann merged 1 commit intoRIOT-OS:masterfrom
brummer-simon:gnrc_tcp-fix_syn_rcvd_packetloss_deadlock
Feb 12, 2019
Merged

gnrc_tcp: syn_rcvd pkt loss fix#10946
PeterKietzmann merged 1 commit intoRIOT-OS:masterfrom
brummer-simon:gnrc_tcp-fix_syn_rcvd_packetloss_deadlock

Conversation

@brummer-simon
Copy link
Copy Markdown
Member

Contribution description

This PR fixes #10945. Peter found a bug in gnrc_tcp() connection establishment procedure.

Problem description

The Problem can be provoked by starting a RIOT Node with TCP in passive mode by calling gnrc_tcp_open_passive(). The function does not return until a connection was successfully established
or an error occurred. As soon as the RIOT Node received a SYN Packet, a SYN-ACK is added to the retransmission mechanism. By prohibiting the peer TCP to not send any response to the SYN-ACK,
gnrc_tcp() retransmits SYN+ACK infinitely and can never accept a new connection.

From my point of view this is a very critical issue because either packet-loss or a manipulated Handshake can deadlock a RIOT Node. As a side note: This Problem is not mentioned in the TCP RFC
so there is no definite strategy to address this.

Fix description

TCP uses by default a two minute timeout to evaluate if a connection is alive. If the retransmission mechanism is longer that two minutes not successful in transmitting packets a connection is viewed as lost.

This fix starts a timer as soon as the SYN Packet is received from the peer. This Timer is stopped as soon as the connection is established. If the timer times out after two minutes the retransmission queue is cleared and the TCB reverts back to the LISTEN state waiting for the next incoming SYN.

@PeterKietzmann PeterKietzmann added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: network Area: Networking labels Feb 5, 2019
@PeterKietzmann
Copy link
Copy Markdown
Member

I can confirm that this PR fixes the second issue which was reported in #10945. After two minutes, the
server is able to establish a new connection again. It would be good if

@smlng
Copy link
Copy Markdown
Member

smlng commented Feb 5, 2019

if the code fixes the issue, I would say go.

However, I wonder why this was a problem anyway. I would've assumed that the timeout is always running in the background and is only reset if certain message/events occurred, thus a timeout would be raised eventually no matter what.

But that would need some further thinking and code rework - I guess.

@brummer-simon
Copy link
Copy Markdown
Member Author

brummer-simon commented Feb 5, 2019

Determining if a connection is timedout is, by the suggestions in the TCP RFC, not part of the retransmission mechanism.

To determine if a connection has timedout is either done by the RST Flag processing (as reply to retranmissions) or by a timeout that basically has the semantics: Abort the current operation because it takes to long.

This timeout is used by every gnrc_tcp Function, except the passive opening because it doesn't make sence there. For a passive open, the user wants to wait until a connection was established.

In this case there was no such timeout. That lead to infinite SYN+ACK retransmissions and
due to missing ACKs never reverted back into a State where a new connection could be initiated again.

If you are interested on the details, we can talk about it on the next Hack'n'Ack.

@smlng
Copy link
Copy Markdown
Member

smlng commented Feb 5, 2019

as said: if this fixes an issue, let's go for it. I trigger CI ...

@smlng smlng added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 5, 2019
Copy link
Copy Markdown
Member

@PeterKietzmann PeterKietzmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My approval holds. Two things not to forget before merging: (i) should we wait for @sabyse to test missing SYN-ACK ACK deadlock and (ii) @aabadie this would be a backport candidate right :/?

@PeterKietzmann
Copy link
Copy Markdown
Member

I've tested the SYN-ACK ACK deadlock which is fixed with this PR. @brummer-simon please provice a backport to the release branch. ACK and go

@PeterKietzmann PeterKietzmann merged commit 892a373 into RIOT-OS:master Feb 12, 2019
@aabadie aabadie added this to the Release 2019.01 milestone Feb 12, 2019
@PeterKietzmann PeterKietzmann added the Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch label Feb 12, 2019
@brummer-simon brummer-simon deleted the gnrc_tcp-fix_syn_rcvd_packetloss_deadlock branch February 26, 2019 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

net/gnrc_tcp: timeout of connection establishment

4 participants