Member

@CapnBry CapnBry commented Dec 9, 2025

Extend the timeout of the socket MSP connection from 2s to 10s. This prevents disconnects when flight controller Configurator apps leave the link idle between requests.

Fixes #3410? This can also be backported to 3.x.x-maintenance with just a 2-byte change in TCPSOCKET.

Details

In the 3.5.5 socket MSP system, the code would just leak the socket pointer if the connection was idle for more than 2s. If new data came in, the pointer would be reattached to the pump process. This meant there was effectively no timeout on incoming data, as the connection would just sort of fix itself on the next packet.

When I refactored the code and fixed the memory leaks, this made some Configurators unreliable. In particular, the test case of Rotorflight 4.5.1 with Configurator 2.2.1 has pauses you can drive a truck through. Any time the Configurator stopped loading midway through a tab, we'd drop the connection and force it closed. You can see these happening in Wireshark:

[image: Wireshark capture of the MSP socket traffic]

The first one there was just 90ms of idle time away from us dropping the connection while we (x.x.x.151) waited on the next command request from the Configurator (x.x.x.145). The one further down was over a second waiting to get the MSP data from the FC to reply with.

This RF Configurator shows tab load times all over the place, and I see a lot of redundant data being requested, so it is likely this could be improved on the Configurator side to increase performance. The upcoming Betaflight PWA Android app loads things much more consistently and did not need any timeout extension. Here are the RF tab load times before this PR:

| Tab | 3.5.5 | 3.6.2 | 4.0-RC1 |
|---|---|---|---|
| Status | 4.18s | 2.01s | 2.03s |
| Setup | 4.18s | 7.46s | 1.01s |
| Config | 3.45s | 5.53s | 7.72s |
| Receiver | 6.91s | 4.13s (disconnected once) | 14.61s (disconnected 8x, increased timeout) |
| Failsafe | 9.19s | 9.23s | 9.15s |
| Power | 4.27s | 2.57s | 2.13s |
| Motors | 7.93s | 6.06s | 3.43s |

Quick Fix

I just upped the timeout from 2s to 10s and now cannot reproduce the issue. The 2s was chosen because it matched the old leaky code, but it seems like 2s might also be the Rotorflight Configurator's internal timeout or something, because I saw quite a few ~2s pauses during my testing. We can even make this longer, but because we don't maintain a list of currently active sockets, I'd be wary of setting it too long and having multiple connections get their data mixed somehow, as we only have one pipe to the flight controller. We could also force a connection closed if a new one comes in, but let's go with this for now.
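For readers unfamiliar with this kind of idle timeout, here is a minimal sketch of the idea, not the actual TCPSOCKET code. The names `IdleTimer` and `IDLE_TIMEOUT_MS` are hypothetical, and it assumes that any read or write on the socket counts as activity and that time comes from a 32-bit millisecond counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is an illustration, not ExpressLRS code.
constexpr uint32_t IDLE_TIMEOUT_MS = 10000; // 10s, raised from 2s in this PR

class IdleTimer
{
public:
    IdleTimer(uint32_t timeoutMs, uint32_t nowMs)
        : _timeoutMs(timeoutMs), _lastActivityMs(nowMs)
    {
        assert(timeoutMs > 0);
    }

    // Assumption: called whenever data is read from or written to the socket.
    void touch(uint32_t nowMs) { _lastActivityMs = nowMs; }

    // Unsigned subtraction keeps the comparison correct even when the
    // millisecond counter wraps around.
    bool expired(uint32_t nowMs) const
    {
        return (nowMs - _lastActivityMs) >= _timeoutMs;
    }

private:
    uint32_t _timeoutMs;
    uint32_t _lastActivityMs;
};
```

In a sketch like this, the connection loop would call `touch()` on each read or write and close the socket once `expired()` returns true. With a 2000 ms limit, the 1.91s gap from the capture (90ms short of the limit) would still pass, while the ~2s Configurator pauses would not; a 10000 ms limit leaves room for both.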

Debug

I also restored some debug messages from before the 4.0 refactor and cleaned up the existing ones to be shorter and more consistent.

TCP(3fcc2500) read 6
TCP(CRSF) msg 37
TCP(3fcc2500) write 36
TCP(3fcc2500) read 6
TCP(CRSF) msg 37
TCP(3fcc2500) write 36
TCP(3fcc2500) read 6
TCP(CRSF) msg 37
TCP(3fcc2500) write 36
TCP(3fcc2500) read 6
TCP(3fcc2500) read 6
TCP(CRSF) msg 62
TCP(CRSF) msg 62
TCP(CRSF) msg 39
TCP(3fcc2500) write 152
TCP(3fcc2500) read 6
TCP(3fcc2500) read 7
TCP(CRSF) msg 62
TCP(CRSF) msg 62
TCP(CRSF) msg 39
TCP(3fcc2500) write 152
TCP(3fcc2500) read 7
TCP(CRSF) msg 7
TCP(3fcc2500) write 6
TCP(3fcc2500) disconnected

Member Author

CapnBry commented Dec 12, 2025

I was keeping this open hoping the reporter would test and confirm, but I'll just move ahead on the 3.x branch change.

@CapnBry CapnBry closed this Dec 12, 2025
@CapnBry CapnBry reopened this Dec 12, 2025
@CapnBry CapnBry merged commit c1d32db into ExpressLRS:master Dec 12, 2025
28 checks passed

Development

Successfully merging this pull request may close these issues.

FC TCP connection keep breaking up through ELRS WiFi

3 participants