Removed sock fd assert on IpSocket#3973
Conversation
|
I am not sure this is the fix we want. The intent of these assertions was to ensure that the file descriptor is not an invalid file descriptor in both Look at the receive code: fprime/Drv/Ip/SocketComponentHelper.cpp Lines 163 to 172 in e7840de Here, we clearly check to avoid a case where the assert would trip unnecessarily. The send code is a bit different. fprime/Drv/Ip/SocketComponentHelper.cpp Lines 112 to 120 in e7840de Here it attempts to reopen a failed descriptor, checks status, and then persists another copy. This leads to a race condition because something else (e.g. This code should instead:
@vincewoo does this make sense? |
|
We should also put back in the assert in the |
|
@jsilveira1409 do you want to attempt the above change, or shall I? |
|
@LeStarch hey thanks for the feedback! I'll give it a shot in the coming days and propose a solution! Thanks |
|
It turns out this is a whole lot more complicated that first assumed. The crash comes from the fact that in the TcpServer re-connections are potentially a long process (wait listening for some unknown client). We need this fix for the release today, so I am going to go ahead and make it as it affects multiple files. For the time being, I will merge this PR as it will partially address the problem, and will give you credit. |
|
We looked at this yet deeper, and decided this was the correct fix for now. A better fix would be to push re-connection requests onto their own thread, and get rid of this weird inter-threading logic. However, for now this code results in an error when threads are reconnecting and haven't completed. This is the stop-gap solution. |
|
I haven't had time to look more into it yet but will study it deeper when I do! Thanks for the merge! |
|Has Unit Tests (y/n)| |
|Documentation Included (y/n)| |
Change Description
Removed unnecessary assert for the
socketDescription.fdwhen sending data out.Rationale
When the FSW streams out data to a e.g. TCP client, and the client disconnects abruptly, the assert get's triggered, crashing the FSW instance, instead of letting it fail to send in the
sendProtocol()function further below in theIpSocket::send()function.Testing/Review Recommendations
Future Work