Skip to content

use 2x buffer when sysctl fails with ENOMEM#591

Merged
wfurt merged 2 commits intodotnet:masterfrom
wfurt:sysctl_42634
Dec 6, 2019
Merged

use 2x buffer when sysctl fails with ENOMEM#591
wfurt merged 2 commits intodotnet:masterfrom
wfurt:sysctl_42634

Conversation

@wfurt
Copy link
Member

@wfurt wfurt commented Dec 5, 2019

https://github.com/dotnet/corefx/issues/42634 is caused by endless spin in System.Native when we fail to process routing table on first try. It was reported only on Catalina but that may be just timing difference.

It seems like existing code assumes that when sysctl() fails with ENOMEM, oldlenp would have new needed size. But that is not happening. According to 10.14 man page:

The information is copied into the buffer specified by oldp.  The size of the buffer is given by the location specified by oldlenp before the call, and that location gives the amount of data copied after
 a successful call and after a call that returns with the error code ENOMEM.  If the amount of data available is greater than the size of the buffer supplied, the call supplies as much data as fits in the
 buffer provided and returns with the error code ENOMEM.  If the old value is not desired, oldp and oldlenp should be set to NULL.

 The size of the available data can be determined by calling sysctl() with the NULL argument for oldp.  The size of the available data will be returned in the location pointed to by oldlenp.  For some
 operations, the amount of space may change often.  For these operations, the system attempts to round up so that the returned size is large enough for a call to return the data shortly thereafter.

I verified that if sysctl fails with ENOMEM and provided buffer, oldp has still old value e.g. amount written. I also verified that if we do another sysctl() call with buffer set to NULL, it provides back updated estimate. That however seems expensive as we need context switch to kernel and kernel will need to process routing table to get new estimate. Instead I decided to use same strategy as we use to dump tcp socket list and simply increase buffer 2x. That may temporarily need more memory but should be much cheeper on CPU processing.

cc: @ryanbrandenburg

fixes https://github.com/dotnet/corefx/issues/42634

@wfurt wfurt requested a review from a team December 5, 2019 21:32
@wfurt wfurt self-assigned this Dec 5, 2019
@danmoseley
Copy link
Member

3.1 servicing candidate maybe?

@wfurt
Copy link
Member Author

wfurt commented Dec 6, 2019

yes, I was thinking about it as the failure mode is rather unpleasant and there is no workaround.

}

byteCount = tmpEstimatedSize;
buffer = realloc(buffer, byteCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment (similar to your PR description) as to why we're not just using byteCount as returned by sysctl?

@ryanbrandenburg
Copy link
Contributor

@wfurt it's unclear to me why I was CC'd, does this relate to some work I've done that I'm forgetting about?

@wfurt
Copy link
Member Author

wfurt commented Dec 6, 2019

sorry fir the noise @ryanbrandenburg. I cc you because it is in same area as https://github.com/dotnet/corefx/issues/41740 you opened while back.

@wfurt wfurt merged commit c79b70e into dotnet:master Dec 6, 2019
@wfurt wfurt deleted the sysctl_42634 branch December 6, 2019 22:18
@zekuny
Copy link

zekuny commented Dec 9, 2019

Thanks for the quick fix. Is this in fix in the latest Master (5.0.x Runtime)? Can I download the installer (5.0.100-alpha1-015914) to test the fix now? Do you have an ETA for the 3.1 servicing pack?

@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants