-
Notifications
You must be signed in to change notification settings - Fork 166
MacOS stress test #2408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MacOS stress test #2408
Conversation
67ab4d2 to
bc8c77d
Compare
|
This is quite well timed, thanks! We just received a new mac mini that will be used for EOS/XRootD development and CI. I will try this there as well. |
6068f0f to
284311e
Compare
284311e to
5a87b83
Compare
|
This pull requests has warnings in CDash, and a test failure on Alpine Linux. @bbockelm Could you please investigate? Thanks. |
|
Happy to do that! I was a bit confused by the Mac mini comment and whether you were going to pick things up. Anyhow, now that |
|
Yes, I do intend to test on the new Mac mini, but the failures in CI are on Alpine and Ubuntu. Submitted changes should pass the CI before we do further testing on them. |
5a87b83 to
1fff941
Compare
|
@amadio - tinkered with this a bit more yesterday. Some observations:
|
1fff941 to
59bb5e7
Compare
|
Alpine uses MUSL instead of glibc. The failure is likely related to that. |
|
Ah -- actually, I think it was me misreading the test logs from the prior run. On closer inspection, there's something basic missing on my added new unit test which seems quite fixable. |
|
@amadio - can you re-run the Mac OS X job a handful of times? I lack the permission to re-run tests in GitHub. I want to make sure that the tests are reliably passing and this isn't a fluke where I got lucky. The larger refactor -- rewriting the logic so only one thread writes to the table at once (which would obviate needing making a copy of the table) -- was unsuccessful, unfortunately. |
450c8ae to
d52d48c
Compare
If we want to programmatically access the HTTP port (instead of hardcoding it in the configuration), then it will reduce code duplication to simply use the same port.
This exposes a small gtest-driven stress test that will probe ever-increasing number of concurrent uploads, looking for a point where the server may stop responding.
Previously, MacOS failed the stress test because the poller was not thread safe. Various threads manipulated the poll table and read from it without a lock consistently held. This now ensures the PollMutex lock is held whenever we read or write from the PollTab.
d52d48c to
1cf9f55
Compare
|
@amadio - ok, tests have passed several times in a row so I think I'm happy. |
This implements a simple HTTP-based stress test that, on my laptop, fails 100% of the time. All it does is try an increasing number of concurrent HTTP uploads.
By "stress tests", it tends to fail at 10 concurrent transfers provided this line is set:
If, instead, that line is:
then I've never gotten the stress test to fail, even with more than 100 concurrent transfers. Note --
CURLOPT_FORBID_REUSE, when set, causes a new TCP socket to be created for each transfer. By turning it off, there's multiple requests on a single TCP socket.I believe this is due to the fact that HTTP is such a simple protocol the poller only fires once per socket unless there is socket reuse -- at which point we trigger the race condition mentioned in #2212.
Hopefully the GitHub runner will trigger the failure just as reliably as my laptop!