test3026: try to reduce runtime within legacy mingw build environments#9412
test3026: try to reduce runtime within legacy mingw build environments#9412mback2k wants to merge 3 commits intocurl:masterfrom
Conversation
The test should not even take 1 minute, why does it take 15? |
|
If a 100 threads take 20 minutes, won't 42 threads then still take a rather excessive 8.4 minutes? Maybe the test should rather be changed to have a max time for which it can execute? |
|
@jay not sure, on my machine it takes less than a minute. I think the CI VMs have trouble with great parallelism due to limited number of cores. @bagder I wasn't sure if the issue scaled in a linear fashion. I thought let's try 42 and see what happens. As stated above, I think the main problem is the thread pressure on our CI hosts. Locally it works fast and fine as expected. |
|
As I guessed it does not seem to be a linear issue. Before / without this PR: https://ci.appveyor.com/project/curlorg/curl/builds/44661247/job/fq5hqnkj4bjr90p5#L4987 After / with this PR: https://ci.appveyor.com/project/curlorg/curl/builds/44653968/job/hk9etjq6tuwo0ca7#L4988 Overall this seems to cut quite some time from the AppVeyor CI runs which run tests. Before vs. After cmake builds (top 4 MSVC, bottom 4 MSYS/MinGW): Before vs. After autotools builds (3 MSYS/MinGW and 1 CygWin): Seems like this is primarily affecting the 4 cmake-based MSYS/MinGW in the middle. |
|
To sum up, only builds using the following compilers on AppVeyor are affected by the slowness after all:
So I guess it is their threading model which is causing the trouble. @MarcelRaad any suggestions? Edit: confirmed also by Azure CI running classic MinGW and similar variants:
|
|
This is still set a draft but wants a review. Isn't that contradictory? |
74c6442 to
c583bb1
Compare
|
I will need to reproduce this locally and figure out a better way to shorten the runtime for legacy MinGW builds. |
|
Update: I was able to reproduce this locally with classic/legacy MinGW. Next up is identifying the root cause / finding workaround. |
|
Root cause: the slow part is loading of |
|
Should we just close this? |
I pushed a change to your branch that calls loadlibrary on secur32 and iphlpapi in the main thread. This way each time curl_global_init/cleanup is called from the worker threads and it calls loadlibrary/freelibrary on those two libraries it should just increase/decrease their refcount. This may or may not work depending on if initialization of the library is taking a long time, and not the actual loading. |
|
@jay thanks, it is definitely the loading part according to my tests. But even reversing the binaries and running them outside of msys-shells (v1 or v2) did not help me identify the difference between the various fast (mingw-w64 from msys2) and slow (any mingw on msys1) build variants. |
|
@jay to resolve the merge conflict which is blocking Azure Pipelines and AppVeyor you may delete my remaining commit. |
e32972f to
b1a0119
Compare
- Load Windows system libraries secur32 and iphlpapi beforehand, so that libcurl's repeated global init/cleanup only increases/decreases the library's refcount rather than causing it to load/unload. Assisted-by: Marc Hoersken Closes #xxxx
b1a0119 to
5755e3d
Compare
|
job Does that mean this is fixed or should I be checking a different job? |
the code is already in a WIN32 section
|
Thanks a lot @jay. Yeah, all of the msys v1 jobs experienced the slowness issue, so if one is fixed, all of them should be. Will you take it completely from here? |
Yup np. I'm going to wait on the CI for this last fixup to complete then I will land it. |
|
Sounds good, thanks again. |
|
CI looks good to me. I think this is ready for merge. |
curl no longer supports old mingw. Follow-up to 3802910 curl#11625 Follow-up to 856b133 curl#9412
curl no longer supports old/legacy/classic mingw. This mitigation was addressing slow perf seen in CI with old mingw. The slow perf is not seen in current CI with supported compilers. Remove the duplicate DLL load function from libtest. It's no longer used after this patch. Current CI run times for test3026 on GHA/windows: ``` test 3026...[curl_global_init thread-safety] # mingw, CM clang-x86_64 gnutls libssh -------e--- OK (1715 out of 1738, remaining: 00:02, took 0.196s, duration: 02:55) # dl-mingw, CM 9.5.0-x86_64 schannel -------e--- OK (1554 out of 1577, remaining: 00:02, took 0.217s, duration: 02:29) # msvc, CM x64-windows schannel +examples -------e--- OK (1578 out of 1601, remaining: 00:02, took 0.205s, duration: 02:50) ``` Follow-up to 3802910 #11625 Follow-up to 856b133 #9412 Ref: #17413 Closes #17414




Right now the test seems to take between 15 and 20 minutes
on our Windows CI environments. Which is quite extreme for
just 100 threads, but anyway 42 looks like a better number.