Fix parallel tests port in use#9661
Conversation
andreslucena
left a comment
There was a problem hiding this comment.
It's a tough one to test as it happens randomly but if you really want to test it out, run some system specs in parallel for about 10 000 times and you should hit this edge case with a high probability. As stated, on average it should happen about once every 2000 runs when the tests are run in parallel.
🤣
Another way to test it would be to reserve ports 5000-6995 with some other script and then start the tests in parallel with a processor count of 4 (PARALLEL_TEST_PROCESSORS=4 bundle exec rake parallel:create parallel:migrate parallel:spec[^spec/system]). This way you should expect to see the failure with the original code just after few runs.
If anyone else want to try this out, mind that the current instructions are a bit different. They are found at our docs.
My quick and dirty script to try this out:
require "socket"
# ignore port for postgres and redis
ignore_ports = [5432, 5433, 6379]
def listen_server(server)
client = server.accept
client.puts "HTTP/1.1 200 Success"
client.puts ""
client.puts("Listening")
client.puts(Time.now.ctime)
client.close
listen_server(server)
end
(5000..6995).each do |port|
next if ignore_ports.include?(port)
puts "Listenting on port #{port}"
Thread.new do
server = TCPServer.open(port)
listen_server(server)
end
end
puts "To terminate use CTRL+c"
sleep 60 * 60 * 1So, tried it locally and works as advertised. Thanks! 👍🏽
* develop: (24 commits) Add develop index to the documentation (#9666) Fix initiatives components (#9633) Fix conference speaker avatars (#9643) Update `rokroskar/workflow-run-cleanup-action` GitHub action to v0.3.3 (#9750) Fix character counter for the WYSIWYG editor (#9680) Fix posting comments before the initial load has run (#9614) Fix parallel tests port in use (#9661) Split parallel test coverage reports into their own folders (#9686) Improve admin panel user experience regarding title links and order of actions (#9496) Fix title and description too long in initiatives spec sometimes (#9648) Fix API GraphiQL system spec with newer ChromeDriver (#9642) Add missing character on code block (#9798) Fix hidden error messages on the registration form (#9625) Add documentation about configuring ActiveStorage / dynamic file uploads (#9777) Add documentation section about customizing cells (#9622) Fix hashtags not recognized at the beginning of the string (#9616) Fix version pages showing a HTTP 500 error when the version does not exist (#9615) Fix multitenant organizations stats cache (#9605) Prevent the account edit route through Devise (#9611) Fix iframe disabling producing invalid HTML (#9685) ...
🎩 What? Why?
When the tests are run in parallel, they can occasionally fail because multiple servers are started and the port might be already in use that is assigned to Capybara.
This conflict can happen randomly when two consecutive calls to
rand(5000..6999)return exactly the same number which I understand can happen once every 1999 runs.This PR provides a fix for this edge case which should allow the parallel tests to always be assigned a port that is available.
When this happens, the following kind of errors appear in the CI logs:
Testing
It's a tough one to test as it happens randomly but if you really want to test it out, run some system specs in parallel for about 10 000 times and you should hit this edge case with a high probability. As stated, on average it should happen about once every 2000 runs when the tests are run in parallel.
Another way to test it would be to reserve ports 5000-6995 with some other script and then start the tests in parallel with a processor count of 4 (
PARALLEL_TEST_PROCESSORS=4 bundle exec rake parallel:create parallel:migrate parallel:spec[^spec/system]). This way you should expect to see the failure with the original code just after few runs.Note that if you try the same with the new code, it can still fail due to timing out when there are not enough ports available. But in the CI this should never happen as these ports are expected to be available.
When the tests are not run in parallel, this never happens (unless you happen to have some other service that has reserved the ports on the machine).