Ensure test threads share a DB connection#28083
Conversation
This ensures multiple threads inside a transactional test to see consistent database state. When a system test starts Puma spins up one thread and Capybara spins up another thread. Because of this when tests are run the database cannot see what was inserted into the database on teardown. This is because there are two threads using two different connections. This change uses the statement cache to lock the threads to using a single connection ID instead of each not being able to see each other. This code only runs in the fixture setup and teardown so it does not affect real production databases. When a transaction is opened we set `lock_thread` to `Thread.current` so we can keep track of which connection the thread is using. When we rollback the transaction we unlock the thread and then there will be no left-over data in the database because the transaction will roll back the correct connections. [ Eileen M. Uchitelle, Matthew Draper ]
|
Sorry for dropping in with little context, but this seems like it needs to do some additional work to actually ensure the connections aren't used concurrently. While usually the server is only executing if the test is blocked on a request, that's not always the case. Particularly if Javascript is involved, it seems like we could very easily end up with a race condition between the threads. The underlying C structures that the (But also I've not been following super closely since I'm away on leave so feel free to just tell me I'm wrong) |
|
@sgrif https://github.com/rails/rails/pull/28083/files#diff-c226a4680f86689c3c170d4bc5911e96R610 Could do with some rearrangement to make it clearer what's going on, but this was the easiest option for a simple drop-in solution for [almost] all actual-adapter interaction in one go. |
|
(higher level concurrency issues, like one thread working in a transaction, or some other long term statefulness on the connection, are consciously out of scope) |
|
Oh, I get it now. I parsed that line wrong mentally before. Yeah, seems fine then. I do think it would be a good idea to add some more explicit locking throughout the methods higher up in the future. We should also document that the result returned needs to either be thread safe or eagerly buffer since a lazy cursor that isn't thread safe would cause issues. |
…ldren`)
If we run only following tests:
- test/cases/scoping/default_scoping_test.rb
- test/cases/associations_test.rb
```
$ cat Rakefile.test
require "rake/testtask"
ENV["ARCONN"] = "postgresql"
Rake::TestTask.new do |t|
t.libs << "test"
t.test_files = %w(
test/cases/scoping/default_scoping_test.rb
test/cases/associations_test.rb
)
end
```
a test will fail:
```
$ bundle exec rake test -f Rakefile.test
/app/activesupport/lib/active_support/core_ext/enumerable.rb:20: warning: method redefined; discarding old sum
Using postgresql
Run options: --seed 11830
# Running:
.........................................................................................F................
Finished in 6.939055s, 15.2759 runs/s, 27.9577 assertions/s.
1) Failure:
AssociationProxyTest#test_save_on_parent_saves_children [/app/activerecord/test/cases/associations_test.rb:185]:
Expected: 1
Actual: 2
106 runs, 194 assertions, 1 failures, 0 errors, 0 skips
rake aborted!
Command failed with status (1)
/usr/local/bin/bundle:22:in `load'
/usr/local/bin/bundle:22:in `<main>'
Tasks: TOP => test
(See full trace by running task with --trace)
```
In rails#28083, change `self.use_transactional_tests` to `false`
but we forget to clean-up fixture.
However we don't have to disable transaction except a few tests.
In Rails 5.1 transactional tests share the same connection id between the webserver and test runner. This removes the need for special cleanup strategies. This speeds up the tests significantly, before: ``` Finished in 3 minutes 30.3 seconds (files took 5.46 seconds to load) ``` After: ``` Finished in 1 minute 41.61 seconds (files took 5.45 seconds to load) ``` rails/rails#28083
In Rails 5.1 transactional tests share the same connection id between the webserver and test runner. This removes the need for special cleanup strategies. This speeds up the tests significantly, before: ``` Finished in 3 minutes 30.3 seconds (files took 5.46 seconds to load) ``` After: ``` Finished in 1 minute 41.61 seconds (files took 5.45 seconds to load) ``` rails/rails#28083
|
I'd be willing to make a PR to backport this to earlier versions of Rails if there's an interest. |
|
@bf4 I consider this a feature, not a bug fix, so it won't be back ported. It changes expected behavior too much. Sometimes if a bug lives long enough it becomes a feature. That's the case for this change. |
In Rails 5.1 transactional tests share the same connection id between the webserver and test runner. This removes the need for special cleanup strategies. This speeds up the tests significantly, before: ``` Finished in 3 minutes 30.3 seconds (files took 5.46 seconds to load) ``` After: ``` Finished in 1 minute 41.61 seconds (files took 5.45 seconds to load) ``` rails/rails#28083
In Rails 5.1 transactional tests share the same connection id between the webserver and test runner. This removes the need for special cleanup strategies. This speeds up the tests significantly, before: ``` Finished in 3 minutes 30.3 seconds (files took 5.46 seconds to load) ``` After: ``` Finished in 1 minute 41.61 seconds (files took 5.45 seconds to load) ``` rails/rails#28083
In Rails 5.1 transactional tests share the same connection id between the webserver and test runner. This removes the need for special cleanup strategies. This speeds up the tests significantly, before: ``` Finished in 3 minutes 30.3 seconds (files took 5.46 seconds to load) ``` After: ``` Finished in 1 minute 41.61 seconds (files took 5.45 seconds to load) ``` rails/rails#28083
| @query_cache[sql][binds] = yield | ||
| end | ||
| result.dup | ||
| @lock.synchronize do |
There was a problem hiding this comment.
This code only runs in the fixture setup and teardown so it does not
affect real production databases.
@eileencodes @matthewd Is this lock and the lock in abstract_adapter necessary outside of the test environment?
There was a problem hiding this comment.
No.. my theory was that acquiring an uncontended lock wouldn't be noticeably slower than checking whether the lock was needed (given that we're about to perform IO anyway). I guess the fact you're asking suggests I was wrong?
There was a problem hiding this comment.
Yea I started noticing it in our flamegraphs but the overhead doesn't contribute significantly when I tried to benchmark it.
|
I'm using Rails 5.1/Devise/RSpec/Capybara, and the test thread does not appear to be sharing a connection when running a selenium browser. Reference the following test: This passes when |
|
@moveson please open a new issue with a way to reproduce it and demonstrates the failure you're seeing. |
|
@eileencodes This turned out to be a problem with puma running in cluster mode, which made the Capybara thread unable to see the database. The problem has been addressed in a Rails PR here and will hopefully see the light of day in Rails 5.1.5. |
|
Glad to see it helped someone else! |
|
It's a good fix. Also thanks to @twalpole for diagnosing the problem and pointing me to the solution. |
Before rails#34953, when using the `:async` Active Job queue adapter, jobs enqueued in `db/seeds.rb`, such as Active Storage analysis jobs, would cause a hang (see rails#34939). Therefore, rails#34953 changed all jobs enqueued in `db/seeds.rb` to use the `:inline` queue adapter instead. (This behavior was later limited to only take effect when the `:async` adapter was configured, see rails#35905.) However, inline jobs in `db/seeds.rb` cleared `CurrentAttributes` values (see rails#37526). Therefore, rails#37568 changed the `:inline` adapter to wrap each job in its own thread, for isolation. However, wrapping a job in its own thread affects which database connection it uses. Thus inline jobs can no longer execute within the calling thread's database transaction, including seeing any uncommitted changes. Additionally, if the calling thread is not wrapped with the executor, the inline job thread (which is wrapped with the executor) can deadlock on the load interlock. And when testing (with `connection_pool.lock_thread = true`), the inline job thread can deadlock on one of the locks added by rails#28083. Therefore, this commit reverts the solutions of rails#34953 and rails#37568, and instead wraps evaluation of `db/seeds.rb` with the executor. This eliminates the original hang from rails#34939, which was also due to running multiple threads and not wrapping all of them with the executor. And, because nested calls to `executor.wrap` are ignored, any inline jobs in `db/seeds.rb` will not clear `CurrentAttributes` values. Alternative fix for rails#34939. Reverts rails#34953. Reverts rails#35905. Partially reverts rails#35896. Alternative fix for rails#37526. Reverts rails#37568. Fixes rails#40552.
Before rails#34953, when using the `:async` Active Job queue adapter, jobs enqueued in `db/seeds.rb`, such as Active Storage analysis jobs, would cause a hang (see rails#34939). Therefore, rails#34953 changed all jobs enqueued in `db/seeds.rb` to use the `:inline` queue adapter instead. (This behavior was later limited to only take effect when the `:async` adapter was configured, see rails#35905.) However, inline jobs in `db/seeds.rb` cleared `CurrentAttributes` values (see rails#37526). Therefore, rails#37568 changed the `:inline` adapter to wrap each job in its own thread, for isolation. However, wrapping a job in its own thread affects which database connection it uses. Thus inline jobs can no longer execute within the calling thread's database transaction, including seeing any uncommitted changes. Additionally, if the calling thread is not wrapped with the executor, the inline job thread (which is wrapped with the executor) can deadlock on the load interlock. And when testing (with `connection_pool.lock_thread = true`), the inline job thread can deadlock on one of the locks added by rails#28083. Therefore, this commit reverts the solutions of rails#34953 and rails#37568, and instead wraps evaluation of `db/seeds.rb` with the executor. This eliminates the original hang from rails#34939, which was also due to running multiple threads and not wrapping all of them with the executor. And, because nested calls to `executor.wrap` are ignored, any inline jobs in `db/seeds.rb` will not clear `CurrentAttributes` values. Alternative fix for rails#34939. Reverts rails#34953. Reverts rails#35905. Partially reverts rails#35896. Alternative fix for rails#37526. Reverts rails#37568. Fixes rails#40552.
Before rails#34953, when using the `:async` Active Job queue adapter, jobs enqueued in `db/seeds.rb`, such as Active Storage analysis jobs, would cause a hang (see rails#34939). Therefore, rails#34953 changed all jobs enqueued in `db/seeds.rb` to use the `:inline` queue adapter instead. (This behavior was later limited to only take effect when the `:async` adapter was configured, see rails#35905.) However, inline jobs in `db/seeds.rb` cleared `CurrentAttributes` values (see rails#37526). Therefore, rails#37568 changed the `:inline` adapter to wrap each job in its own thread, for isolation. However, wrapping a job in its own thread affects which database connection it uses. Thus inline jobs can no longer execute within the calling thread's database transaction, including seeing any uncommitted changes. Additionally, if the calling thread is not wrapped with the executor, the inline job thread (which is wrapped with the executor) can deadlock on the load interlock. And when testing (with `connection_pool.lock_thread = true`), the inline job thread can deadlock on one of the locks added by rails#28083. Therefore, this commit reverts the solutions of rails#34953 and rails#37568, and instead wraps evaluation of `db/seeds.rb` with the executor. This eliminates the original hang from rails#34939, which was also due to running multiple threads and not wrapping all of them with the executor. And, because nested calls to `executor.wrap` are ignored, any inline jobs in `db/seeds.rb` will not clear `CurrentAttributes` values. Alternative fix for rails#34939. Reverts rails#34953. Reverts rails#35905. Partially reverts rails#35896. Alternative fix for rails#37526. Reverts rails#37568. Fixes rails#40552. (cherry picked from commit 648da12)
This ensures multiple threads inside a transactional test to see consistent
database state.
When a system test starts Puma spins up one thread and Capybara spins up
another thread. Because of this when tests are run the database cannot
see what was inserted into the database on teardown. This is because
there are two threads using two different connections.
This change uses the statement cache to lock the threads to using a
single connection ID instead of each not being able to see each other.
This code only runs in the fixture setup and teardown so it does not
affect real production databases.
When a transaction is opened we set
lock_threadtoThread.currentsowe can keep track of which connection the thread is using. When we
rollback the transaction we unlock the thread and then there will be no
left-over data in the database because the transaction will roll back
the correct connections.
[ Eileen M. Uchitelle, Matthew Draper ]
cc/ @matthewd