Refactor connection handler by arthurnn · Pull Request #24844 · rails/rails

arthurnn · 2016-05-04T05:55:10Z

### Summary

ConnectionHandler will not have any knowledge of AR models now, it will
only know about the specs.
Like that we can decouple the two, and allow the same model to use more
than one connection.
### Context

Historically, folks used to create abstract AR classes on the fly in
order to have multiple connections for the same model, and override the
connection methods.

With this, now we can override the `specificiation_id` method in the
model, to return a key, that will be used to find the connection_pool
from the handler.

@jeremy thoughts?
### How can users extend this?

database.yml

```
production:
  database: ...
readonly:
  database: ...
```

```
class User < ApplicationRecord
  self.connection_specification_name = "readonly"
end
```

or

```
class User < ApplicationRecord
  def self.connection_specification_name 
     @in_readonly_mode ? "readonly" : "primary"
  end
end
```

In this example `in_readonly_mode` could be an ivar that is set at runtime to change the connection at runtime.

Pretty much the gist of this is, models can now extend that method, and change the connection it will be used. This is possible because now the connection handler doesnt hash the connections by the class name anymore.
### Is this backwards compatible?

Yes, 
- The default `connection_specification_name` behaviour is to fallback to the parent method, if `connection_specification_name` not defined.
- When doing an establish_connection in a AR model: ie: `User.establish_connection()`, that will set the `connection_specification_name` to the model name.

arthurnn · 2016-05-04T05:58:55Z

activerecord/lib/active_record/connection_handling.rb

We need to do some gymnastics here to be backwards compatible.
However if we deprecate the establish_connection from the model, eventually we would be able to remove this fallback into the Class.name method.

The way I envision this, is we would establish the connection straight to the handler, as this connection is not attached to the model anymore.

Ignore this comment.

kaspth · 2016-05-04T21:11:25Z

activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb

Could spare an array allocation by calling each_value if Concurrent::Map supports it 😁

arthurnn · 2016-05-04T22:16:14Z

@tenderlove @matthewd @rafaelfranca would ❤️ your review here too.

matthewd · 2016-05-05T15:27:06Z

Doesn't this break the current inheritance behaviour?

arthurnn · 2016-05-05T19:46:31Z

Doesn't this break the current inheritance behaviour?

It should not. the specification_id method fallbacks to the parent if not set. Which brings the same inheritance behaviour as before. Also, we had tests for that, and they still work.

kaspth · 2016-05-05T20:02:44Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

Should we put id ahead of the pack here as well?

ConnectionHandler will not have any knowlodge of AR models now, it will only know about the specs. Like that we can decouple the two, and allow the same model to use more than one connection. Historically, folks used to create abstract AR classes on the fly in order to have multiple connections for the same model, and override the connection methods. With this, now we can override the `specificiation_id` method in the model, to return a key, that will be used to find the connection_pool from the handler.

arthurnn · 2016-05-05T20:36:54Z

@matthewd @kaspth thanks for bearing with me on this. I renamed the key to specification_name.

vipulnsward · 2016-05-06T16:05:38Z

activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb

+      def retrieve_connection_pool(spec_name)
+        owner_to_pool.fetch(spec_name) do
+          if ancestor_pool = pool_from_any_process_for(spec_name)
+            # A connection was established in an ancestor process that must have


After PR rails#24844 the documentation for `#retrieve_connection_pool` was out of date. This commit changes: - the reference from `@class_to_pool` to `@owner_to_pool`. - with newer Rubies, `#fetch` isn't significantly slower than `#[]`. Since Rails 5 requires Ruby >= 2.2.2, we can just use `#fetch` here.

See rails/rails#24844 for the changes on Rails.

Follow up to rails#20818. `retrieve_connection` is passed `spec_name` instead of `klass` since rails#24844.

GreenFomo · 2016-10-21T20:24:32Z

@arthurnn sorry for commenting on a closed issue, but looking through this thread + the changes in the code, I'm not sure I understand how I can make use of these changes.

Basically, my scenario is that I would like to be able to have multiple databases configured at runtime (e.g. new user signs up -> make the Rails app "aware" of a new database -> make a new Pool for that database), and I can switch on each request which database to use.

To my understanding, this PR would make the above use-case possible, including using a multi-threaded server like Puma - it seems though that it eludes me how I can actually define the "specifications" for each DB. Could you please give me some pointers in the right direction?

To summarize:

Having connection pools to multiple DBs - is it thread safe? Can I safely switch between DB specs using connection_specification_name per request in a multi-threaded server?
How do I specify the .. connection specifications? Is it possible to do so while the Rails app is running or does it need to restart? :)

Thanks a lot!

arthurnn · 2016-10-26T20:02:32Z

Hey @rzv01 I am glad you asked. I am planning to write a few blog posts on how people can use this, but I didn't have time yet. I did a talk that touch into this changes too, here is the link: https://youtu.be/-41V0qETxmE?list=PLo3sQWbYZbskTlzH7Ert-FucR7QmIuwE5

Having connection pools to multiple DBs - is it thread safe? Can I safely switch between DB specs using connection_specification_name per request in a multi-threaded server?

The idea is to not use the same connection pool for different database. Each database should have its own connection pool.

How do I specify the .. connection specifications? Is it possible to do so while the Rails app is running or does it need to restart? :)

You can create the connection pools passing a specification name, and in your model you would extend the method connection_specification_name and return the right name.

GreenFomo · 2016-10-27T08:07:30Z

Hey @arthurnn - thanks a lot for your response. I've watched the video you linked - very informative, thanks! and it got me closer to understanding how multiple DBs should work.

Now, in your talk you mentioned that the future direction for Rails 5.1 and onward would be to have all the DB configs in the database.yml file and to initialize all connections at app startup.

However, this means the config is now static. Would it be possible to make the app "aware" of a new DB config during runtime?

Not sure how this would work to be honest, maybe write to the database.yml file (say, an external process which knows when a new DB is needed) and send a signal to the Rails server process? which would in turn reload the database.yml, which contains the new DB config as well.

The use case I'm thinking of is:

one DB per customer
one/few Rails app instances (each one serving all/multiple clients)
Rails app connects to all/multiple DBs that is assigned to it

here comes the problem:
4. customer signs up, a DB is created for him/her.

How would the Rails app know about the new DB? :)

Ideally, without having to restart the app - I guess it wouldn't be the end of the world, especially with a phased restart of a multi-threaded server, but still, not that elegant, hence me asking if a new DB config can be "injected" in the app dynamically.

Btw, sorry if I missed something and the answer to my question above is obvious!

Thanks a lot, once again! 🍻

milep · 2016-10-27T09:02:49Z

I'm also interested in use case where there is one DB per user/customer.

mikecmpbll · 2016-10-27T09:34:23Z

the database connection details don't technically need to be in database.yml, you should be able to create connection pools yourself; the code could be something like this (untested):

def self.connection_specification_name
  if !connection_handler.retrieve_connection_pool(Thread.current['customer_db_name'])
    spec = database_spec.merge(name: Thread.current['customer_db_name'])
    connection_handler.establish_connection(spec)
  end

  Thread.current['customer_db_name']
end

only looked briefly though, and i might be missing something basic there :). i assume, as we're creating a pool per database we'll end up with pool_size * customers database connections. (traditional pg schema based switching or mysql use statement switching uses one pool)

edit: would probably be better do do the connection pool initialisation out of band though (per request?) as this method will be called a ton 😅

arthurnn · 2016-10-28T02:37:43Z

@mikecmpbll is right.
You can call connection_handler.establish_connection yourself and pass a spec. That spec needs to be a hash, which can come from anywhere, a file, db, http, etc.
So in your case, you would establish the connection, and indeed, probably better to do that during initialization, or out-of-band somehow.

GreenFomo · 2016-10-28T07:10:05Z

@mikecmpbll @arthurnn thanks a ton for your help, you guys are great :)

I think right now i got all the info i need. I'll give it a try sometime in the next few days and see how it goes.

milep · 2016-12-27T07:13:01Z

Is there a thread safe way to remove idle connection pools? With some testing it looks like I just can't use ConnectionHandler's remove_connection, because it removes pool from all threads?

njakobsen · 2018-02-05T22:02:35Z

@arthurnn Is there a way to avoid using an instance variable to store @connection_specification_name in remove_connection (see 537a342)? I need to dynamically swap the connection per-model, in a multi-threaded environment. Sometimes this means the same model, usually ActiveRecord::Base, will be connected to separate databases in separate threads. I achieve this by overriding the connection_specification_name method suggested in #24844 (comment) (though with a different, threadsafe implementation). However, the remove_connection method accesses the instance variable directly, bypassing the logic in the overridden method, and becoming un-threadsafe since the instance variable is shared across threads.

jeremy · 2018-02-05T22:08:23Z

Cool @njakobsen. Please do contribute a PR to no longer refer to the ivar directly. And a PR with your dynamic connection swapping would be wonderful as well! 😍

njakobsen · 2018-02-05T22:20:58Z

I'm definitely just crashing around, but I'll show you my monkeypatch and maybe you can let me know if there's a better way to store information about which connection the model should use on a per-thread basis. See https://github.com/culturecode/stagehand/blob/5c2d1815738e602584633ef48c7aa2b62ba79ce6/lib/stagehand/active_record_extensions.rb#L28-L53. With some feedback, I'll see what sort of PR I can make from all of this.

matthewd · 2018-02-05T22:35:39Z

Do you have a use case that involves calling remove_connection, or are you just trying to cover the full API?

njakobsen · 2018-02-05T22:44:13Z

I don't have a use case for removing the connection, but if someone using my patch were to call establish_connection on the model, it would call remove_connection, so I want to make sure that it uses the overridden method to know which connection to remove based on the current thread. Otherwise I think it'll just default to AR:B's "primary" connection since no other classes are directly setting the instance variable in my monkeypatch.

Basically @arthurnn's suggestion of how his PR can be used opened up a nice avenue for my Gem to use a lighter touch to swap connections dynamically by changing the connection spec name instead of calling establish_connection each time, but it's being hamstrung by the ivar in the remove_connection method.

njakobsen · 2018-02-05T22:48:39Z

I am also not happy with my solution of storing the connection spec name for each model on Thread.current. So if there are any suggestions of a better place for it, I'd happily incorporate that into a potential PR.

mikecmpbll · 2018-02-06T09:20:59Z

i think establish_connection calls this remove_connection

matthewd · 2018-02-06T13:58:23Z

If you've overwritten the getter/setter, then @connection_specification_name won't be defined anyway, will it?

I think it's pretty common to need to thread-scope the behaviour.. the only quick observation I'd make is that you don't necessarily need to use the setter: you could define connection_specification_name to directly read the top of the ConnectionStack, for example.

njakobsen · 2018-02-06T16:45:17Z

@matthewd that's an interesting idea, but I was trying to keep the changes to AR:B isolated from the inner machinations of the rest of the gem. If I could simply make it threadsafe, then any other code (including subclasses) calling establish_connection or remove_connection would continue to function as expected.

My gem seems to work with my monkeypatches, but at @jeremy's urging, my questions are now more about how to apply this threadsafety to a PR for Rails itself.

Does Rails have any mechanism other than Thread.current for storing information per-thread (I see some places using Concurrent::Map, and Process.pid, so maybe something similar would be useful here)
Do the threads ever get reaped? Or is that dependent on which webserver we're running?
Have I missed something important?

njakobsen · 2018-02-07T19:05:19Z

@jeremy @matthewd I think I'm going to tackle this with a very light touch. Instead of making a bunch of decisions about how the connection should be tracked independently per-model, per-thread (which could cause issues in the general case where we don't want independent tracking per thread). I'll make a PR which simply uses the connection_specification_name method when removing a connection, instead of the ivar, while at the same time keeping in mind the change from this commit 537a342. This way, an app that needs independent tracking per thread can implement it by overriding the connection_specification_name with whatever logic is required by their specific app.

arthurnn reviewed May 4, 2016
View reviewed changes

maclover7 added the activerecord label May 4, 2016

kaspth reviewed May 4, 2016
View reviewed changes

kaspth reviewed May 5, 2016
View reviewed changes

arthurnn added 10 commits May 5, 2016 15:29

Rename method

314cf03

Better specification_id method

c1bc0d8

Use spec key, when given as spec_id

79154a3

Add spec_id tests

ac1c4e1

fix test

9872905

Test to swap connection at runtime

c1ae7ec

Better code readability

40a8d2f

inline retrive_conn_pool method

7d9d076

Retrive the right pool for db tasks

34856ba

arthurnn force-pushed the arthurnn/conn branch from d4f0acd to e954ee1 Compare May 5, 2016 20:29

s/specification_id/specification_name

598e7c9

arthurnn force-pushed the arthurnn/conn branch from e954ee1 to 598e7c9 Compare May 5, 2016 20:40

vipulnsward reviewed May 6, 2016
View reviewed changes

bquorning mentioned this pull request May 20, 2016

Update comment for #retrieve_connection_pool #25085

Merged

bquorning added a commit to zendesk/active_record_host_pool that referenced this pull request Jun 28, 2016

Fix AR ConnectionHandler override

4df8be9

See rails/rails#24844 for the changes on Rails.

kamipo added a commit to kamipo/rails that referenced this pull request Jul 17, 2016

Fix payload[:class_name] to payload[:spec_name]

38cd2aa

Follow up to rails#20818. `retrieve_connection` is passed `spec_name` instead of `klass` since rails#24844.

kamipo mentioned this pull request Jul 17, 2016

Fix payload[:class_name] to payload[:spec_name] #25868

Merged

mikecmpbll mentioned this pull request Aug 11, 2016

Threadsafe connection-based switching & adaptive adapters influitive/apartment#316

Closed

jrafanie mentioned this pull request Nov 23, 2016

Increased DB connection pool from 8 to 11 for websocket worker ManageIQ/manageiq#12572

Closed

matthewd mentioned this pull request Feb 21, 2017

Allow 3-level DB configs to group connections by environment #28095

Merged

3 tasks

matthewd mentioned this pull request Mar 9, 2017

Deprecate passing a connection to Migration.check_pending! #28294

Closed

robin850 mentioned this pull request Jun 5, 2017

Establish a connection with all available configurations #29360

Closed

hadashiA mentioned this pull request Sep 7, 2017

Using AR::Base.connection_specification_name mewlist/sengiri#24

Merged

Conversation

arthurnn commented May 4, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arthurnn May 4, 2016

Choose a reason for hiding this comment

Uh oh!

arthurnn May 4, 2016

Choose a reason for hiding this comment

Uh oh!

kaspth May 4, 2016

Choose a reason for hiding this comment

Uh oh!

arthurnn commented May 4, 2016

Uh oh!

matthewd commented May 5, 2016

Uh oh!

arthurnn commented May 5, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kaspth May 5, 2016

Choose a reason for hiding this comment

Uh oh!

arthurnn commented May 5, 2016

Uh oh!

vipulnsward May 6, 2016

Choose a reason for hiding this comment

Uh oh!

GreenFomo commented Oct 21, 2016

Uh oh!

arthurnn commented Oct 26, 2016

Uh oh!

GreenFomo commented Oct 27, 2016

Uh oh!

milep commented Oct 27, 2016

Uh oh!

mikecmpbll commented Oct 27, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arthurnn commented Oct 28, 2016

Uh oh!

GreenFomo commented Oct 28, 2016

Uh oh!

milep commented Dec 27, 2016

Uh oh!

njakobsen commented Feb 5, 2018

Uh oh!

jeremy commented Feb 5, 2018

Uh oh!

njakobsen commented Feb 5, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

matthewd commented Feb 5, 2018

Uh oh!

njakobsen commented Feb 5, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

njakobsen commented Feb 5, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mikecmpbll commented Feb 6, 2018

Uh oh!

matthewd commented Feb 6, 2018

Uh oh!

njakobsen commented Feb 6, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

njakobsen commented Feb 7, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

14 participants

arthurnn commented May 4, 2016 •

edited

Loading

arthurnn commented May 5, 2016 •

edited

Loading

mikecmpbll commented Oct 27, 2016 •

edited

Loading

njakobsen commented Feb 5, 2018 •

edited

Loading

njakobsen commented Feb 5, 2018 •

edited

Loading

njakobsen commented Feb 5, 2018 •

edited

Loading

njakobsen commented Feb 6, 2018 •

edited

Loading