new database.yml format by arthurnn · Pull Request #27611 · rails/rails

arthurnn · 2017-01-09T00:33:00Z

Summary

New database.yml format. This is how it looks:

default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000

development:
  primary:
    <<: *default
    database: db/development.sqlite3
  readonly:
    <<: *default
    database: db/readonly.sqlite3

We are moving from a 2-level config to a 3-level one.

This is backward compatible: apps using the default (old) 2-level database.yml will still work.

Advantages of this change

Better structure in the database.yml file when dealing with multiple databases.
One environment can have multiple configs.
Having all the configs under an environment key enables us to create/drop multiple databases in the create/drop tasks.
Removes the need to add the Rails.env when calling establish_connection for a non-standard connection, e.g. establish_connection "#{Rails.env}_readonly"

Implementation details

The config file might have 3 levels; however, ActiveRecord now has a local_configurations option, which is scoped by the running Rails.env. That's why the code in ConnectionSpecification::Resolver didn't change much. Internally we still handle a 2-level config Hash.
However, we are keeping the configurations getter/setter to stay backward compatible and to be able to access it at the tasks level. (We might be able to deprecate this moving forward, but that's another discussion.)
The good thing about this is that I was able to remove the Rails.env call from inside the AR connection code. Now AR won't depend on RAILS_ENV anymore.
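
A minimal sketch of the scoping idea described above, in plain Ruby with no Rails dependency. The `scope_to_env` helper and the sample hash are hypothetical; they only illustrate how a 3-level config could be reduced to the 2-level Hash the resolver already handles, and are not the code in this PR.

```ruby
# Hypothetical sample data mirroring the database.yml shown in the summary.
FULL_CONFIG = {
  "development" => {
    "primary"  => { "adapter" => "sqlite3", "database" => "db/development.sqlite3" },
    "readonly" => { "adapter" => "sqlite3", "database" => "db/readonly.sqlite3" }
  },
  "test" => {
    "primary" => { "adapter" => "sqlite3", "database" => "db/test.sqlite3" }
  }
}.freeze

# Hypothetical helper: returns only the configs under the given
# environment key, or an empty hash when that environment is missing.
def scope_to_env(configurations, env)
  configurations.fetch(env.to_s, {})
end

local = scope_to_env(FULL_CONFIG, :development)
puts local.keys.inspect
```

Once the hash is scoped this way, a lookup such as `local["readonly"]` no longer needs to know which environment is running.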

Caveats

If you were using a non-default database.yml, for instance:

development:
  adapter: sqlite3
development_readonly:
  adapter: sqlite3

Some changes will be necessary in the app: AR scopes the config by environment, so it would only see:

development:
  adapter: sqlite3

The upgrade path would be to move the development_readonly config inside development and reference it when calling establish_connection. (*see the third task on future work)
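
The upgrade path above can be sketched as a plain Ruby transformation. The `nest_env_prefixed` helper is hypothetical and not part of Rails or this PR; it assumes that the environment's own entry becomes `primary` and that an `<env>_<name>` entry becomes `<name>` under that environment.

```ruby
# Hypothetical migration helper: folds flat "<env>_<name>" entries into
# a nested "<env>" => { "<name>" => ... } shape.
def nest_env_prefixed(configs, envs)
  nested = {}
  configs.each do |key, value|
    env = envs.find { |e| key == e || key.start_with?("#{e}_") }
    if env.nil?
      nested[key] = value                       # unknown key: keep untouched
    elsif key == env
      (nested[env] ||= {})["primary"] = value   # the environment's own config
    else
      name = key.delete_prefix("#{env}_")       # e.g. "readonly"
      (nested[env] ||= {})[name] = value
    end
  end
  nested
end

old_style = {
  "development"          => { "adapter" => "sqlite3" },
  "development_readonly" => { "adapter" => "sqlite3" }
}

new_style = nest_env_prefixed(old_style, %w[development test production])
```

After such a move, the app would reference the connection by its short name, as in the establish_connection(:readonly) call mentioned under future work.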

Future work

Change the guides.
~~Change the default database.yml template~~
Connect to all databases under the env during Rails initializer, so apps won't need to call establish_connection(:readonly) for instance.
Deprecate create:all/drop:all and make create/drop create all the dbs under the development/test envs?
~~Deprecate the old template/config with a warning message?~~

None of the future work is in this PR, because I would like to discuss the items one by one before implementing them, to make sure they make sense.

review @kaspth @dhh @rafaelfranca @tenderlove @jeremy
@schneems (especially on commit e51edd4, because it changes a lot from 6cc0367. However, I made sure the core functionality still works, which is to make DATABASE_URL work without Rails.)

kirs · 2017-01-09T03:03:22Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

I don't think we should call it a legacy config. It's still a valid config that most Rails apps are going to use, as long as we don't deprecate the one-level config.

matzke · 2017-01-09T08:13:40Z

+1 for decoupling Rails.env call from inside AR connection code

-1 for the 3-level configuration as the "default", because for simple projects that setup is IMHO way too complex. If 3 levels work, that's fine, but the "default" should be the simple version.

siegy22 · 2017-01-09T13:52:45Z

Just a side note:

Why do you use ENV.fetch("RAILS_MAX_THREADS") { 5 } instead of ENV.fetch("RAILS_MAX_THREADS", 5)? 😁
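
For reference, a small sketch of how the two forms behave. Both return the fallback when the variable is unset; the block form only evaluates its fallback when the key is missing. `EXAMPLE_UNSET_VAR` is a made-up name used purely for illustration.

```ruby
# Make sure the illustrative variable is not set.
ENV.delete("EXAMPLE_UNSET_VAR")

with_block   = ENV.fetch("EXAMPLE_UNSET_VAR") { 5 }  # block is called on a miss
with_default = ENV.fetch("EXAMPLE_UNSET_VAR", 5)     # default is returned on a miss

# When the variable is set, both forms return the String stored in ENV,
# so callers that need an Integer still have to convert it.
ENV["EXAMPLE_UNSET_VAR"] = "16"
from_env = ENV.fetch("EXAMPLE_UNSET_VAR") { 5 }
```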

dhh · 2017-01-09T14:11:44Z

Like the new format, but concur with @matzke that most apps won't need multiple DBs starting out, so the default database.yml should still be a standard 2-level setup. Then we can explain the 3-level setup in the guide. So that informs that whatever we call the code isn't about "legacy", it's about 2-level depth or 3-level depth.

arthurnn · 2017-01-09T14:30:36Z

👍 thanks @kirs @matzke and @dhh for the comment on the legacy naming.
I did, indeed, think about it multiple times. I agree that 98% of Rails apps probably won't need the 3-level connections, so let's not call it legacy, and make it live in our code base as a proper 'transformer' stage.

kaspth

Some initial rough thoughts and suggestions 😊

kaspth · 2017-01-09T20:13:15Z

activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb

I find local_configurations confusing: every config is defined locally in database.yml.

Personally, I'd keep this all in configurations and turn it into an object that handles mapping between 2 or 3 level hashes:

class Configurations < Hash
  def initialize(hash)
    @configs = normalize_nesting(hash)
  end

  def [](env_or_spec_name)
    # If we couldn't normalize it properly we could handle both env and spec names here.
  end

  private
    def normalize_nesting(hash)
      # Insert Arthur Magic…
    end
end

👍 agree with Kasper, the naming here is difficult to understand; it took me several read throughs to understand the difference between local_configurations and configurations 😬

kaspth · 2017-01-09T20:26:29Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

Introduces yet another name for a configuration: main. Think we need to prune the terminology :)

kaspth · 2017-01-09T20:30:58Z

activerecord/lib/active_record/core.rb

Active Record

kaspth · 2017-01-09T20:32:16Z

activerecord/lib/active_record/core.rb

But it knows lodge like no other 🤓

kaspth · 2017-01-09T20:40:07Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

Don't think we need these two new lines :)

kaspth · 2017-01-09T20:42:32Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

Think I finally get why "connection specification name" has bothered me so. Partly because it's so abstract, but mainly it's because Active Record just calls these configurations and that's simpler to grasp.

So while I think spec or connection specification is fine internally, I think we should try using configuration as the public terminology.

👍 Agree @kaspth ,

So the model using a readonly config could look like this:

class User < ApplicationRecord
  self.connection_configuration = :readonly
end

or:

class User < ApplicationRecord
  def self.connection_configuration
    Application.readonly_mode? ? :readonly : :primary
  end
end

Pretty much, we could replace connection_specification_name with connection_configuration in the model.

I don't want to change those in this PR though; we can try in the next one. Also, there will be a deprecation cycle for the name change, as 5.0 uses connection_specification_name.

kaspth · 2017-01-09T20:52:22Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

Why do we need to know the primary wording internally? Shouldn't establish connection just handle a config without a name?

kaspth · 2017-01-09T20:55:09Z

activerecord/lib/active_record/core.rb

New lines around here.

kaspth · 2017-01-09T20:56:18Z

activerecord/lib/active_record/tasks/database_tasks.rb

Think it's a smell that the database tasks have to reach for the ConnectionAdapters::ConnectionSpecification::LegacyConfigTransformer.

schneems · 2017-01-13T16:40:56Z

I like the idea of being able to specify a readonly shard and have AR aware of it out of the box. I'm guessing this is what a recommended Heroku config would look like:

production:
  primary:
    url: <%= ENV['DATABASE_URL'] %>
  readonly:
    url: <%= ENV['HEROKU_POSTGRESQL_BLACK_URL'] %>

It looks like some tests were moved but no behavior changes were made.

Questions

Are we ever interested in connecting to multiple "readonly" databases? For example if we wanted to round-robin reads on say 2 or 10 "readonly" shards? I'm not sure if such behavior would be in scope.

Thinking about it now, I guess you could hack it in on a per-webserver basis, where each one connects to a random readonly server:

production:
  primary:
    url: <%= ENV['DATABASE_URL'] %>
  readonly:
    url: <%= [ ENV['HEROKU_POSTGRESQL_ROSE_URL'],
               ENV['HEROKU_POSTGRESQL_BLACK_URL'],
               ENV['HEROKU_POSTGRESQL_BLUE_URL']
             ].sample %>
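
The same per-process pick can be sketched in plain Ruby. The `pick_replica_url` helper is hypothetical, and the `compact` call is an added assumption so that unset variables are skipped rather than sampled as nil.

```ruby
# Variable names mirror the example above; nothing here is a Rails API.
REPLICA_NAMES = %w[
  HEROKU_POSTGRESQL_ROSE_URL
  HEROKU_POSTGRESQL_BLACK_URL
  HEROKU_POSTGRESQL_BLUE_URL
].freeze

# Hypothetical helper: pick one replica URL at random.
# `env` can be ENV or any Hash-like object responding to values_at.
def pick_replica_url(env, names)
  # values_at yields nil for unset variables; compact drops them, so
  # sample only returns nil when every variable is unset.
  env.values_at(*names).compact.sample
end

url = pick_replica_url(ENV, REPLICA_NAMES)
```

Because the choice happens once when the config is evaluated, each process keeps the same replica for its lifetime; this spreads load across servers but is not round-robin within a process.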

arthurnn

I cleaned this up a bit. There is no legacy reference anymore, as we will support 2-level configs moving forward.
@kaspth I liked your suggestion of having a Configuration class, so that's what I did.
However, I didn't want to extend Hash, so instead I made a Proxy. (http://words.steveklabnik.com/beware-subclassing-ruby-core-classes)
Also, I cannot normalize the config in the initializer, as AR::Base.configurations is a public API, and people could be using that configuration for something. So instead I just created a normalize method that returns the config the way the resolver expects.

Thoughts?

arthurnn · 2017-01-15T04:19:50Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

👍 Agree @kaspth ,

So the model using a readonly config could look like this:

class User < ApplicationRecord
  self.connection_configuration = :readonly
end

or:

class User < ApplicationRecord
  def self.connection_configuration
    Application.readonly_mode? ? :readonly : :primary
  end
end

Pretty much, we could replace connection_specification_name with connection_configuration in the model.

I don't want to change those in this PR though; we can try in the next one. Also, there will be a deprecation cycle for the name change, as 5.0 uses connection_specification_name.

With .local_configurations we don't need to look up the connection config using an environment. This opens the door for us to have a 2-level config under an environment, to be able to make ActiveRecord work with multiple connections.

With the `DATABASE_URL` logic living in the connection config resolver, we don't need to mess around with the environment (such as RAILS_ENV, RACK_ENV) in it. Because Active Record will now use the `local_configurations`, which is a configuration hash for the running environment, the `DATABASE_URL` logic gets easier to handle, as it only needs to set the `url` config for the `primary` database config, which is the default one.

As we pass the local configuration to ActiveRecord, we don't need the `Rails.env` anymore to find the right configuration.

also add tests

also make the default establish_connection call use :primary

Replace the LegacyTransformer with a better configuration class. We will still support 2-level configurations for apps that won't connect to multiple databases, so we should not call it legacy. Instead, we can normalize the configuration before passing it to the resolvers.

arthurnn · 2017-01-15T05:42:21Z

@matthewd: As we discussed. I changed the PR to make this possible:
database.yml

production:
  database: ...
production_readonly:
  database: ...

So old setups with a 2-level config will still work, and will still be able to call establish_connection(:production_readonly).

kirs · 2017-01-15T16:13:03Z

Do we have any plan about how this works with migrations?

I see at least two scenarios:

You have primary and readonly connections and you want to use the first one to run the migrations.
You have different connections with different databases. In our case, master and shard_X databases have completely different tables. For every migration, we have a method that describes which database the migration is going to alter.

kaspth · 2017-01-15T19:45:29Z

@kirs I think bin/rails g migration some_name --db readonly with self.connection = :readonly in the migration would suffice, but maybe we should shelve that discussion for later. Think this PR has big implications as is 😊

kaspth

Some more feedback ❤️

kaspth · 2017-01-15T19:48:40Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+          config = config ? config.dup : {}
+
+          # if the configuration is a one level config, pushes that to be a two level config, with primary as key
+          if !config.key?("primary") && config.key?("adapter")


Maybe instead of checking for an adapter key we should check config.values.any? { |v| !v.is_a?(Hash) }?
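
A rough sketch of what that check could look like in isolation. The `normalize_env_config` helper is hypothetical and is not the method in the diff; it treats any hash containing a non-Hash value as a single connection config and wraps it under "primary".

```ruby
# Hypothetical normalizer: wrap a flat (single-connection) config
# under "primary"; leave an already nested config as it is.
def normalize_env_config(config)
  config = config ? config.dup : {}
  # A nested config maps names to Hashes; a flat one holds scalar
  # settings such as "database" or "adapter" directly.
  flat = config.values.any? { |v| !v.is_a?(Hash) }
  flat ? { "primary" => config } : config
end

flat_config   = { "database" => "dev-db" }
nested_config = { "primary" => { "database" => "dev-db" } }

normalize_env_config(flat_config)    # wrapped under "primary"
normalize_env_config(nested_config)  # returned unchanged (as a copy)
```

Unlike a check for the "adapter" key, this variant also recognizes a flat config that has no adapter entry.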

kaspth · 2017-01-15T19:50:12Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

-              resolve_connection(config).merge("name" => spec.to_s)
-            else
-              raise(AdapterNotSpecified, "'#{spec}' database is not configured. Available: #{configurations.keys.inspect}")
+            config = configurations.respond_to?(:normalized) && configurations.normalized[spec.to_s]


When won't configurations respond to normalized? Think this might be too protective if it guards against people overriding ActiveRecord::Base.configurations to return whatever.

Getting rid of that check also means we can slim the diff and keep the previous easier-to-read structure of the method as a bonus 😁

kaspth · 2017-01-15T19:52:14Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+        # Accepts:
+        # - Hash: one layer deep Hash that contains all connection information
+        # - Symbol: a configuration name that will be looked-up
+        #           from the configurations Hash


Maybe we should leave out that configurations is a Hash 😉

kaspth · 2017-01-15T19:53:13Z

activerecord/lib/active_record/core.rb

-      self.configurations = {}

-      # Returns fully resolved configurations hash
+      self.configurations = {}


Let's keep the previous newline below and remove the one above :)

kaspth · 2017-01-15T19:55:25Z

activerecord/lib/active_record/railties/databases.rake

        should_reconnect = ActiveRecord::Base.connection_pool.active_connection?
        ActiveRecord::Schema.verbose = false
-        ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Base.configurations["test"], :ruby, ENV["SCHEMA"]
+        ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Tasks::DatabaseTasks.config_at("test"), :ruby, ENV["SCHEMA"]


Do these changes mean we've broken backwards compatibility?

Looks like it indeed broke.

adding it back

kaspth · 2017-01-15T20:07:55Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+
+        def normalized(key = @root_level)
+          config = key ? self[key] : self
+          config = config ? config.dup : {}


Find these two lines hard to parse because of the two conditionals. The empty hash fallback seems too defensive at first sight. What case is it trying to cover? That there's a key in self[key] but with no config at all?

Could we simplify to this?

config = fetch(key, self).dup

kaspth · 2017-01-15T20:10:13Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+          end
+
+          if url = ENV["DATABASE_URL"]
+            config["primary"] ||= {}


We've just spent lines establishing "primary", surely it should have been made an empty hash by now?

kaspth · 2017-01-15T20:15:30Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

          if config
            resolve_connection config
-          elsif env = ActiveRecord::ConnectionHandling::RAILS_ENV.call
-            resolve_symbol_connection env.to_sym


Are we removing this because resolve_connection calls it?

kaspth · 2017-01-15T20:17:07Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

-            config[key] = resolve(value) if value
-          end
-          config
-        end


This makes me wonder how much we need to worry about people calling all the available Hash methods on ActiveRecord::Base.configurations[some_env] and if we can just normalize the configurations up front.

kaspth · 2017-01-15T20:26:05Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

          end

-          # Takes the environment such as +:production+ or +:development+.
+          # Takes the specification name such as +:primary+.


configuration name 😉

rafaelfranca · 2017-01-18T04:45:52Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

-        #   config = { "production" => { "host" => "localhost", "database" => "foo", "adapter" => "sqlite3" } }
-        #   spec = Resolver.new(config).spec(:production)
+        #   config = { "host" => "localhost", "database" => "foo", "adapter" => "sqlite3" }
+        #   spec = Resolver.new({}).spec(config)


This API looks weird to me. I'd expect the resolver to always be initialized with the config and resolve a spec.

rafaelfranca · 2017-01-18T04:49:02Z

activerecord/lib/active_record/railties/databases.rake

        should_reconnect = ActiveRecord::Base.connection_pool.active_connection?
        ActiveRecord::Schema.verbose = false
-        ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Base.configurations["test"], :ruby, ENV["SCHEMA"]
+        ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Tasks::DatabaseTasks.config_at("test"), :ruby, ENV["SCHEMA"]


Looks like it indeed broke.

rafaelfranca · 2017-01-18T04:51:19Z

activerecord/test/cases/tasks/database_tasks_test.rb

-        "development" => { "database" => "dev-db" },
-        "test"        => { "database" => "test-db" },
-        "production"  => { "database" => "prod-db" }
+        "development" => { "database" => "dev-db", "adapter" => "sqlite3" },


Why is the adapter now required?

I think it's because we only check for the adapter key at the moment. Finding a more accurate way to find an actual config could fix the need to change this.

It's already required in practice for an actual config you intend to use, and while this test was accidentally assuming it, we've never claimed to be an arbitrary hash storage bucket.

kaspth · 2017-02-18T11:33:24Z

Looks like the backwards compatibility issues have been solved, so I'd say rebase, fix tests, run rubocop -a, squash and then you have my ok to merge 👍

This was a hard and long haul so I definitely appreciate your willingness to run my review gauntlet so many times 😄

kaspth · 2017-02-18T11:35:51Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+          ret
+        end
+
+        def keys


When is this called? My cmd-E on keys can't find any instances?

kaspth · 2017-02-18T11:58:42Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+        def to_hash
+          @config
+        end
+      end


I still found a lot of this logic hard to read, so I took the liberty of trying to make it more concise:

class ConnectionConfigurations #:nodoc:
  def initialize(config)
    @root_config = @config = config
    @database_url_config = resolve_database_url_config || {}
  end

  def scope=(scope)
    @root_config, @config = @config, (@config[scope] || {})
  end

  def [](key)
    @database_url_config.reverse_merge(@config[key] || @root_config[key] || {})
  end

  def keys
    @config.keys
  end

  def to_h
    @config
  end

  private
    def resolve_database_url_config
      if ENV["DATABASE_URL"]
        ConnectionUrlResolver.new(ENV["DATABASE_URL"]).to_hash
      end
    end
end

I'm curious why we need the @root_config[key] fallback in []?

NOTE: scope= basically assumes that it's never assigned more than once, which it doesn't seem to be in library code, but it is in test code. Would it make more sense to pass it in at the constructor? Or have a scope method that returns a new instance of ConnectionConfigurations?
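
One way the last option in the note could look, as a sketch only. `ScopedConfigurations` is a hypothetical class, not the one in the PR, and it leaves out the DATABASE_URL handling to stay self-contained. Its `scope` method returns a new instance, so scoping the root object more than once is safe.

```ruby
# Hypothetical variant: #scope returns a new instance instead of
# mutating the receiver.
class ScopedConfigurations
  def initialize(config, root_config = config)
    @config = config
    @root_config = root_config
  end

  # Returns a new object scoped to +name+, keeping the current
  # config around as the fallback root.
  def scope(name)
    self.class.new(@config[name] || {}, @config)
  end

  # Looks in the scoped config first, then falls back to the root.
  def [](key)
    @config[key] || @root_config[key] || {}
  end

  def keys
    @config.keys
  end
end

root = ScopedConfigurations.new({
  "development" => { "primary" => { "adapter" => "sqlite3", "database" => "dev" } },
  "test"        => { "primary" => { "adapter" => "sqlite3", "database" => "test" } }
})

dev_configs  = root.scope("development")
test_configs = root.scope("test")  # the root object is unchanged by either call
```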

matthewd · 2017-02-18T15:59:24Z

activerecord/lib/active_record/connection_adapters/connection_specification.rb

+          end
+          ret ||= @config[key]
+          if @database_url_config
+            ret = (ret || {}).merge(@database_url_config)


Isn't this applying the URL-supplied database to every configuration we return?

That's wrong, and I'm alarmed the tests aren't pointing that out. It's hard to see what's going on with them, though, because they're all renamed at the same time. 😟
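
A sketch of the narrower behavior, assuming the intent stated earlier in the thread (the URL only sets the config for the primary database). The `apply_database_url` helper is hypothetical, and `url_config` stands in for whatever hash the URL resolver would produce.

```ruby
# Hypothetical sketch: merge URL-derived settings into "primary" only,
# leaving every other named configuration untouched.
def apply_database_url(configs, url_config)
  return configs if url_config.nil? || url_config.empty?

  primary = (configs["primary"] || {}).merge(url_config)
  configs.merge("primary" => primary)
end

configs = {
  "primary"  => { "adapter" => "sqlite3", "pool" => 5 },
  "readonly" => { "adapter" => "sqlite3", "database" => "db/readonly.sqlite3" }
}
url_config = { "adapter" => "postgresql", "database" => "app_production" }

merged = apply_database_url(configs, url_config)
```

A test along these lines, asserting that `merged["readonly"]` is unchanged, is the kind of check that would catch the problem described above.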

arthurnn added this to the 5.1.0 milestone Jan 9, 2017

kirs reviewed Jan 9, 2017

prathamesh-sonpatki added the activerecord label Jan 9, 2017

kaspth reviewed Jan 9, 2017

maclover7 added the needs feedback label Jan 9, 2017

arthurnn commented Jan 15, 2017

arthurnn added 16 commits January 14, 2017 23:30

Add local_configurations to AR

3ad0ecc

With .local_configurations we don't need to look up the connection config using an environment. This opens the door for us to have a 2-level config under an environment, to be able to make ActiveRecord work with multiple connections.

Refactoring on resolver test

a01d4b0

Remove RAILS_ENV from active record connection.

6fe4ff9

As we pass the local configuration to ActiveRecord, we don't need the `Rails.env` anymore to find the right configuration.

Move DATABASE_URL resolver test

b44d819

Move LegacyConfigTransformer to connection_spec file

93002ef

also add tests

Update broken namespace + tests

31d8af3

also make the default establish_connection call use :primary

Remove RAILS_ENV call

b60bc24

Update docs and remove un-used method.

d15d3af

Adapt db rake tasks to use new yml config

97adad6

Use config and not env on connect code

ecf7534

Fix broken tests

b993ff8

Be more defensive on legacy config transformer

7ec5e27

Update docs [skip ci]

9a259e6

fix linter

63f084c

arthurnn force-pushed the arthurnn/new_db_yaml2 branch from c6e1735 to 63f084c Compare January 15, 2017 04:31

arthurnn added 2 commits January 14, 2017 23:53

Fix db console

379cb5e

Fix rubocop

423103b

arthurnn added 2 commits January 15, 2017 00:06

More rubocop fixes

756aca6

Make random top level config work on establish_connection

438c09e

kaspth reviewed Jan 15, 2017

bf4 mentioned this pull request Jan 16, 2017

Extract common database defaults; better use of YAML #12292

Merged

rafaelfranca reviewed Jan 18, 2017

arthurnn added 5 commits February 17, 2017 19:52

Refactor connection configuration code

c565a70

Make establish_connection nil work with default config

f1671e2

Fix comments

58cee42

revert some not needed changes

6c46441

Fix database_url merging code

8505b67

kaspth reviewed Feb 18, 2017

matthewd reviewed Feb 18, 2017

arthurnn closed this Feb 21, 2017

matthewd mentioned this pull request Feb 21, 2017

Allow 3-level DB configs to group connections by environment #28095

Merged

3 tasks

metaskills mentioned this pull request Jun 7, 2017

Does not work with Rails 5.1 customink/secondbase#44

Closed
