Conversation
There was a problem hiding this comment.
I don't think we should call it a legacy config. It's still a valid config that most of Rails apps are going to use, as long as we don't deprecate one level config.
|
+1 for decoupling Rails.env call from inside AR connection code -1 for 3 level configuration as a "default", because for simple projects setup is imho way too complex. if 3 levels work that's fine, but "default" should be the simple version |
|
Just a side note: Why do you use |
|
Like the new format, but concur with @matzke that most apps won't need multiple DBs starting out, so the default database.yml should still be a standard 2-level setup. Then we can explain the 3-level setup in the guide. So that informs that whatever we call the code isn't about "legacy", it's about 2-level depth or 3-level depth. |
kaspth
left a comment
There was a problem hiding this comment.
Some initial rough thoughts and suggestions 😊
There was a problem hiding this comment.
I find local_configurations confusing: every config is defined locally in database.yml.
Personally, I'd keep this all in configurations and turn it into an object that handles mapping between 2 or 3 level hashes:
class Configurations < Hash
def initialize(hash)
@configs = normalize_nesting(hash)
end
def [](env_or_spec_name)
# If we couldn't normalize it properly we could handle both env and spec names here.
end
private
def normalize_nesting(hash)
# Insert Arthur Magic…
end
endThere was a problem hiding this comment.
👍 agree with Kasper, the naming here is difficult to understand; it took me several read throughs to understand the difference between local_configurations and configurations 😬
There was a problem hiding this comment.
Introduces yet another name for a configuration: main. Think we need to prune the terminology :)
There was a problem hiding this comment.
But it knows lodge like no other 🤓
There was a problem hiding this comment.
Don't think we need these two new lines :)
There was a problem hiding this comment.
Think I finally get why "connection specification name" has bothered me so. Partly because it's so abstract, but mainly it's because Active Record just calls these configurations and that's simpler to grasp.
So while I think spec or connection specification is fine internally, I think we should try using configuration as the public terminology.
There was a problem hiding this comment.
👍 Agree @kaspth ,
So the model using a readonly config could look like this:
class User < ApplicationRecord
self.connection_configuration = :readonly
endor:
class User < ApplicationRecord
def self.connection_configuration
Application.readonly_mode? ? :readonly : :primary
end
endPretty much we could replace the connection_specification_name to connection_configuration in the model.
I don't want to change those in this PR tho, we can try in a next one. Also there will be some deprecation cycle for the name change, as 5.0 uses connection_specification_name
There was a problem hiding this comment.
Why do we need to know the primary wording internally? Shouldn't establish connection just handle a config without a name?
There was a problem hiding this comment.
Think it's a smell that the database tasks have to reach for the ConnectionAdapters::ConnectionSpecification::LegacyConfigTransformer.
|
I like the idea of being able to specify a readonly shard and have AR aware of it out of the box. I'm guessing this is what a recommended Heroku config would look like: It looks like some tests were moved but no behavior changes were made. QuestionsAre we ever interested in connecting to multiple "readonly" databases? For example if we wanted to round-robin reads on say 2 or 10 "readonly" shards? I'm not sure if such behavior would be in scope. Thinking about it now, i guess you could hack it in on a per webserver basis where each connects to a random readonly server production:
primary:
url: <%= ENV['DATABASE_URL'] %>
readonly:
url: <%= [ ENV['HEROKU_POSTGRESQL_ROSE_URL'],
ENV['HEROKU_POSTGRESQL_BLACK_URL'],
ENV['HEROKU_POSTGRESQL_BLUE_URL']
].sample %> |
arthurnn
left a comment
There was a problem hiding this comment.
I clean-up this a bit. There is no legacy reference anymore, as we will support 2 level configs moving forward.
@kaspth I liked your suggestion on having a Configuration class, that's what I did.
However, I didn't want to extend Hash, and instead, I did a Proxy. (http://words.steveklabnik.com/beware-subclassing-ruby-core-classes)
Also, I cannot normalize the config in the initializer, as AR::Base.configurations is a public API, and people could be using that configuration for something. so instead I just created a normalize method, that will return the config the way the resolver expects.
Thoughts?
There was a problem hiding this comment.
👍 Agree @kaspth ,
So the model using a readonly config could look like this:
class User < ApplicationRecord
self.connection_configuration = :readonly
endor:
class User < ApplicationRecord
def self.connection_configuration
Application.readonly_mode? ? :readonly : :primary
end
endPretty much we could replace the connection_specification_name to connection_configuration in the model.
I don't want to change those in this PR tho, we can try in a next one. Also there will be some deprecation cycle for the name change, as 5.0 uses connection_specification_name
With .local_configurations we don't need to lookup the connection config using an enviroment. This opens the door for us to have a 2 levels config under an enviroment, to be able to make ActiveRecord work with multiple connections.
With the `DATABASE_URL` logic living in the connection config resolver, we don't need to mess around with enviroment(such as RAILS_ENV, RACK_ENV) in it. Because Active Record will now use the `local_configurations`, which is a configuration hash for the running environment, the `DATABASE_URL` logic gets easier to handle as it only needs to set the `url` config for the `primary` database config, which is the default one.
As we pass the local configuration to ActiveRecord, we don't need the `Rails.env` anymore to find the right configuration.
also add tests
also make the default establish_connection call use :primary
Replace the LegacyTransformer, with a better configuration class. We will still support 2 level configurations for apps that won't connect to multiple databases, so we should not call legacy. Instead, we can normalize the configuration before passing to the resolvers.
c6e1735 to
63f084c
Compare
|
@matthewd: As we discussed. I changed the PR to make this possible: So, old setups, with a 2 levels config, will still work, and will still be able to do a |
|
Do we have any plan about how this works with migrations? I see at least two scenarios:
|
|
@kirs I think |
| config = config ? config.dup : {} | ||
|
|
||
| # if the configuration is a one level config, pushes that to be a two level config, with primary as key | ||
| if !config.key?("primary") && config.key?("adapter") |
There was a problem hiding this comment.
Maybe instead of checking for an adapter key we should check config.values.any? { |v| !v.is_a?(Hash) }?
| resolve_connection(config).merge("name" => spec.to_s) | ||
| else | ||
| raise(AdapterNotSpecified, "'#{spec}' database is not configured. Available: #{configurations.keys.inspect}") | ||
| config = configurations.respond_to?(:normalized) && configurations.normalized[spec.to_s] |
There was a problem hiding this comment.
When won't configurations respond to normalized? Think this might be too protective if it guards against people overriding ActiveRecord::Base.configurations to return whatever.
Getting rid of that check also means we can slim the diff and keep the previous easier-to-read structure of the method as a bonus 😁
| # Accepts: | ||
| # - Hash: one layer deep Hash that contains all connection information | ||
| # - Symbol: a configuration name that will be looked-up | ||
| # from the configurations Hash |
There was a problem hiding this comment.
Maybe we should leave out that configurations is a Hash 😉
| self.configurations = {} | ||
|
|
||
| # Returns fully resolved configurations hash | ||
| self.configurations = {} |
There was a problem hiding this comment.
Let's keep the previous newline below and remove the one above :)
| should_reconnect = ActiveRecord::Base.connection_pool.active_connection? | ||
| ActiveRecord::Schema.verbose = false | ||
| ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Base.configurations["test"], :ruby, ENV["SCHEMA"] | ||
| ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Tasks::DatabaseTasks.config_at("test"), :ruby, ENV["SCHEMA"] |
There was a problem hiding this comment.
Do these changes mean we've broken backwardscompatibility?
There was a problem hiding this comment.
Looks like it indeed broke.
|
|
||
| def normalized(key = @root_level) | ||
| config = key ? self[key] : self | ||
| config = config ? config.dup : {} |
There was a problem hiding this comment.
Find these two lines hard to parse because of the two conditionals. The empty hash fallback seems too defensive at first sight. What case is it trying to cover? That there's a key in self[key] but with no config at all?
Could we simplify to this?
config = fetch(key, self).dup| end | ||
|
|
||
| if url = ENV["DATABASE_URL"] | ||
| config["primary"] ||= {} |
There was a problem hiding this comment.
We've just spent lines establishing "primary", surely it should have been made an empty hash by now?
| if config | ||
| resolve_connection config | ||
| elsif env = ActiveRecord::ConnectionHandling::RAILS_ENV.call | ||
| resolve_symbol_connection env.to_sym |
There was a problem hiding this comment.
Are we removing this because resolve_connection calls it?
| config[key] = resolve(value) if value | ||
| end | ||
| config | ||
| end |
There was a problem hiding this comment.
This makes me wonder how much we need to worry about people calling all the available Hash methods on ActiveRecord::Base.configurations[some_env] and if we can just normalize the configurations up front.
| end | ||
|
|
||
| # Takes the environment such as +:production+ or +:development+. | ||
| # Takes the specification name such as +:primary+. |
| # config = { "production" => { "host" => "localhost", "database" => "foo", "adapter" => "sqlite3" } } | ||
| # spec = Resolver.new(config).spec(:production) | ||
| # config = { "host" => "localhost", "database" => "foo", "adapter" => "sqlite3" } | ||
| # spec = Resolver.new({}).spec(config) |
There was a problem hiding this comment.
This API looks weird to me. I'd expect the resolver to always be initialized with the config and resolve a spec.
| should_reconnect = ActiveRecord::Base.connection_pool.active_connection? | ||
| ActiveRecord::Schema.verbose = false | ||
| ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Base.configurations["test"], :ruby, ENV["SCHEMA"] | ||
| ActiveRecord::Tasks::DatabaseTasks.load_schema ActiveRecord::Tasks::DatabaseTasks.config_at("test"), :ruby, ENV["SCHEMA"] |
There was a problem hiding this comment.
Looks like it indeed broke.
| "development" => { "database" => "dev-db" }, | ||
| "test" => { "database" => "test-db" }, | ||
| "production" => { "database" => "prod-db" } | ||
| "development" => { "database" => "dev-db", "adapter" => "sqlite3" }, |
There was a problem hiding this comment.
Why the adapter is now required?
There was a problem hiding this comment.
I think it's because we only check for the adapter key at the moment. Finding a more accurate way to find an actual config could fix the need to change this.
There was a problem hiding this comment.
It's already required in practice for an actual config you intend to use, and while this test was accidentally assuming it, we've never claimed to be an arbitrary hash storage bucket.
|
Looks like the backwardscompatibility issues have been solved, so I'd say rebase, fix tests, run This was a hard and long haul so I definitely appreciate your willingness to run my review gauntlet so many times 😄 |
| ret | ||
| end | ||
|
|
||
| def keys |
There was a problem hiding this comment.
When is this called? My cmd-E on keys can't find any instances?
| def to_hash | ||
| @config | ||
| end | ||
| end |
There was a problem hiding this comment.
I still found a lot of this logic hard to read, so I took the liberty of trying to make it more concise:
class ConnectionConfigurations #:nodoc:
def initialize(config)
@root_config = @config = config
@database_url_config = resolve_database_url_config || {}
end
def scope=(scope)
@root_config, @config = @config, (@config[scope] || {})
end
def [](key)
@database_url_config.reverse_merge(@config[key] || @root_config[key] || {})
end
def keys
@config.keys
end
def to_h
@config
end
private
def resolve_database_url_config
if ENV["DATABASE_URL"]
ConnectionUrlResolver.new(ENV["DATABASE_URL"]).to_hash
end
end
endI'm curious why we need the @root_config[key] fallback in []?
NOTE: scope= basically assumes that it's never assigned more than once, which it doesn't seem to be in library code, but it is in test code. Would it make more sense to pass it in at the constructor? Or have a scope method that returns a new instance of ConnectionConfigurations?
| end | ||
| ret ||= @config[key] | ||
| if @database_url_config | ||
| ret = (ret || {}).merge(@database_url_config) |
There was a problem hiding this comment.
Isn't this applying the URL-supplied database to every configuration we return?
That's wrong, and I'm alarmed the tests aren't point that out. It's hard to see what's going on with them, because they're all renamed at the same time, though. 😟
Summary
New database.yml format. This is how it looks:
We are moving from a two levels config to a three levels.
This is backward compatible, apps using the default, old, 2 levels database.yml will still work.
Advantages of this change
Rails.envwhen callingestablish_connectionfor a non standard connection: i.e:establish_connection "#{Rails.env}_readonlyImplementation details
The config file might have 3 levels, however, ActiveRecord now has a
local_configurationsoption, which is scoped by the runningRails.env. That's why the code onConnectionSpecification::Resolverdidn't change much. Internally we still handle a 2 levels config Hash.However, we are keeping the
configurationsgetter/setter to be backward compatible and to be able to access it at the tasks level. (We might be able to deprecate this moving forward, that's another discussion tho)The good thing about this, is that I was able to remove the
Rails.envcall from inside AR connection code. Now AR won't depend on the RAILS_ENV anymore.Caveats
If you were using a non-default
database.yml, for instance:There will be some necessary changes in the app, as AR will scope the config by the environment, so AR would only see:
The upgrade path would be to move the
development_readonlyconfig to insidedevelopmentand reference it when callingestablish_connection. (*see the third task on future work)Future work
Change the default database.yml templateestablish_connection(:readonly)for instance.create:all/drop:alland makecreate/dropcreate all the dbs under the development/test envs?Deprecate the old template/config with a warning message?All the future work are not in this PR, because I would like to discuss one by one before implementing them, to make sure they make sense.
review @kaspth @dhh @rafaelfranca @tenderlove @jeremy
@schneems (specially on commit e51edd4, because it changes a lot from 6cc0367, However, I made sure the core functionality still works, which is make
DATABASE_URLwork without Rails.)