Add ActiveRecord::Base::normalizes#43945
Conversation
|
Build failures are related. I will investigate... |
c2ada11 to
9e0030b
Compare
|
In the previous revision, I added I thought it might also be useful for existing cases where we want different serialization behavior for querying vs persisting, e.g. #35336. This turned out to be unworkable for two reasons:
The solution I have landed on is to let types opt in to wrapping cast results in decorator objects. Then those types can unwrap or cast as necessary when serializing. |
|
Maybe I'm missing some context about the limitations at-play here, but could the Is the intent that |
|
Love this! Always wanted to have this in Rails. As normalizes :first_name, -> attr { attr.strip }Maybe that would get confusing with multiple attributes though... It seems the most used normalizers would be stripping and downcasing, normalizes :first_name, :last_name, strip: true, downcase: trueThen maybe if we extend the attributes API we could define validations and normalizers on a single line: attribute :first_name, validates: { presence: true }, normalizes: { strip: true }Although it might also be something like: attribute :first_name, validates: { presence: true }, normalizes: -> (attr) { attr.strip } |
|
Brilliant!!! |
|
I like @p8's suggestion to extend the attributes API. This feature looks awesome, but I'm hesitant to grow the ActiveRecord/Model API surface area. I'm most concerned about the growing potential for name conflicts and confusion that comes from a larger API surface area. If there's a reason extending |
|
Very interesting! I think the API needs more tweaking, but I don't know exactly what that would look like yet. In the past I've wanted this on params since the permitting is also a form of sanitizing/normalizing/canonicalizing. E.g. |
|
@kaspth For reference, the gem dry-rails offers a DSL (backed by dry-schema) to do something similar. I know that there's a lot of things on those gems, but maybe we can pull some inspiration from there. |
|
We do something somewhat similar when coercing types. user = User.new(admin: 'F')
user.admin? # false
User.where(admin: 'F') # SELECT `users`.* FROM `users` WHERE `users`.`admin` = FALSE |
|
In the past I used the AR Attribute API to normalize an attribute value (an example gist could be found here). It would be great if the same normalization is applied to # app/attribute_types/phone_number_type.rb
class PhoneNumberType < ActiveRecord::Type::String
def cast(value)
value.present? ? Phonelib.parse(value.to_s).to_s : super # https://github.com/daddyz/phonelib
end
def serialize(value)
value.present? ? Phonelib.parse(value.to_s).to_s : super
end
end
# app/models/user.rb
class User < ApplicationRecord
attribute :phone_number, PhoneNumberType.new
end
# Somewhere in the app:
User.new(phone_number: "+13478941111").phone_number # => "+13478941111"
User.new(phone_number: "+1(347)894-1111").phone_number # => "+13478941111"
User.new(phone_number: "+1 347.894.1111").phone_number # => "+13478941111"
user = User.create!(phone_number: "+1347.894.1111")
user.phone_number # => "+13478941111"
User.find_by(phone_number: "+1(347)894-1111").phone_number # => "+13478941111"
User.where(phone_number: "+1 347.894.1111").first.phone_number # => "+13478941111" |
palkan
left a comment
There was a problem hiding this comment.
I like the feature itself (have been doing similar things via Attribute API for a long time, more below).
However, the amount of internal changes and new abstractions/entities added to the codebase doesn't correlate with my gut feeling (and previous experience).
As far as I can see, most of the new code tries to solve the problem of double-casting. We introduce new (internal) API methods (#value_was_cast), entities (CastResult), callbacks.
And I doubt we need all this complexity.
Because normalizations may be applied multiple times, they should be idempotent
Shouldn't normalization be idempotent itself? Like in the example: email.strip.downcase. That the most popular use-case.
I think, non-idempotent and expensive normalizations (or conversions) should be out of scope of the model attributes; they belong to another abstraction layer (handling user/external data, probably). It's not necessary to put everything into a model.
And we already have normalization—handling boolean values. I can't see a reason why this user-defined normalization shouldn't behave the same 🤷🏻♂️
Here is an example of achieving it w/o any changes to the internals: https://gist.github.com/palkan/b8b6e8b310030acc6d4660fc5de8c790
P.S. The double-casting problem worth exploring on its own, but, IMO, it's out-of-scope for this feature.
I would very much like to, but the type decoration mechanism for Active Record attributes does not exist for Active Model attributes. My past efforts to port code from Active Record to Active Model, or to refactor code with that goal in mind, haven't met with much success. Perhaps in the future, though!
No, it should work during assignment and construction. See this test, for example. |
❤️
Yes, I wanted a clear distinction. Also, for the natural language reason I mentioned earlier.
I would prefer to not limit the API to strings only. The value passed to the normalizer is a cast value, so you can call whatever methods you need on it. If you only need to call one method, there is a shorthand: normalizes :first_name, :last_name, with: :squish(:point_up: Another reason to use a
Yes, that's an interesting idea. |
I did consider this, but all the params APIs that I came up with felt more cumbersome. I think, in part, because params can be any shape. Whereas a model API can focus on attributes. Additionally, it allows the normalizations to be applied / re-used outside of the request cycle. Also, it allows normalizations to use type-cast values. |
Yes, that would be possible!: class User < ActiveRecord::Base
normalizes :phone_number, with: -> (string) { string.present? ? Phonelib.parse(string).to_s : string }
endOr, maybe just: class User < ActiveRecord::Base
normalizes :phone_number, with: -> (string) { Phonelib.parse(string).to_s }
endAnd |
The bulk of the code is I put I felt that solving double-casting was important because I did not want users to have to consider performance when declaring a normalization. See Yuki's use case, for example. Is
I am not sure what you are asking. Maybe it's a difference in terminology? Here I am using the term "normalization" to refer to the value of the
Non-idempotent normalizations are not "supported", but they are not prohibited either. (Prohibiting them would likely be impossible.) With regards to expensive normalizations, I disagree. Again, see Yuki's use case, for example. |
Your definition of "normalization" is "do whatever you want"; technically, we can't avoid it; but semantically it sounds to me as "do some minor transformation", like, remove whitespaces or normalize a phone number (that's fast, for sure; memoizing wouldn't be noticeable).
In Ruby, we can not prohibit anything. In Rails, we can demonstrate best practices though. And not adding hidden expensive calculations to models is one such.
Considering performance lead to better software design. Isn't that something we want for Rails users? |
I like it. This is the Rails way. Having a parity with validations API ❤️ Yeah, the values are not only strings; but we can tell the same about validations. Maybe it worth starting with the |
I should have worded that more clearly. What I meant to say was: I don't want users to think twice about using
At its core, all a validation does is check that a value is a member of a set. That set might be described by a range or a regular expression, but the concept is the same. Such logic can be easily configured with data structures, as seen in the validations API. But, even so, we can still specify validations with arbitrary logic using Normalizations, on the other hand, can be more diverse. A normalization like
Normalizations are no more hidden than validations or callbacks. And, as with both validations and callbacks, we may encourage writing fast code, but the user has the freedom to trade performance for correctness when necessary. |
1ffbcbd to
ea2e4e1
Compare
|
I've made a few small changes. Most notably, you can now specify a normalization as an array of symbols: class User < ActiveRecord::Base
normalizes :email, with: [:strip, :downcase]
end |
|
@jonathanhefner One thing to consider. I was trying to test this new feature and as first I did something like this: And it declared class without errors. But the normalization wasn't working. It took me a moment to realize that I have writen |
|
What if normalization fails? What if a given value doesn't have expected type yet normalization is applied with unexpected results? What if you do not want to normalize in various contexts? In general, applying normalization outside of validation process is not a good idea. I think this feature needs more thinking and was merged too soon. |
|
@solnic it is actually not so bad. It is almost shortcut for simplified inline What's the reason to make it this special? Wouldn't make sense to enhance |
|
I’m also wondering if it makes sense to move the non-database logic to Active Model. edit: Sorry, this was already discussed extensively. |
|
@simi it is actually that bad 🙂 it adds a new dimension of complexity to ActiveRecord and I bet it will become more complicated over time with stuff like "if/unless" options etc. AR models shouldn't be bothered with this type of logic and it could be a great opportunity to think about how to enhance strong_parameters instead. |
One can disagree with the feature as written, but looking at the history of this PR, I don't think it can be said it was merged too soon.
Adding features adds complexity always. If this was a valid argument to reject stuff, software would stay frozen at version 1.0. Given one particular change, you decide whether you want it or not, acknowledging the implications, which are always trade-offs.
Using your argument, that would add complexity to strong parameters! Complexity is not the point. I honestly think there are different ways to look at this feature. You would have written it in a different way, and that is totally fine. The point is at which boundary you want to act? There are several. For you, much higher. Totally legit! In this design, in the central path to persistence. Yet another point of view would have written a database trigger so that no matter how is this field set, you get it normalized 100% guaranteed. |
Yes but it's incomparable to ActiveRecord. Possible codepaths multiple much more significantly when a feature like this is used in AR vs strong_params.
It's 100% the point. AR is known to cause so much trouble in so many Rails codebases and this makes it worse.
It's just more sensible to handle normalization outside of persistence. It's something that's context-specific, sometimes you need it, sometimes you don't. |
Rails Contributors needs to apply Unicode normalization to people's names. No exceptions. It wouldn't be a good design to identify all external entry points, and make sure all of them normalize. In this use case, you want to funnel everything down to one central normalization point, no matter the origin of the data, be it a UI, a batch process, or a test fixture. Features are designed for certain use cases. If your application has a concept of normalization that depends on context, you just don't implement it this way. That is all. If your framework had a normalization feature more decoupled and at a higher level, I would not use it for Rails Contributors because I need to dig it as deep in the edit chain as possible. To me, this feature is perfectly good for its use cases. |
🤔 And how do you benefit from the |
That normalization was written some 10 years ago, it is a simple wrapper around |
I'm actually referring to the implementation details. Currently it is shortcut to configure If I understand it well, you're referring to this https://github.com/rails/rails-contributors/blob/f5fc1ed3fd42867b3daa20df3f64801e47f31635/lib/nfc_attribute_normalizer.rb#L3. That's "normalizing" values only on assignment. That is exactly what I would expect in here as well. |
|
Does normalization run before validation? |
|
@collimarco yeah it runs when the attribute is set. @szymonlipka I agree it's subtle. But it's tricky to fix, because technically this is a valid thing to do: One idea would be to log a warning if the diff --git a/activerecord/lib/active_record/normalization.rb b/activerecord/lib/active_record/normalization.rb
index c77630f1bd..4c8ae10585 100644
--- a/activerecord/lib/active_record/normalization.rb
+++ b/activerecord/lib/active_record/normalization.rb
@@ -94,6 +94,10 @@ def normalizes(*names, with:, apply_to_nil: false)
# User.normalize(:email, " CRUISE-CONTROL@EXAMPLE.COM\n")
# # => "cruise-control@example.com"
def normalize(name, value)
+ unless self.normalized_attributes.include?(name.to_sym)
+ Rails.logger.warn "#{name} is not a normalized attribute. Did you mean to use `normalizes`?"
+ end
+
type_for_attribute(name).cast(value)
end
end@jonathanhefner do you have any thoughts on this? Now I write it out, we could even raise in this situation, since I'm not sure why you'd use |
The reason I omitted that kind of error checking from # Cat(id: integer, name: string, breed: string, lives: integer)
class Cat < ActiveRecord::Base
normalizes :breed, with: -> { _1.upcase }
end
def normalize_cat_params(cat_params)
cat_params.to_h { |name, value| [name, Cat.normalize(name, value)] }
end
normalize_cat_params({ name: "Stripes", breed: "Tabby", lives: "7" })
# => { name: "Stripes", breed: "TABBY", lives: 7 }If we added the proposed error checking to But, if there is confusion, we can rename |
|
Exciting feature! I am curious about any current thoughts as to supporting for use in non-ActiveRecord I know that @jonathanhefner addressed this in this comment saying:
But that was a year ago, so it kind of is the future, and with this closer to actually dropping into a Rails release, I was curious if there were any updated thoughts to share. It doesn't seem like it should be too hard to make ActiveModel::Attributes |
Organisations with unnecessary whitespace don't sort in an expected way. Before fixing exisiting records with this problem, this commit prevents any new occurrences. Rails 7.1 introduces ActiveRecord::Base::normalizes[1] which could be used to simplify this code. That hasn't yet been released and it seems a bit unconventional within GDS to depend on edge rails, but when 7.1 is out, we should be able to adopt that pattern. [1] rails/rails#43945
Organisations with unnecessary whitespace don't sort in an expected way. Before fixing exisiting records with this problem, this commit prevents any new occurrences. Rails 7.1 introduces ActiveRecord::Base::normalizes[1] which could be used to simplify this code. That hasn't yet been released and it seems a bit unconventional within GDS to depend on edge rails, but when 7.1 is out, we should be able to adopt that pattern. [1] rails/rails#43945
This method can be used to declare normalizations for attribute values. Normalizations are applied when attributes are assigned or updated, and the normalized values will be persisted to the database. Normalizations are also applied to matching keyword arguments of finder methods. This allows a record to be created and later queried using an unnormalized value. For example: