Sometimes you may want to standardize data before saving it to the database: for example, downcasing emails or removing leading and trailing whitespace. Rails 7.1 adds the ActiveRecord::Base::normalizes API, which allows you to normalize attribute values to a common format before saving them to the database. This can help improve the integrity and consistency of your data and make it simpler to query records based on attribute values.
Before Rails 7.1
- Normalizing it in a before_validation or before_save callback

class User < ActiveRecord::Base
  before_validation do
    self.name = name.downcase.titleize # normalize_name
    self.email = email.strip.downcase # normalize_email
  end

  # OR

  before_save do
    self.name = name.downcase.titleize # normalize_name
    self.email = email.strip.downcase # normalize_email
  end
end
- Normalizing by overriding the attribute writer

class User < ActiveRecord::Base
  # super calls the generated writer; self.name = ... here would recurse forever
  def name=(val)
    super(val.downcase.titleize)
  end

  def email=(val)
    super(val.strip.downcase)
  end
end
Rails 7.1 onwards
class User < ActiveRecord::Base
  normalizes :name, with: -> name { name.downcase.titleize }
  normalizes :email, with: -> email { email.downcase.strip }
end
3.0.0 :001 > user = User.create(name: 'BOB', email: " BOB@EXAMPLE.COM\n")
TRANSACTION (0.1ms) begin transaction
User Create (2.0ms) INSERT INTO "users" ("name", "email") VALUES (?, ?) [["name", "Bob"], ["email", "bob@example.com"]]
TRANSACTION (1.9ms) commit transaction
3.0.0 :002 > user.name
=> "Bob"
3.0.0 :003 > user.email
=> "bob@example.com"
By default, the normalization will not be applied to nil values. This behavior can be changed with the apply_to_nil option, which defaults to false.
class User < ActiveRecord::Base
  normalizes :name, with: -> name { name&.downcase&.titleize || 'Untitled' }, apply_to_nil: true
end
3.0.0 :004 > user = User.create(name: nil)
TRANSACTION (0.1ms) begin transaction
User Create (1.4ms) INSERT INTO "users" ("name", "email") VALUES (?, ?) [["name", "Untitled"], ["email", nil]]
TRANSACTION (1.2ms) commit transaction
3.0.0 :005 > user.name
=> "Untitled"
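The nil handling described here can be mimicked in plain Ruby. This is a minimal sketch, not Rails code: apply_normalizer is a hypothetical helper, and capitalize stands in for ActiveSupport's titleize so the snippet runs without Rails.

```ruby
# Hypothetical helper that mimics the documented nil handling; not Rails code.
def apply_normalizer(value, with:, apply_to_nil: false)
  # nil bypasses the normalizer unless the caller opts in.
  return value if value.nil? && !apply_to_nil

  with.call(value)
end

# capitalize is a plain-Ruby stand-in for titleize (an assumption for this sketch).
titlecase = ->(name) { name&.capitalize || "Untitled" }

p apply_normalizer("BOB", with: titlecase)                   # => "Bob"
p apply_normalizer(nil, with: titlecase)                     # => nil
p apply_normalizer(nil, with: titlecase, apply_to_nil: true) # => "Untitled"
```

The fallback branch of the lambda is only reachable once apply_to_nil is true, which is why the article's example sets the option explicitly.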
You can also pass multiple attributes with the normalizes method to apply normalization to all of them.
class User < ActiveRecord::Base
  normalizes :first_name, :last_name, :title, with: -> attribute { attribute.strip }
end
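To make the idea of one declaration covering several attributes concrete, here is a toy version of such a macro in plain Ruby. ToyNormalization and Person are hypothetical names, and ActiveRecord's real implementation is different; the sketch only illustrates wrapping several writers with the same lambda.

```ruby
# Toy stand-in for the `normalizes` macro; ActiveRecord's real implementation differs.
module ToyNormalization
  def normalizes(*names, with:)
    names.each do |name|
      # Define the writer so every assignment passes through `with`.
      define_method("#{name}=") do |value|
        # nil is passed through untouched, mirroring the default behavior.
        normalized = value.nil? ? value : with.call(value)
        instance_variable_set("@#{name}", normalized)
      end
    end
  end
end

class Person
  extend ToyNormalization
  attr_reader :first_name, :last_name, :title

  normalizes :first_name, :last_name, :title, with: ->(attribute) { attribute.strip }

  def initialize(**attrs)
    attrs.each { |name, value| public_send("#{name}=", value) }
  end
end

person = Person.new(first_name: "  Ada ", last_name: "Lovelace\n", title: nil)
p [person.first_name, person.last_name, person.title] # => ["Ada", "Lovelace", nil]
```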
The normalization is also applied to the corresponding keyword argument of finder methods. This allows a record to be created and later queried using denormalized values.
3.0.0 :006 > user = User.find_by(email: "\tBOB@EXAMPLE.COM ")
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."email" = ? LIMIT ? [["email", "bob@example.com"], ["LIMIT", 1]]
3.0.0 :007 > user.email
=> "bob@example.com"
3.0.0 :008 > User.exists?(email: "\tBOB@EXAMPLE.COM ")
User Exists? (0.3ms) SELECT 1 AS one FROM "users" WHERE "users"."email" = ? LIMIT ? [["email", "bob@example.com"], ["LIMIT", 1]]
=> true
Normalization is not applied when the value is passed as part of a raw SQL condition.
3.0.0 :009 > User.exists?(["email = ?", "\tBOB@EXAMPLE.COM "])
User Exists? (0.2ms) SELECT 1 AS one FROM "users" WHERE (email = ' BOB@EXAMPLE.COM ') LIMIT ? [["LIMIT", 1]]
=> false
These changes also introduced the normalize_value_for class method, which returns the normalized value for the given attribute.
3.0.0 :010 > User.normalize_value_for(:email, "\tBOB@EXAMPLE.COM ")
=> "bob@example.com"
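One place where normalize_value_for may be handy is a raw SQL condition, which skips normalization. The following is a sketch under assumptions: email_taken? is a hypothetical helper, and model is assumed to be an ActiveRecord class that declares normalizes :email, such as the User model in this article.

```ruby
# Hypothetical helper; `model` is assumed to be an ActiveRecord class that
# declares `normalizes :email` (such as the User model in this article).
def email_taken?(model, raw_email)
  # Normalize explicitly, because array conditions are not normalized for us.
  normalized = model.normalize_value_for(:email, raw_email)
  model.exists?(["email = ?", normalized])
end

# Usage inside a Rails 7.1+ app:
#   email_taken?(User, "\tBOB@EXAMPLE.COM ")
```

Passing the model in as an argument keeps the helper free of any hard dependency on a particular class, so it can be exercised with a stub in tests.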
How to apply normalization to existing records?
The normalization is applied when the attribute is assigned or updated. This means that if a record was persisted before the normalization was declared, the record's attribute will not be normalized until it is assigned a new value. To normalize existing records, migrate them explicitly with Normalization#normalize_attribute.
class User < ActiveRecord::Base
  normalizes :email, with: -> email { email&.downcase&.strip }
end
User.find_each do |legacy_user|
  # legacy_user.email # => " BOB@EXAMPLE.COM\n"
  legacy_user.normalize_attribute(:email)
  # legacy_user.email # => "bob@example.com"
  legacy_user.save
end
Note: The normalization may be applied multiple times, so it should be idempotent. In other words, applying the normalization more than once should have the same result as applying it only once.
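A quick way to check this property is to apply a normalizer twice to sample inputs and compare the results. The sketch below is plain Ruby; idempotent? is a hypothetical helper, and add_prefix is a made-up counterexample rather than anything from Rails.

```ruby
# Hypothetical helper: true when a second pass leaves every sample unchanged.
def idempotent?(normalizer, samples)
  samples.all? do |sample|
    once = normalizer.call(sample)
    normalizer.call(once) == once
  end
end

clean_email = ->(email) { email.downcase.strip }
add_prefix  = ->(phone) { "+1#{phone}" } # grows on every pass, so not idempotent

p idempotent?(clean_email, [" BOB@EXAMPLE.COM\n", "bob@example.com"]) # => true
p idempotent?(add_prefix, ["5551234"])                                # => false
```

A check like this only covers the samples it is given, so it can reveal a non-idempotent normalizer but cannot prove idempotence in general.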