Sometimes you may want to standardize data before saving it to the database, for example by downcasing emails or removing leading and trailing whitespace. Rails 7.1 adds the ActiveRecord::Base.normalizes API, which lets you normalize attribute values to a common format before they are saved. This can improve the integrity and consistency of your data and make it simpler to query records by attribute value.

Before Rails 7.1

  • Normalizing in a before_validation or before_save callback

    class User < ActiveRecord::Base
      before_validation do
        self.name = name.downcase.titleize # normalize_name
        self.email = email.strip.downcase # normalize_email
      end
    
      # OR (use one callback or the other, not both)
    
      before_save do
        self.name = name.downcase.titleize # normalize_name
        self.email = email.strip.downcase # normalize_email
      end
    end
    
  • Normalizing by overriding the attribute writer

    class User < ActiveRecord::Base
      def name=(val)
        # Call super to write the attribute; self.name = ... here would recurse forever.
        super(val.downcase.titleize)
      end
    
      def email=(val)
        super(val.strip.downcase)
      end
    end
    

Rails 7.1 onwards

class User < ActiveRecord::Base
  normalizes :name, with: -> name { name.downcase.titleize }
  normalizes :email, with: -> email { email.downcase.strip }
end

3.0.0 :001 > user = User.create(name: 'BOB', email: " BOB@EXAMPLE.COM\n")
  TRANSACTION (0.1ms)  begin transaction
  User Create (2.0ms)  INSERT INTO "users" ("name", "email") VALUES (?, ?)  [["name", "Bob"], ["email", "bob@example.com"]]
  TRANSACTION (1.9ms)  commit transaction
3.0.0 :002 > user.name
 => "Bob"
3.0.0 :003 > user.email
 => "bob@example.com"

By default, normalization is not applied to nil values. To change this, set the apply_to_nil option, which defaults to false, to true.

class User < ActiveRecord::Base
  normalizes :name, with: -> name { name&.downcase&.titleize || 'Untitled' }, apply_to_nil: true
end

3.0.0 :004 > user = User.create(name: nil)
  TRANSACTION (0.1ms)  begin transaction
  User Create (1.4ms)  INSERT INTO "users" ("name", "email") VALUES (?, ?)  [["name", "Untitled"], ["email", nil]]
  TRANSACTION (1.2ms)  commit transaction
3.0.0 :005 > user.name
 => "Untitled"
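
To make the nil handling concrete, here is a minimal plain-Ruby sketch of the behavior described above. It is a toy model, not how Rails implements normalizes internally: ToyNormalizer and its methods are hypothetical names, and capitalize stands in for titleize so that the snippet runs without ActiveSupport.

```ruby
# Toy model only: ToyNormalizer is a hypothetical class, not part of Rails.
class ToyNormalizer
  def initialize(apply_to_nil: false, &block)
    @apply_to_nil = apply_to_nil
    @block = block
  end

  # nil passes through untouched unless apply_to_nil is true;
  # any other value is handed to the normalization block.
  def call(value)
    return nil if value.nil? && !@apply_to_nil

    @block.call(value)
  end
end

# Default behavior: the block never sees nil, so it needs no nil guard.
skip_nil = ToyNormalizer.new { |name| name.strip.capitalize }

# With apply_to_nil: true the block receives nil and must handle it itself.
handle_nil = ToyNormalizer.new(apply_to_nil: true) do |name|
  name&.strip&.capitalize || "Untitled"
end

puts skip_nil.call("  bOB ").inspect
puts skip_nil.call(nil).inspect
puts handle_nil.call(nil).inspect
```

The design point this illustrates is that opting in to apply_to_nil moves responsibility for nil into your lambda, which is why the Rails example above uses the safe navigation operator.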

You can also pass multiple attributes to normalizes to apply the same normalization to all of them.

class User < ActiveRecord::Base
  normalizes :first_name, :last_name, :title, with: -> attribute { attribute.strip }
end

The normalization is also applied to the corresponding keyword argument of finder methods. This allows a record to be created and later queried using denormalized values.

3.0.0 :006 > user = User.find_by(email: "\tBOB@EXAMPLE.COM ")
  User Load (0.2ms)  SELECT "users".* FROM "users" WHERE "users"."email" = ? LIMIT ?  [["email", "bob@example.com"], ["LIMIT", 1]]
3.0.0 :007 > user.email
 => "bob@example.com"
3.0.0 :008 > User.exists?(email: "\tBOB@EXAMPLE.COM ")
  User Exists? (0.3ms)  SELECT 1 AS one FROM "users" WHERE "users"."email" = ? LIMIT ?  [["email", "bob@example.com"], ["LIMIT", 1]]
 => true

Normalization is not applied when the condition is passed as a raw SQL fragment.

3.0.0 :009 > User.exists?(["email = ?", "\tBOB@EXAMPLE.COM "])
  User Exists? (0.2ms)  SELECT 1 AS one FROM "users" WHERE (email = ' BOB@EXAMPLE.COM ') LIMIT ?  [["LIMIT", 1]]
 => false

These changes also introduced the normalize_value_for class method, which returns the normalized value for the given attribute.

3.0.0 :010 > User.normalize_value_for(:email, "\tBOB@EXAMPLE.COM ")
 => "bob@example.com"

How to apply normalization to existing records?

The normalization is applied when the attribute is assigned or updated. This means that if a record was persisted before the normalization was declared, its attribute will not be normalized until it is assigned a new value. To normalize such records, migrate them explicitly with Normalization#normalize_attribute.

  class User < ActiveRecord::Base
    normalizes :email, with: -> email { email&.downcase&.strip }
  end

  User.find_each do |legacy_user|
    # legacy_user.email # => " BOB@EXAMPLE.COM\n"
    legacy_user.normalize_attribute(:email)
    # legacy_user.email # => "bob@example.com"
    legacy_user.save
  end

Note: The normalization may be applied multiple times, so it should be idempotent. In other words, applying the normalization more than once should have the same result as applying it only once.
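
One way to check this property is to run a normalization lambda twice over a few sample values and compare the results. The sketch below is plain Ruby and does not touch Rails; idempotent? is a hypothetical helper written for this illustration, and the sample values are made up.

```ruby
# Hypothetical helper: true when applying the normalizer a second time
# changes nothing for every sample value.
def idempotent?(normalizer, samples)
  samples.all? do |value|
    once = normalizer.call(value)
    normalizer.call(once) == once
  end
end

# Safe: stripping and downcasing an already normalized email is a no-op.
strip_downcase = ->(email) { email.strip.downcase }

# Unsafe: every application adds another prefix.
add_prefix = ->(name) { "Dr. #{name}" }

puts idempotent?(strip_downcase, [" BOB@EXAMPLE.COM\n", "alice@example.com"])
puts idempotent?(add_prefix, ["Bob"])
```

A check like this fits naturally in a model test, so that a normalization that keeps changing its own output is caught before it reaches production data.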

References