We often come across use cases in a Rails application where we need to create, update, or delete records in bulk. Most ORMs naively update or insert each record individually by executing N+1 database queries. To address this problem, Rails 6 added the upsert_all method that provides a faster way to update and insert a large number of records.

upsert_all uses a single SQL INSERT statement for updating/inserting the records. And it does so without instantiating any models. Nor does it trigger any Active Record callbacks or validations !

How upsert_all works?

If we check the source code of upsert_all, we understand that it calls execute method of InsertAll under the hood which later updates or inserts (upserts) multiple records into the database using a single SQL INSERT statement.

For Postgres and SQlite3 upsert_all utilizes ON CONFLICT clause, whereas for MySQL it uses ON DUPLICATE KEY clause.

Check implementation of build_insert_sql method under postgresql_adapter.rb, sqlite3_adapter.rb and abstract_mysql_adapter.rb to get further details.

In this article, we will first see the available options with upsert_all in Rails 6, and then the newly introduced ones in Rails 7.

Available options for upsert_all in Rails 6

:returning

(PostgreSQL only) An array of attributes to return for all successfully inserted records, which by default is the primary key. Pass returning: %w[ id name ] for both id and name or returning: false to omit the underlying returning SQL clause entirely.

You can also pass an SQL string if you need more control on the return values (for example, returning: "id, name as new_name").

:unique_by

(PostgreSQL and SQLite only) By default rows are considered to be unique by every unique index on the table. Any duplicate rows are skipped. To skip rows according to just one unique index pass :unique_by.

Consider a Book model where no duplicate ISBNs make sense, but if any row has an existing id, or is not unique by another unique index, ActiveRecord::RecordNotUnique is raised.

Unique indexes can be identified by columns or name:

unique_by: :isbn
unique_by: %i[ author_id name ]
unique_by: :index_books_on_isbn

New options added in Rails 7

:update_only

This option is added with the intent to make the conflict resolutions easy during upsert_all. It accepts a list of column names that requires to be updated in case of conflict. If no column is provided, upsert_all will update all the columns that can be updated.

# There is a conflict in the first and the third record. As only price is passed 
# to "update_only", only price will be updated and the author would remain
# the same i.e 'Icode' 

Book.upsert_all(
  [
    { name: 'Ruby for beginners', price: 150, author: 'Icode' },
    { name: 'Well-Grounded Rubyist', price: 200, author: 'David A' },
    { name: 'Ruby for beginners', price: 300, author: 'Icode Academy' },
  ],
  unique_by: :name,
  update_only: [:price]
)


Book.where(name: 'Ruby for beginners')
=> #<ActiveRecord::Relation [#<Book id: 1, name: "Ruby for beginners", price: 300, author: "Icode">]>

:on_duplicate

Even this option is added with the same intent of making conflict resolutions easy during upsert_all. It can not be used along with :update_only option. While using this option, one must provide all the columns that need to be updated.

# There is a conflict in the first and the third record. As only price is passed
# to "on_duplicate" as a raw SQL query, just price will be updated and
# the author would remain the same i.e 'Icode'

Book.upsert_all(
  [
    { name: 'Ruby for beginners', price: 150, author: 'Icode' },
    { name: 'Well-Grounded Rubyist', price: 200, author: 'David A' },
    { name: 'Ruby for beginners', price: 300, author: 'Icode Academy' },
  ],
  unique_by: :name,
  on_duplicate: Arel.sql("price = LEAST(books.price, EXCLUDED.price)")
)

Book.where(name: 'Ruby for beginners')
=> #<ActiveRecord::Relation [#<Book id: 1, name: "Ruby for beginners", price: 150, author: "Icode">]>

:record_timestamps

By default, the automatic setting of timestamp columns is controlled by the model's record_timestamps config. To override this and force the automatic setting of timestamp columns, the :record_timestamps option can be used.

When record_timestamps is set to false, value for created_at and updated_at are not recorded.

Book.upsert_all(
  [
    { id: 1, name: 'Ruby for beginners', price: 1500 },
    { id: 2, name: 'The Well-Grounded Rubyist', price: 2000 },
  ],
  unique_by: :name,
  record_timestamps: false
)

When record_timestamps is set to true, value for created_at and updated_at are recorded. And the value is set based on the value of CURRENT_TIMESTAMP

Book.upsert_all(
  [
    { id: 1, name: 'Ruby for beginners', price: 1500 },
    { id: 2, name: 'The Well-Grounded Rubyist', price: 2000 },
  ],
  unique_by: :name,
  record_timestamps: true
)

References