Touch me: cleaning up Rails association code

Yeah, this is one of those posts where I stop wittering on about what I’ve seen at the theatre or in the cinema, and talk about computer code I’ve been writing. Odd mix for a blog, I know: welcome to my world…

In a Rails project I’m currently working on, I have a parent object that has a has_many collection of children. In this case, they’re all time-based, so let’s call the parent class Calendar and the child association Event:

class Calendar < ActiveRecord::Base
  has_many :events
end

class Event < ActiveRecord::Base
  belongs_to :calendar
end

(As you may have guessed, these aren’t the real class names. Confidentiality, and all that.)

Many operations on our Calendar object rely on the overall date range between the first and last dates in the list of events. Now we can, if we want to, calculate those on the fly using associations:

class Calendar < ActiveRecord::Base
  has_many :events

  # Earliest start date across all events (nil if there are no events yet)
  def first_date
    events.order('starts_on ASC').first&.starts_on
  end
end

…and so on. Again, we have quite a few of these methods, and they’re actually a bit more complicated than this.

For one-off operations, deriving attributes in this way isn’t bad at all. In any situation where we’re looping through loads of Calendar objects, or need to filter based on the value of first_date, though, it’s very easy to write code that fires off at least one database query per Calendar, reducing some pages to a slow crawl.
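To make that cost concrete, here’s a plain-Ruby sketch of the looping problem. It doesn’t use ActiveRecord at all: FakeDatabase, its query counter and ToyCalendar are stand-ins I’ve made up for illustration, and the only point is to show the queries piling up when an attribute is derived on every call.

```ruby
require 'date'

# Hypothetical stand-in for a database: it just counts how often it is queried.
class FakeDatabase
  attr_reader :query_count

  def initialize(events_by_calendar)
    @events_by_calendar = events_by_calendar
    @query_count = 0
  end

  # Pretends to run "SELECT MIN(starts_on) FROM events WHERE calendar_id = ?"
  def earliest_start(calendar_id)
    @query_count += 1
    @events_by_calendar.fetch(calendar_id, []).min
  end
end

# Toy calendar whose first_date is derived afresh on every call.
ToyCalendar = Struct.new(:id, :db) do
  def first_date
    db.earliest_start(id)
  end
end

event_dates = {
  1 => [Date.new(2014, 3, 1), Date.new(2014, 1, 15)],
  2 => [Date.new(2014, 2, 10)],
  3 => [] # a calendar with no events yet
}
db = FakeDatabase.new(event_dates)
calendars = event_dates.keys.map { |id| ToyCalendar.new(id, db) }

# Filtering costs one query per calendar; sorting the survivors costs one more each.
with_dates = calendars.select(&:first_date)
sorted = with_dates.sort_by(&:first_date)

puts "Sorted calendar ids: #{sorted.map(&:id).inspect}"
puts "Queries issued: #{db.query_count}"
```

Three calendars is nothing, of course, but the query count grows with every extra calendar on the page, and with every extra derived attribute you filter or sort by.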

While it’s possible to reduce the number of queries to the database by judicious use of joins, eager loading and other ActiveRecord features, there comes a time when it’s just easier to calculate the values whenever they’re likely to change and store the value in the parent Calendar object. That way, filtering and sorting becomes a doddle.

Rails provides one type of caching attribute with the :counter_cache option of the belongs_to definition. Implement this, and every time you add or remove a child object, the parent object’s #{table_name}_count attribute gets amended.

However, for our purposes that’s not helping us, as our calculations are dependent on the contents of our child objects, not just their quantity. While I could use an after_save callback in my child objects to then call a recalculation method in the parent object, it turns out there’s a simpler way which doesn’t get much recognition: the :touch option of the belongs_to definition.

If you specify a true value to :touch, the parent object’s updated_at timestamp gets updated whenever a child object is saved or destroyed. You can also pass a symbol to specify an additional timestamp column that only records when the association has been changed (we could call it events_updated_at in this case). If you do specify a symbol, bear in mind that both your custom column and updated_at will be updated.

However, the best thing for me about this method is that the parent object has an after_touch callback method that will then run once the timestamps have been updated.

So, assuming we’ve added a first_date column to the Calendar object, our code example becomes:

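class Calendar < ActiveRecord::Base
  has_many :events

  after_touch :recalculate_cached_columns

  # Refresh the cached date range from the current set of events.
  def recalculate_cached_columns
    # update replaces update_attributes, which was removed in Rails 6.1
    update(
      # &. leaves the column nil when the calendar has no events
      first_date: events.order('starts_on ASC').first&.starts_on,
      last_date:  events.order('ends_on DESC').first&.ends_on,
      # ...other columns
    )
  end
end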

class Event < ActiveRecord::Base
  # Saving or destroying an event touches its calendar, firing after_touch
  belongs_to :calendar, touch: true
end
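If that chain of events feels a bit magic, here’s a plain-Ruby toy model of it. To be clear, this is not how Rails implements :touch internally, and ToyParent and ToyChild are hypothetical names of my own; it’s just a sketch of the sequence: child saved, parent timestamp bumped, parent’s after_touch hook run, cached value refreshed.

```ruby
require 'date'

# Toy stand-in for the parent model (hypothetical, no ActiveRecord involved).
class ToyParent
  attr_reader :updated_at, :first_date, :children

  def initialize
    @children = []
    @updated_at = nil
    @first_date = nil
  end

  # Mimics touch: bump the timestamp, then run the after_touch hook.
  def touch(now = Time.now)
    @updated_at = now
    after_touch
  end

  private

  # Stand-in for an after_touch callback that refreshes a cached value.
  def after_touch
    @first_date = children.map(&:starts_on).min
  end
end

# Toy stand-in for a child declared with `belongs_to ..., touch: true`.
class ToyChild
  attr_reader :starts_on

  def initialize(parent, starts_on)
    @parent = parent
    @starts_on = starts_on
  end

  # Mimics saving: register with the parent, then touch it.
  def save
    @parent.children << self unless @parent.children.include?(self)
    @parent.touch
    true
  end
end

parent = ToyParent.new
ToyChild.new(parent, Date.new(2014, 5, 2)).save
ToyChild.new(parent, Date.new(2014, 4, 20)).save

puts "Cached first_date: #{parent.first_date}"
```

The thing to notice is that nothing ever asks the parent to recalculate directly. Saving a child is enough, which is exactly why the real version needs so little code.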

We end up doing multiple SELECT and UPDATE statements on the database whenever a new event is added, but those actions are relatively rare, compared to searching – and that becomes much faster, because we’re now only searching by indexable attributes in a single model.

If those multiple database queries on update do become an issue (as they do in my real-world application, where the calculations are quite intensive), you can run that method asynchronously. For example, using Delayed::Job:

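class Calendar < ActiveRecord::Base
  has_many :events

  after_touch :recalculate_cached_columns

  # Refresh the cached date range from the current set of events.
  def recalculate_cached_columns
    # update replaces update_attributes, which was removed in Rails 6.1
    update(
      # &. leaves the column nil when the calendar has no events
      first_date: events.order('starts_on ASC').first&.starts_on,
      last_date:  events.order('ends_on DESC').first&.ends_on,
      # ...other columns
    )
  end

  # Must come after the method definition so Delayed::Job can wrap it
  handle_asynchronously :recalculate_cached_columns
end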

I’m really pleased with the organisation of the code in this new approach – I’ve got a significant performance gain without introducing lots of spaghetti code, and any complex queries I need are confined to one method.

Of course, this won’t be a suitable approach in all situations. Say your parent object has two child collections, and you want to use this approach for both: the after_touch callback would be called whenever an object in either collection A or collection B has been saved, and that behaviour might not suit you.

Published by

Scott Matthewman

Formerly Online Editor and Digital Project Manager for The Stage, creator of the award-winning The Gay Vote politics blog, now a full-time software developer specialising in Ruby, Objective-C and Swift, as well as a part-time critic for Musical Theatre Review, The Reviews Hub and others.