has_few :god_objects

Simplify your model code in Rails by avoiding association pollution

In Rails applications, there are often two god objects: the primary business object and User. One way of keeping down the complexity and fan-out is to avoid adding unnecessary bidirectional relationships: those where both a belongs_to and a has_many or has_one relationship are defined. These has_many and has_one associations attract behavior, which pushes objects that could otherwise have few responsibilities toward becoming god objects. You can keep your Rails app free of god objects and a delight to work with by avoiding bidirectional relationships. I’ll show you how.

We can achieve better object design and smaller interfaces by defining associations in one direction only.

Before working on the project where I experimented with Sandi Metz’ rules, I had a habit of adding relationships in both directions whenever a new model was introduced. I’ll use User as an example here because most applications have this object, but the concept of association pollution applies to all objects.

The Code

Let’s look at a bidirectional ActiveRecord relationship. Pretend we’re making a long-form blogging app named “long-ium”. Each page already keeps track of its view counts. We are tasked with ensuring the views are unique and want to start tracking the user that viewed the page. We add some code like this to our Pageview model:

class Pageview < ApplicationRecord
  belongs_to :page
  belongs_to :user # <== Just added
end

While you’re adding that, you might be tempted to let User know about its newfound child:

class User < ApplicationRecord
  has_many :pageviews
end

This new has_many association allows us to check whether a specific user has read a specific long-ium article with: current_user.pageviews.where(page: page). In my experience this helper isn’t much better than writing Pageview.where(page: page, user: current_user), and it will rarely be used.

A user can exist without a pageview, but a Pageview is meaningless without both its user and its page. The name of the association macro hints at this: belongs_to correctly implies that Pageview is dependent on User and Page.

What’s the big problem?

If User not needing to know about a Pageview association were the extent of the problem, this post wouldn’t be worth writing. The real problem, in my experience, is that associations attract behavior.

Having has_many :pageviews means that there may be instance methods on User that deal with pageviews, steps in factory or seed generation to create pageviews along with users, and other behaviors that contribute to User becoming a god object.

A solution: leave out has_many!

Pageview can easily wrap the per-user query in a scope of its own, so that the interface becomes Pageview.for_user(user).

class Pageview < ApplicationRecord
  belongs_to :user

  def self.for_user(user)
    where(user: user)
  end
end

Should the query become more complex than a single line as business needs change, a query object or database view can be used as well. Consider this potential example, where complexity is neatly hidden from both User and Pageview, and we have not changed the public interface.

class Pageview < ApplicationRecord
  belongs_to :user

  def self.for_user(user)
    PageviewsForNewActiveUsers.for(user)
  end
end

class PageviewsForNewActiveUsers
  def self.for(user)
    # long-winded, complex logic that is now isolated
  end
end
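To give a feel for what might live inside such a query object, here is a minimal plain-Ruby sketch. Everything specific in it is an assumption made for illustration: the 30-day "new user" rule, the signed_up_on attribute, and the pageviews: and today: keyword arguments are hypothetical, and the Struct definitions stand in for the real ActiveRecord models so the example runs without a database.

```ruby
require "date"

# Plain-Ruby stand-ins for the models (assumption: the real app uses ActiveRecord).
User = Struct.new(:id, :signed_up_on)
Pageview = Struct.new(:user_id, :page_id)

# Query object: the business rule lives here, not on User or Pageview.
class PageviewsForNewActiveUsers
  NEW_USER_WINDOW_DAYS = 30 # hypothetical business rule

  # Class-level entry point so callers can write PageviewsForNewActiveUsers.for(user, ...).
  def self.for(user, pageviews:, today: Date.today)
    new(pageviews, today).for(user)
  end

  def initialize(pageviews, today)
    @pageviews = pageviews
    @today = today
  end

  # Returns the user's pageviews, or an empty list if the user is no longer "new".
  def for(user)
    return [] unless new_user?(user)

    @pageviews.select { |pageview| pageview.user_id == user.id }
  end

  private

  def new_user?(user)
    (@today - user.signed_up_on) <= NEW_USER_WINDOW_DAYS
  end
end
```

In a Rails app the array passed as pageviews: would presumably be replaced by a relation built inside the query object, which keeps the caller-facing interface at a single argument.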

I have found that, while it won’t always be the case, many associations on User are unnecessary when User does not hold the foreign key.

Performance Concerns

There is one particularly good reason to implement inverse associations (has_many, has_one): performance. Nate Berkopec, one of the most knowledgeable people on Rails performance, cautions against using where or scopes (whether defined with the scope DSL or as class methods, like for_user above) when rendering collections. When you’re iterating over a collection, includes can only preload a proper association, so you’ll want to avoid code that calls these methods inside nested loops.

<% Post.where(user: user).each do |post| %>
  <%= post.comments.active %>
  <!-- or -->
  <% post.comments.where(deleted_at: nil).each do |comment| %>
    <%= comment.body %>
  <% end %>
<% end %>

Rather, define an association on Post such as has_many :active_comments, -> { where(deleted_at: nil) }, class_name: "Comment", which can both be passed to includes and referenced directly in your nested loop.

<% Post.where(user: user).includes(:active_comments).each do |post| %>
  <% post.active_comments.each do |comment| %>
    <%= comment.body %>
  <% end %>
<% end %>

Note that it’s fine to continue to use Post.where here and avoid User.has_many :posts; it’s only necessary to define the inverse has_many association for the nested loop. Nate’s article goes into more detail about this.
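To make the cost difference concrete, here is a toy model in plain Ruby. It is not ActiveRecord: CountingTable and both helper methods are hypothetical names invented for this sketch, and a "query" is simply a call to where on an in-memory array. It only illustrates the shape of the problem, namely that querying inside the loop costs one comments query per post, while loading everything up front costs one.

```ruby
# Toy stand-in for a database table that counts how many queries it receives.
# Plain Ruby for illustration only; the real app would use ActiveRecord.
class CountingTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # One call equals one "query". An Array value acts like SQL's IN (...).
  def where(conditions)
    @query_count += 1
    @rows.select do |row|
      conditions.all? do |column, value|
        value.is_a?(Array) ? value.include?(row[column]) : row[column] == value
      end
    end
  end
end

# Querying inside the loop: one comments query per post (the N+1 shape).
def comment_bodies_querying_in_loop(posts, comments, user_id)
  posts.where(user_id: user_id).flat_map do |post|
    comments.where(post_id: post[:id], deleted_at: nil).map { |c| c[:body] }
  end
end

# Preloading: fetch every needed comment in one query, then group in memory.
def comment_bodies_preloaded(posts, comments, user_id)
  user_posts = posts.where(user_id: user_id)
  active = comments.where(post_id: user_posts.map { |p| p[:id] }, deleted_at: nil)
  by_post = active.group_by { |c| c[:post_id] }
  user_posts.flat_map { |post| by_post.fetch(post[:id], []).map { |c| c[:body] } }
end
```

Both methods return the same comment bodies. The difference shows up in query_count on the comments table: it grows with the number of posts in the first version and stays at one in the second, which is roughly the saving that includes with a real association is meant to give you.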

Wrapping up

Many associations that are defined in two directions are extraneous. Defining the association only on the model that stores the foreign key can help to reduce “association pollution” in god objects that tend to attract behavior.

By keeping associations off of these models, I am also able to avoid the temptation to add further behavior to their classes. This leaves me with a simpler, easier-to-understand model.