Archive for the ‘Database’ Category

Double-Blind Test-Driven Development in Rails 3: Part 3

February 2, 2011

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

This is the last article in this series describing the concept of double-blind test-driven development. This style of testing can add time to development, but this can be cut significantly using RSpec matchers.

If you’re not familiar with matchers, they’re the helpers that give RSpec its English-like syntax, and they can be a powerful tool for speeding up all of your test-driven development – whether you follow the double-blind method or not.

If you’re using RSpec, you’re already using its built-in matchers. Say we have a Site model, and its url method takes the host attribute and prepends the ‘http://’ protocol. Here’s a likely test:

describe Site, 'url' do
  it "should begin with http://" do
    site = Site.new :host => 'example.com'
    site.url.should eql('http://example.com')
  end
end

The eql() method in the code above is the matcher. (Its cousin equal() checks object identity rather than value, so it would fail when comparing two separate strings.) You can pass a matcher to any of RSpec’s should or should_not methods, and it will magically work.

But the magic isn’t that hard, and you can harness it yourself for custom matchers that conform to your application.
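To show how little magic is involved, here is a minimal plain-Ruby sketch of the handshake between should and a matcher. The names EqlMatcher, TinyShould and tiny_should are made up for illustration, and this is not RSpec’s actual implementation; it only assumes that a matcher is an object that answers matches? and can describe its own failure.

```ruby
# Hypothetical matcher: any object that responds to matches? will do.
class EqlMatcher
  def initialize(expected)
    @expected = expected
  end

  def matches?(actual)
    @actual = actual
    @actual.eql?(@expected)
  end

  def failure_message_for_should
    "expected #{@expected.inspect}, got #{@actual.inspect}"
  end
end

# Hypothetical stand-in for RSpec's should (renamed to avoid any clash).
module TinyShould
  def tiny_should(matcher)
    raise matcher.failure_message_for_should unless matcher.matches?(self)
    true
  end
end

class String
  include TinyShould
end

# A passing expectation simply returns true.
puts 'http://example.com'.tiny_should(EqlMatcher.new('http://example.com'))

# A failing expectation raises with the matcher's own message.
begin
  'ftp://example.com'.tiny_should(EqlMatcher.new('http://example.com'))
rescue RuntimeError => e
  puts e.message
end
```

The receiver hands itself to the matcher, the matcher answers yes or no, and the failure message comes from the matcher rather than from the test.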

The Many Faces of Custom RSpec Matchers

While I don’t want this article to turn into a primer on custom RSpec matchers (it’s a little off-topic), I’ll give you the three styles of defining them, and explain my recommendations. There are simple matchers, the Matcher DSL, and full RSpec matcher classes.

Let’s start by writing a test we want to run:

it "should be at least 5" do
  6.should be_at_least(5)
end

This test should always pass, provided we’ve defined our matcher correctly. The first way to do this is the simple matcher:

def be_at_least(minimum)
  simple_matcher("at least #{minimum}"){|actual| actual >= minimum}
end

As you might guess, actual represents the object on the receiving end of “.should” – in this case the 6 in “6.should be_at_least(5)”. This version makes a lot of assumptions, including the auto-creation of generic pass and fail messages.

If you want a little more control, you can step up to RSpec’s Matcher DSL. This is the middle-of-the-road option for creating custom matchers:

RSpec::Matchers.define :be_at_least do |minimum|
  match do |actual|
    actual >= minimum
  end

  failure_message_for_should do |actual|
    "expected #{actual} to be at least #{minimum}"
  end

  failure_message_for_should_not do |actual|
    "expected #{actual} to be less than #{minimum}"
  end

  description do
    "be at least #{minimum}"
  end
end

Now we’re rocking custom failure messages, and test names. This is pretty cool, and honestly how I started out doing matchers. It’s also how I started out doing the matchers for double-blind testing.

The problem is that by skipping the creation of actual matcher classes, we lose the ability to do things like inheritance. Not a big deal if our matchers stay simple, but they won’t. Not if we use them as often as we should! I found myself re-defining the same helper methods in each matcher I defined this way.

So let’s see just how daunting a full-fledged custom matcher class really is:

module CustomMatcher  
  class BeAtLeast
    def initialize(minimum)  
      @minimum = minimum
    end  
  
    def matches?(actual)  
      @actual = actual
      @actual >= @minimum
    end  
  
    def failure_message_for_should  
      "expected #{@actual} to be at least #{@minimum}"  
    end  
  
    def failure_message_for_should_not  
      "expected #{@actual} to be less than #{@minimum}"  
    end  
  end  
  
  def be_at_least(expected)  
    BeAtLeast.new(expected)  
  end  
end  

This isn’t so bad! We’re defining a new class, but you can see it doesn’t have to inherit from anything, or use any unholy Ruby voodoo to work.

We just have to define four methods: initialize, matches? (which returns true or false), and the two failure message methods. Along the way, we set some instance variables so we can access the data when we need it. Finally, we define a method that creates a new instance of this class, and that’s what RSpec will rely on.

You can add as many other methods as these four will rely on. But you also get other benefits over the DSL. You can use inheritance, moving common methods up the chain so you only have to define them once, instead of in each matcher definition. You can also write setup/teardown code in your parent classes, make default arguments a breeze, and standardize any error handling. I do all of these in the matchers I created for the example app.
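As a rough illustration of the inheritance benefit, here is a plain-Ruby sketch with no RSpec required. ComparisonMatcher and BeAtMost are hypothetical names invented for this example (they are not the matchers from the example app); the point is only that the shared plumbing is written once in the parent.

```ruby
# Hypothetical parent class: shared state and failure messages live here.
class ComparisonMatcher
  def initialize(expected)
    @expected = expected
  end

  def matches?(actual)
    @actual = actual
    compare
  end

  def failure_message_for_should
    "expected #{@actual} to #{description}"
  end

  def failure_message_for_should_not
    "expected #{@actual} not to #{description}"
  end
end

# Each child only supplies the comparison and its description.
class BeAtLeast < ComparisonMatcher
  def compare
    @actual >= @expected
  end

  def description
    "be at least #{@expected}"
  end
end

class BeAtMost < ComparisonMatcher
  def compare
    @actual <= @expected
  end

  def description
    "be at most #{@expected}"
  end
end

matcher = BeAtLeast.new(5)
puts matcher.matches?(6)
puts matcher.matches?(3)
puts matcher.failure_message_for_should
```

Adding a third comparison matcher would now take two short methods instead of a full copy of the class.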

The bottom line is this: defining your own matcher classes directly really DRYs up your matchers, and that always makes life simpler. I think it’s the only way to go for serious and heavy RSpec users. It allows the class for my validate_presence_of matcher to be this short and sweet:

module DoubleBlindMatchers
  class ValidatePresenceOf < ValidationMatcher
    def default_options
      {:message => "can't be blank", :with => 'x'}
    end

    def match
      set_to @options[:with]
      @object.valid?
      check !@object.errors[@attribute].include?(@options[:message]), shouldnt_exist
      
      set_to nil
      check !@object.valid?, valid_when('nil')
      check @object.errors[@attribute].include?(@options[:message])
      
      set_to ""
      check !@object.valid?, valid_when("blank")
      check @object.errors[@attribute].include?(@options[:message])
    end
  end
  
  def validate_presence_of expected, options = {}
    ValidatePresenceOf.new expected, options
  end
end

And the Teacher model, which grew considerably during our double-blind testing, now looks like this (in its entirety):

# spec/models/teacher_spec.rb

require 'spec_helper'

describe Teacher do
  it {should have_many :subjects}
  
  it {should validate_presence_of :name}
  it {should validate_length_of :name, :maximum => 50, :message => "must be 50 characters or less"}
  
  it {should validate_presence_of :salary}
  it {should validate_numericality_of :salary, :within => (20_000..100_000), :message => "must be between $20K and $100K"}
end

Summary

Now that you’ve seen my entire proposal for double-blind testing, let me know what you think. Be cruel if you must, it’s the only way I’ll learn. I’ll do my best to explain (not defend) my reasoning, and keep an open mind to changes.

I’ll also be publishing my double-blind matchers as a gem so you can add them to your project.

Double-Blind Test-Driven Development in Rails 3: Part 2

February 1, 2011

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

The last article in this series defined the concept of double-blind test-driven development, but didn’t get much into real-world examples. In this article, we’ll explore several such examples.

The Example Application

This article includes a sample app that you can download using the link above. Be sure to check out the tag “double_blind_tests” to see the code as it appears in this article. The next article will have a lot of refactoring. I limited my samples to the model layer, where 100% coverage is a very realistic goal, and where double-blind testing is likely to bring the greatest benefit.

I chose a simple high school scheduling app with teachers, the subjects they teach, students, and courses. In this case, I’m defining a course as a student’s participation in a subject. Teachers teach (i.e., have) many subjects. Students take (have) many subjects, via courses. The course record contains that student’s grade for the given subject.

The database constraints are intentionally strict, and most of the validations in the models ensure that these constraints are respected in the application layer. We don’t want the user seeing an error page because of bad data. Depending on the application, that can be worse than actually having bad data creep in.

Associations

Here’s an example of a has_many association:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  it "has many subjects" do
    teacher = Factory.create :teacher
    teacher.subjects.should be_empty

    subject = teacher.subjects.create Factory.attributes_for(:subject)
    teacher.subjects.should include(subject)
  end
end

In order to factor out our own assumptions, we have to ask what they are. The assumption is that the subject we add to the teacher’s subject list works because of the has_many relationship. So we’ll first test that teacher.subjects is, in fact, empty when we assume it would be. Then we’re free to test that adding a subject works as we expect.

Here’s a belongs_to association:

# excerpt from spec/models/subject_spec.rb

describe Subject do
  it "belongs_to a teacher" do
    teacher = Factory.create :teacher

    subject = Subject.new
    subject.teacher.should be_nil
    
    subject.teacher = teacher
    subject.teacher.should == teacher
  end
end

Again, we’re challenging the assumption that the association is nil by default, by testing against it before verifying that we can add a teacher. This tests that this is a true belongs_to association, and not simply an instance method. This is the kind of thing that can and will change over the life of an application.

Validations

Let’s test validates_presence_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "name" do
    it "is present" do
      error_message = "can't be blank"
      
      teacher = Teacher.new :name => 'Joe Example'
      teacher.valid?
      teacher.errors[:name].should_not include(error_message)

      teacher.name = nil
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)

      teacher.name = ''
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)
    end
  end
end

This example was actually explained in detail in the last article. Validate that the error doesn’t already exist before trying to trigger it. Don’t just test the default value when you create a blank object, test the likely possibilities. Refactor the error message to DRY up the test and add readability. And finally, test by modifying the object you already created (as little as possible) rather than creating a new object from scratch for each part of the test.

A more complex version is needed to validate the presence of an association:

# excerpt from spec/models/subject_spec.rb

describe Subject do
  describe "teacher" do
    it "is present" do
      error_message = "can't be blank"

      teacher = Factory.create(:teacher)
      subject = Factory.create(:subject, :teacher => teacher)
      subject.valid?
      subject.errors[:teacher].should_not include(error_message)
    
      subject.teacher = nil
      subject.should_not be_valid
      subject.errors[:teacher].should include(error_message)
    end
  end
end

While the test is more complex, the code to satisfy it is not:

# excerpt from app/models/subject.rb

validates_presence_of :teacher

Testing validates_length_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "name" do
    it "is at most 50 characters" do
      error_message = "must be 50 characters or less"
      
      teacher = Teacher.new :name => 'x' * 50
      teacher.valid?
      teacher.errors[:name].should_not include(error_message)
      
      teacher.name += 'x'
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)
    end
  end
end

And here’s the model code that satisfies the test:

# excerpt from app/models/teacher.rb

validates_length_of :name, :maximum => 50, :message => "must be 50 characters or less"

While you can definitely start to see a pattern in validation testing, this introduces a new element. Instead of freshly setting the name attribute to be 51 characters long, we test the valid edge case first and then add *just* enough to make it invalid – one more character.

This does two things: it verifies that our edge case was as “edgy” as it could be, and it makes our test less brittle. If we wanted to change the test to allow up to 100 characters, we’d only have to modify the test name and the initial set value.
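The edge-case pattern can be tried outside Rails too. The following is a minimal sketch, assuming a hypothetical NamedThing class that mimics only the two things the test touches (valid? and an errors list); it is not ActiveRecord and the class is invented purely for illustration.

```ruby
# Hypothetical stand-in for a model with a 50-character name limit.
class NamedThing
  MAX_NAME_LENGTH = 50
  ERROR_MESSAGE = "must be 50 characters or less"

  attr_accessor :name
  attr_reader :errors

  def initialize(name)
    @name = name
    @errors = []
  end

  # Rebuilds the error list on every call, much as valid? does in Rails.
  def valid?
    @errors = []
    @errors << ERROR_MESSAGE if name.length > MAX_NAME_LENGTH
    @errors.empty?
  end
end

thing = NamedThing.new('x' * NamedThing::MAX_NAME_LENGTH)

# Step 1: the edge value must NOT trigger the error.
thing.valid?
raise "edge case should be valid" if thing.errors.include?(NamedThing::ERROR_MESSAGE)

# Step 2: nudge the same object one character past the edge.
thing.name += 'x'
raise "one extra character should be invalid" if thing.valid?
raise "expected the length error" unless thing.errors.include?(NamedThing::ERROR_MESSAGE)

puts "both halves of the edge-case check passed"
```

Because the second step is expressed relative to the first, changing the limit means touching one constant rather than two hard-coded lengths.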

Validating a number’s range using validates_numericality_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "salary" do
    it "is at or above $20K" do
      error_message = "must be between $20K and $100K"
      
      teacher = Teacher.new :salary => 20_000
      teacher.valid?
      teacher.errors[:salary].should_not include(error_message)

      teacher.salary -= 0.01
      teacher.should_not be_valid
      teacher.errors[:salary].should include(error_message)
    end

    it "is no more than $100K" do
      error_message = "must be between $20K and $100K"

      teacher = Teacher.new :salary => 100_000
      teacher.valid?
      teacher.errors[:salary].should_not include(error_message)
      
      teacher.salary += 0.01
      teacher.should_not be_valid
      teacher.errors[:salary].should include(error_message)
    end
  end
end

And here’s the code that satisfies the test:

# excerpt from app/models/teacher.rb

validates_numericality_of :salary, :message => "must be between $20K and $100K",
  :greater_than_or_equal_to => 20_000, :less_than_or_equal_to => 100_000

We’re doing the same here as in our testing of name’s length. We’re setting the edge value that’s *just* within the allowed range, then adding or subtracting a penny to make it invalid. I split up the top and bottom edge tests, because it’s better to test as atomically as possible – one limit per test.

Defaults

Another tricky database constraint to test for is a default value:

# excerpt from spec/models/course_spec.rb

describe Course do
  describe "grade_percentage" do
    it "defaults to 1.0" do
      course = Course.new :grade_percentage => nil
      course.grade_percentage.should be_nil
      
      course = Course.new :grade_percentage => ''
      course.grade_percentage.should be_blank
      
      course = Course.new :grade_percentage => 0.95
      course.grade_percentage.should == 0.95
      
      course = Course.new
      course.grade_percentage.should == 1.0
    end
  end
end

In this case, we can’t avoid having to recreate the model from scratch, because of the nature of the implementation. There’s no actual code in the model that makes this happen, it’s purely in the database schema. Why should we test it, then? Because we test any behavior we’re going to rely on in the application. The fact that this model behavior is implemented at the database level (and therefore, not purely TDD) is a small inconvenience.

What’s the assumption our double-blind test is verifying in this case? That the value is only set in the absence of other values being explicitly assigned. Testing with nil and blank values verifies that the default doesn’t override them – it only works in the complete absence of any assignment. I also test an arbitrary (but valid) value as the anti-assumption test before finally verifying that the default is set to the correct value.

Most default tests verify only that the correct default value is set – the double-blind version verifies that it’s acting only as a default value in all cases.
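The difference between “nothing was assigned” and “nil was assigned” is easy to see in plain Ruby. This sketch uses a hypothetical CourseSketch class and Hash#fetch to imitate the behavior; the real app gets its default from the database schema, so treat this only as an analogy.

```ruby
# Hypothetical stand-in for a record whose column defaults to 1.0.
class CourseSketch
  DEFAULT_GRADE_PERCENTAGE = 1.0

  attr_reader :grade_percentage

  def initialize(attributes = {})
    # fetch falls back to the default only when the key is absent;
    # an explicitly assigned nil or '' is kept as given.
    @grade_percentage = attributes.fetch(:grade_percentage, DEFAULT_GRADE_PERCENTAGE)
  end
end

p CourseSketch.new(:grade_percentage => nil).grade_percentage
p CourseSketch.new(:grade_percentage => '').grade_percentage
p CourseSketch.new(:grade_percentage => 0.95).grade_percentage
p CourseSketch.new.grade_percentage
```

Only the last call, which assigns nothing at all, should come back with the default.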

Summary

The point of double-blind testing is bullet-proof tests that can’t be reasonably thwarted by antagonistic coding – whether that’s your anti-social pairing partner, or yourself several months down the road. The bottom line is this: test all assumptions.

That being said, this is very time consuming, and we can see a ton of repetition even in this small test suite. What we need is a way to get back to speedy testing before our boss/client notices it now takes an hour to implement one validation.*

*Even if you work for a government owned/regulated institution that actually digs that kind of non-agile perversion, you WILL eventually go insane. Even in this small sample app, the voices in my head had to talk me off a building ledge twice.

The answer lies in RSpec matchers, which are easy to implement, and can grow with your application. The benefit is not just speedier development – it’s also consistency across your application. We’ll explore that in the last article of this series.

Double-Blind Test-Driven Development in Rails 3: Part 1

January 31, 2011

This is a three-part series introducing the concept of double-blind test-driven development in Rails. This post defines the concept itself, and lays the groundwork by showing the way tests are more commonly written. The next couple of posts will show how to double-blind test various common Rails elements, and how to make this added layer of protection automatic and quick.

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

Looking at a Rails application that was built with test-driven development, you might expect to see something like this:

# spec/models/teacher_spec.rb

describe Teacher do
  it "has many subjects" do
    teacher = Factory.create :teacher
    subject = teacher.subjects.create Factory.attributes_for(:subject)

    teacher.subjects.should include(subject)
  end
  
  describe "name" do
    it "is present" do
      teacher = Teacher.new

      teacher.should_not be_valid
      teacher.errors[:name].should include("can't be blank")
    end
    
    it "is at most 50 characters" do
      teacher = Teacher.new :name => 'x' * 51
      
      teacher.should_not be_valid
      teacher.errors[:name].should include("must be 50 characters or less")
    end
  end
end

Truth be told, if you’re seeing this in the wild, the app is probably doing pretty well. This level of testing works great during the early stages of an app, when things are simple. But as things grow and/or multiple developers become involved, you need more.

Consider models where the associations and validations stretch into the dozens of lines. The more careful and specific you are about validations, the easier it is to get conflicting or overlapping validations. I actually came up with the concept of double-blind testing while retro-testing models in a client app that previously had no validation specs.

What is Double-Blind Testing?

In the world of scientific studies, you always need a control group. One set of participants gets the latest and greatest new diet pill, while the other gets a placebo. Researchers used to think this was good enough, and probably pretty funny to watch the placebo users rave about their shrinking waistlines. But it turns out studies like this still allowed some bias – as researchers observed the effects, their *own* preconceived notions tainted results. Enter the double-blind study.

In a double-blind study, the researchers themselves are unaware of which participants are in the control group, and which are being tested. Both sides are “blind”. They may have lost funny patient anecdotes, but they gained research reliability.

Applying the Lessons of Double-Blind Studies to Test-Driven Development

As I said, in the early stages of an app the tests I showed above work great, as long as you’re using TDD and the red-green-refactor cycle. This means you write the test, run it, and it fails. Then you write the simplest code that will make the test pass, run the test again, and confirm that it passes. Most testing tools will literally show red or green as you do this. Then, as you start to amass tests, you’re free to refactor your code (abstracting common code into helper methods, changing for readability, etc) and run the tests again at any time. You will see failures if you broke anything. If not, you’ve more or less guaranteed your code refactoring works properly.

The problem comes in when you start changing old code, or adding tests to code that didn’t initially have them. What I’m calling double-blind testing is this:

each test needs to verify the object’s behavior before testing what changes.

As an example, let’s rewrite one of the tests from above:

# original test

describe "name" do
  it "is present" do
    teacher = Teacher.new

    teacher.should_not be_valid
    teacher.errors[:name].should include("can't be blank")
  end
end

# modified to be double-blind

describe "name" do
  it "is present" do
    error_message = "can't be blank"

    teacher = Teacher.new :name => 'Joe Example'
    teacher.valid?
    teacher.errors[:name].should_not include(error_message)

    teacher.name = nil
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)

    teacher.name = ""
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)
  end
end

This is the basic pattern for all double-blind testing. We’re not leaving anything to chance. In the original version, we expected our object to be invalid, we treated it as such, and we got the result we expected. Do you see the problem with this?

Here’s an exercise: can you make the original test pass, even though the object validation is not working correctly? There’s actually a style of pair programming that routinely does exactly this. One developer writes the test, and the other writes just enough code to make it pass, with the good-natured intention of tripping up the first developer whenever possible. If you wrote the original test, I could satisfy it by just adding the error message to every record on validation, regardless of whether it’s true! Your test would pass, but the app would fail.
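Here is one way that sabotage could look, sketched in plain Ruby so it runs without Rails or RSpec. SabotagedTeacher and the two helper methods are hypothetical and exist only for this illustration; the helpers mirror the logic of the original and double-blind tests shown above.

```ruby
# Hypothetical "antagonistic" model: it reports the error no matter what.
class SabotagedTeacher
  ERROR_MESSAGE = "can't be blank"

  attr_accessor :name
  attr_reader :errors

  def initialize(name = nil)
    @name = name
    @errors = []
  end

  def valid?
    @errors = [ERROR_MESSAGE] # never even looks at name
    false
  end
end

# The original test: start blank, expect the error.
def original_test_passes?(model_class)
  teacher = model_class.new
  !teacher.valid? && teacher.errors.include?("can't be blank")
end

# The double-blind test: first confirm the error is absent for a good name.
def double_blind_test_passes?(model_class)
  teacher = model_class.new('Joe Example')
  teacher.valid?
  return false if teacher.errors.include?("can't be blank")

  teacher.name = nil
  return false if teacher.valid? || !teacher.errors.include?("can't be blank")

  teacher.name = ''
  !teacher.valid? && teacher.errors.include?("can't be blank")
end

puts original_test_passes?(SabotagedTeacher)
puts double_blind_test_passes?(SabotagedTeacher)
```

The sabotaged model sails through the original check, while the double-blind version rejects it at the very first step.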

The test is now “double-blind” in the sense that we as testers have factored out our own expectations from the test. In this case, we expected the error message to be absent until we set up the object a certain way, and leaving that expectation untested can be bad. It may sound far-fetched or paranoid*, but in large codebases your original tests are often abused in this very way. The “you” that writes new code today is often at odds with the “you” from three months ago that wrote the older code with a different understanding of the problem at hand.

*Plus, everybody knows it’s not paranoia when the world really is out to get you. I’ve discussed this at length with the voices in my head, and they all agree. Except Javier. That guy’s a jerk.

Now that I’ve laid out the justification, let’s take a closer look at how the test changed. The first thing I did was create a version of the object that I believe should NOT trigger the error message. Then I run through two cases that should. You can see right away, I was forced to be more *specific* about what should trigger an error. Instead of just a blank object with no values set, I’ve proactively set the attribute in question to both nil and blank. A key element here is to try to work with the *same* object, modifying between tests, rather than creating a new object each time. My test wouldn’t have been as specific if I’d just recreated a blank Teacher object and run a single validation check.

Also, with the increased code comes the increased chance of typos. We don’t want to DRY test code up too much, because a good rule is to keep your tests as readable (non-abstract) as possible. But I’ve specified the error message at the top of the test, and reused that string over and over. I did this in a way that DRYs the code and adds readability. You can see at a glance that all three checks are looking for the same error.

Finally, the first time I run the object’s validation, notice I’m not asserting that it should be valid. If I had written teacher.should be_valid on line 8 of the double-blind test, I’d have to take the extra time to make sure every other part of the object was valid. Not only is this time-consuming, it’s very brittle. Any future validations would break this test.

If you use factories often, you may suggest setting it up that way since a factory-generated object should always be valid. Then you could assert validity. However, this only slows down your test suite. It’s enough just to run valid? on the object, which triggers all the validation checks to load up our errors hash.

Summary

I believe this is a new concept – I was already coding most of my tests this way, but it didn’t dawn on me how valuable it was until I started retro-testing previously testless code. The value showed itself right away.

I would love to hear feedback on this – if you think it’s unnecessary (I tend to be very rainman-ish about my testing code) or even detrimental. However, if you think it’s too much work, I ask you to hold your criticism until you’ve read part 3 of this article, where I show how to use your own RSpec matchers to greatly speed this process.

Legacy Database Column Names in Rails 3

January 28, 2011

If you work with legacy databases, you don’t always have the option of changing column names when something conflicts with Ruby or Rails. A very common example is having a column named “class” in one of your tables. Rails *really* doesn’t like this, and like the wife or girlfriend who really hates your new haircut, it will complain at every possible opportunity:

# trying to set the poorly named attribute
ruby-1.9.2-p0 > u = User.new :class => '1995'
NoMethodError: undefined method `columns_hash' for nil:NilClass
# trying to set a different attribute that is only guilty by association
ruby-1.9.2-p0 > u = User.new :name
NoMethodError: undefined method `has_key?' for nil:NilClass
# trying to set the attribute later in the game
ruby-1.9.2-p0 > u = User.new
 => #<User id: nil, name: nil, class: nil, created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class = '1995'
NoMethodError: undefined method `private_method_defined?' for nil:NilClass

Like the aforementioned wife/girlfriend, you’re not going anywhere until this issue is resolved. Luckily, Brian Jones has solved this problem for us with his gem safe_attributes. Rails automatically creates accessors (getter and setter methods) for every attribute in an ActiveRecord model’s table. Trying to override crucial methods like “class” is what gets us into trouble. The safe_attributes gem turns off the creation of any dangerously named attributes.

Just do this:

# app/models/user.rb
class User < ActiveRecord::Base
  bad_attribute_names :class
end

After adding the gem to your Gemfile, pass bad_attribute_names the list of offending column names, and it will keep Rails from trying to generate accessor methods for them. Now, this does come with a caveat: you don’t have those accessors. Let’s try to get/set our :class attribute:

ruby-1.9.2-p0 > u = User.new
 => #<User id: nil, name: nil, class: nil, created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class = '1995'
 => "1995" 
ruby-1.9.2-p0 > u
 => #<User id: nil, name: nil, class: "1995", created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class
 => User(id: integer, name: string, class: string, created_at: datetime, updated_at: datetime) 

The setter still works (I’m guessing that it was still created because there wasn’t a pre-existing “class=” method) and we can verify that the object’s attribute has been properly set. But calling the getter defaults to…well, the default behavior.

The answer is to always use this attribute in the context of a hash. You can send the object a hash of attribute names/values, and that works. This means your controller creating and updating won’t have to change. Methods like new, create, update_attribute, update_attributes, etc will work fine.

If you want to just set the single value (to prevent an immediate save, for example) do it like this:

ruby-1.9.2-p0 > u[:class] = '1996'
 => "1996" 
ruby-1.9.2-p0 > u
 => #<User id: nil, name: nil, class: "1996", created_at: nil, updated_at: nil> 

Basically, you can still set the attribute directly, instead of going through the Rails-generated accessors. But we’re still one step away from a complete solution. We want to be able to treat this attribute like any other, and that requires giving it a benign set of accessors (getter and setter methods). One reason to do this is so we can use standard validations on this attribute.

Adding accessors to our model is this simple:

# add to app/models/user.rb

def class_name= value
  self[:class] = value
end
  
def class_name
  self[:class]
end

We’re calling the accessors “class_name”, and now we can use that everywhere instead of the original attribute name. We can use it in forms:

# example, not found in code

<%= f.text_field :class_name %>

Or in validations:

# add to app/models/user.rb

validates_presence_of :class_name

Or when creating a new object:

# example, not found in code

User.create :class_name => 'class of 1995'
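The accessor idea can be sketched in plain Ruby as well. LegacyUser below is a hypothetical hash-backed class that stands in for the ActiveRecord model (it is not part of safe_attributes or Rails); it only shows that renamed accessors can sit on top of hash-style access while the built-in class method is left alone.

```ruby
# Hypothetical hash-backed record standing in for an ActiveRecord model.
class LegacyUser
  def initialize(attributes = {})
    @attributes = {}
    attributes.each { |key, value| self[key] = value }
  end

  # Hash-style access, similar in spirit to record[:column].
  def [](key)
    @attributes[key.to_sym]
  end

  def []=(key, value)
    @attributes[key.to_sym] = value
  end

  # Benign accessors for the column that is literally named "class".
  def class_name
    self[:class]
  end

  def class_name=(value)
    self[:class] = value
  end
end

user = LegacyUser.new(:class => '1995')
user.class_name = '1996'

p user.class_name # reads the stored column value
p user[:class]    # same value through hash-style access
p user.class      # Ruby's own class method is untouched
```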

If you download the code, these additions are test-driven, meaning I wrote the tests for those methods before writing the methods themselves, to be sure they worked properly. I encourage you to do the same.

Good luck!

Nested Comments in Ruby on Rails, part 2: Controllers and Views

January 26, 2011

  1. The Model Layer
  2. Controllers and Views
 

Part 1 of this series came out exactly 3 months and 3 days ago. Special thanks to a reader named Edward who prodded me to finally add the controllers and views to this.

Going beyond the model layer for nested comments introduces a new programming idiom: recursion. Some Ruby developers may not be familiar with it – especially if your experience is mostly web-related, where the need doesn’t come up as often. Recursion in a nutshell is the act of a method calling itself. If you’ve seen Inception, the ability to have dreams within dreams within dreams means those dreams are recursive. If you haven’t seen the movie, think of Russian matryoshka dolls. You won’t experience star-studded special effects with the dolls, but you’ll at least get the idea of recursion.

Unlike Russian dolls or most of Leo’s recent work, recursion in software is potentially infinite. Practically speaking though, it’s more like the doll thing. After all, a system only has so many resources, and recursion is expensive in this regard – every call gets its own stack frame in memory, local variables and all. On the plus side, a recursive method is usually much shorter than the equivalent loop when the data itself is nested. In our case, we’ll also be hitting the database at each layer. We’ll ignore the dangers in our simple app, though.
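For anyone who has not written a recursive method before, here is about the smallest example I can think of, using the doll analogy. Doll and count_dolls are made-up names for this sketch and have nothing to do with the comments app.

```ruby
# Hypothetical doll: it either contains another doll or nothing (nil).
Doll = Struct.new(:inner)

# Counts this doll plus every doll nested inside it.
def count_dolls(doll)
  return 0 if doll.nil?       # base case: nothing left to open
  1 + count_dolls(doll.inner) # recursive case: open the next doll
end

dolls = Doll.new(Doll.new(Doll.new(nil)))
puts count_dolls(dolls)
```

The base case is what keeps the method from calling itself forever; every recursive method needs one.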

Routing

Let’s start with our routing file:

# config/routes.rb
NestedComments::Application.routes.draw do
  resources :comments do
    resources :comments
  end

  resources :posts do
    resources :comments
  end
  
  root :to => 'posts#index'
end

Working backward, we’re making our Posts controller’s index action our default route. That’s just to get the app functional. Next comes something interesting: nesting our comments inside of our posts. Interesting, but boring. Finally, the main event: nesting our comments within our comments!

Before you get too excited and start pulling out your Nana’s childhood Russian doll set for comparison, this isn’t true recursion. It’s well documented that nesting resources any more than two layers deep is painful and unnecessary, so think of this as the lamest Russian doll ever.

Controllers

First, our Posts controller, which is less exciting:

# app/controllers/posts_controller.rb
class PostsController < ApplicationController
  def index
    @posts = Post.all
  end

  def show
    @post = Post.find(params[:id])
  end

  def new
    @post = Post.new
  end
  
  def create
    @post = Post.new(params[:post])
    
    if @post.save
      redirect_to posts_path, :notice => "Your post was created successfully."
    else
      render :action => :new
    end
  end
end

We’re setting up a pretty standard RESTful resource here, with a couple of actions skipped for simplicity. Now the comments controller (get those dolls ready):

# app/controllers/comments_controller.rb
class CommentsController < ApplicationController
  before_filter :get_parent
  
  def new
    @comment = @parent.comments.build
  end

  def create
    @comment = @parent.comments.build(params[:comment])
    
    if @comment.save
      redirect_to post_path(@comment.post), :notice => 'Thank you for your comment!'
    else
      render :new
    end
  end

  protected
  
  def get_parent
    @parent = Post.find_by_id(params[:post_id]) if params[:post_id]
    @parent = Comment.find_by_id(params[:comment_id]) if params[:comment_id]
    
    # find_by_id returns nil when nothing matches, so test the value itself
    redirect_to root_path unless @parent
  end
end

It’s not much bigger, but there’s a lot going on here! First, since comments are nested, we have to look for a parent. We’re only creating comments in this example, so we only have those related actions. Comments will always be shown on a post page.
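
To make the parent lookup concrete, here’s a minimal plain-Ruby sketch of the same decision, with no Rails involved. The `POSTS` and `COMMENTS` hashes and the `find_parent` method are hypothetical stand-ins I made up for the database tables and the filter; they aren’t part of the example app.

```ruby
# Plain-Ruby sketch of the parent lookup; no Rails required.
# POSTS and COMMENTS are hypothetical stand-ins for the database tables.
Record = Struct.new(:id, :title)

POSTS    = { 1 => Record.new(1, 'First Post') }
COMMENTS = { 7 => Record.new(7, 'First Comment') }

# Returns the parent named by the request params, or nil if there isn't one.
# A comment_id wins over a post_id, just like the second assignment in the filter.
def find_parent(params)
  if params[:comment_id]
    COMMENTS[params[:comment_id].to_i]
  elsif params[:post_id]
    POSTS[params[:post_id].to_i]
  end
end

find_parent({ :post_id => '1' }).title     # "First Post"
find_parent({ :comment_id => '7' }).title  # "First Comment"
find_parent({ :post_id => '99' })          # nil: that id doesn't exist
find_parent({})                            # nil: no parent params at all
```

Both of the nil cases are the ones where a controller would want to bail out and redirect rather than try to build a comment on nothing.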

The really exciting part is after a successful comment creation. How do we redirect back to the post page? For all we know, this comment could be buried down 12 layers of replies. All we really have access to so far is the parent of the object. This necessitates a new model method:

# excerpt from app/models/comment.rb
def post
  return @post if defined?(@post)
  @post = commentable.is_a?(Post) ? commentable : commentable.post
end

Recursive functions are often short and sweet for two reasons: they’re already complex by nature, and adding more code than necessary would make them unmanageable. Also, they’re getting a lot done in just a few lines. In this case, the second line of the method body is the key: if “commentable” (the parent object) is a post, return that. Otherwise, call this same method on the parent, which will in turn check if *it* is a Post, and so on.
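
If you’d like to watch the recursion climb the chain without a database, here’s a small sketch using plain Ruby objects. The `Post` struct and the stripped-down `Comment` class are assumptions standing in for the real ActiveRecord models; only the `post` method is copied from the excerpt.

```ruby
# Plain-Ruby stand-ins for the models (assumption: no database involved).
Post = Struct.new(:title)

class Comment
  attr_reader :title, :commentable

  def initialize(title, commentable)
    @title = title
    @commentable = commentable
  end

  # Walk up the chain of parents until we reach a Post.
  def post
    return @post if defined?(@post)
    @post = commentable.is_a?(Post) ? commentable : commentable.post
  end
end

post    = Post.new('First Post')
comment = Comment.new('First Comment', post)
reply   = Comment.new('First Reply', comment)
deep    = Comment.new('Reply to the Reply', reply)

deep.post.title  # "First Post"
```

Calling `deep.post` asks `reply`, which asks `comment`, which finally finds that its parent is a Post and hands it back down the line.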

I could have written it shorter, like this:

def post
  commentable.is_a?(Post) ? commentable : commentable.post
end

In fact, I did at first. But the extra code that checks and sets an instance variable is caching the result. This way, if we call the same method on an object more than once, it stores the result for future use. Remember, recursion can be expensive – especially when the database is involved.
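
Here’s a rough sketch of why the `defined?` check is worth the extra line. The `CountingComment` class and its `expensive_lookup` method are hypothetical; they just count how often the “real” work runs. The more common `@post ||= ...` idiom would repeat the lookup every time the answer is nil, while `defined?` caches a nil answer too.

```ruby
# Hypothetical sketch: count how often the expensive lookup really runs.
class CountingComment
  attr_reader :lookups

  def initialize(parent)
    @parent = parent
    @lookups = 0
  end

  def post
    return @post if defined?(@post)
    @lookups += 1
    @post = expensive_lookup
  end

  private

  # Stand-in for a database query; returns nil for an orphaned comment.
  def expensive_lookup
    @parent
  end
end

orphan = CountingComment.new(nil)
orphan.post
orphan.post
orphan.lookups  # 1, even though the cached answer is nil
```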

Views

Finally, it’s view time, with one more bit of recursion for fun.

Our post views are mostly standard scaffolding, with the exception of the show view:

# app/views/posts/show.html.erb
<h1><%= @post.title %></h1>

<div class="body">
  <%= @post.body %>  
</div>

<h2>Comments</h2>

<p><%= link_to 'Add a Comment', new_post_comment_path(@post) %></p>

<ul class="comment_list">
  <%= render :partial => 'comments/comment', :collection => @post.comments %>
</ul>

Notice we have the partial app/views/comments/_comment.html.erb. We’re calling this for each of our post’s comments. Nothing too fancy here. Now, for the partial itself:

# app/views/comments/_comment.html.erb
<li class="comment">
  <h3><%= comment.title %></h3>

  <div class="body">
    <%= comment.body %>
  </div>
  
  <p><%= link_to 'Add a Reply', new_comment_comment_path(comment) %></p>
  
  <% unless comment.comments.empty? %>
    <ul class="comment_list">
      <%= render :partial => 'comments/comment', :collection => comment.comments %>
    </ul>
  <% end %>
</li>

This partial is recursive! The comments controller doesn’t have a show method, because we’re never going to view a comment by itself. Instead, the show-like code is in this partial, and at the end it checks to see if *this* comment has comments. If so, it calls the partial again on the whole collection. The end result is a nested, bulleted list of comments. This is not very sexy if you fire up the code yourself, but it’s a great starting point.
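
The same idea works outside of a view. Below is a minimal sketch of a method that renders one comment and then calls itself for each reply, the way the partial does. `Node` and `render_comment` are made-up names for illustration, and it prints an indented text list instead of HTML.

```ruby
# Plain-Ruby sketch of the recursive partial; Node is a hypothetical stand-in for Comment.
Node = Struct.new(:title, :comments)

# Renders one comment, then calls itself for each reply, one level deeper.
def render_comment(comment, depth = 0)
  lines = ["#{'  ' * depth}- #{comment.title}"]
  comment.comments.each do |reply|
    lines.concat(render_comment(reply, depth + 1))
  end
  lines
end

thread = Node.new('First Comment', [
  Node.new('First Reply', [Node.new('Reply to the Reply', [])]),
  Node.new('Second Reply', [])
])

puts render_comment(thread)
# - First Comment
#   - First Reply
#     - Reply to the Reply
#   - Second Reply
```

The recursion stops on its own when a comment has no replies, because the `each` loop simply has nothing to iterate over.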

Summary

Hopefully this article has done a good job of explaining both recursion and how to use it to achieve nested comments in your applications. If you’re new to recursion as a concept, haven’t seen Inception, didn’t inherit Russian dolls from Nana or receive them as a snazzy graduation present, and my explanation somehow fell short, it’s a well documented programming idiom. There are tons of resources online, so take the time to learn this powerful tool, then learn not to overuse it :)

Please download the code and play with it if you want to learn more – the code is fully test-driven so you can see how that works, which is just as important.

On a final note, I’m tempted to do a follow-up article with ajax and some nicer formatting. Perhaps in 3 months and 3 days…

Nested Comments in Ruby on Rails, Part 1: Models

October 23, 2010

  1. The Model Layer
  2. Controllers and Views
 

YouTube has a pretty cool comment system. You can comment on videos, but you can also reply to comments other people have posted. In essence, you’re commenting on comments!

If you’d like something similar in your app, you might be tempted to create PostComments and CommentComments, or something similar. A better approach is to use polymorphic associations. Polymorphism makes it possible for a comment to belong to a post, or another comment, or any number of things. And it’s easier than you think.

Note: all examples will be in Rails 3. If you’re not familiar, that’s okay. The example code is available from GitHub, in my examples of nested comments.

Let’s start by creating a Post model:

  rails g model post title:string body:text
  rake db:migrate

We’re keeping it simple, and we’re not going to bother with user authentication for these examples. Our post model is pretty straightforward, so we didn’t need to modify the migration file at all. Now let’s create our comments:

  rails g model comment title:string body:text commentable_id:integer commentable_type:string

This is where some of the magic comes in. We can’t say our comments “belong to posts” because they can belong to anything. So we can’t add a “post_id” field. Instead, we come up with a name for our association: “commentable”. We add a commentable_id field to store the id of the object this comment belongs to. And we add a commentable_type field to store the type of object this comment belongs to – ‘Post’, ‘Comment’, whatever. With these two pieces of info, Rails can figure out the rest and make your life easier.

Before we can add comments to our database, however, we need to make a small change to the code in the “self.up” part of our migration:

  def self.up
    create_table :comments do |t|
      t.string :title
      t.text :body
      t.integer :commentable_id
      t.string :commentable_type

      t.timestamps
    end
    
    add_index :comments, [:commentable_id, :commentable_type]
  end

We’ve added a single index on the combination of the commentable fields, which will speed up our app. Now we can migrate the changes again:

  rake db:migrate

Now let’s set up our associations in our models, starting with posts:

  class Post < ActiveRecord::Base
    has_many :comments, :as => :commentable
  end

Normally if you tell Post that it has_many comments, Post would expect the comments table to have a “post_id” field. It doesn’t, because we’re using polymorphism. So we tell it the name we gave our polymorphic association: “commentable”.

Now for the slightly more complicated comment model:

  class Comment < ActiveRecord::Base
    belongs_to :commentable, :polymorphic => true
    has_many :comments, :as => :commentable
  end

First, we’re setting up the “belongs to” side of polymorphism. Our comment belongs to “commentable”, the name we gave our polymorphic association. We also tell it that this is polymorphic. Otherwise, Rails would look for a model called Commentable.

Finally, we’re saying that comments have many comments, the same way we did it for posts.

You might think that after this special setup, using polymorphic associations would be more difficult as well. The good news is, the hard part is over and everything else works the same as any other “belongs_to” or “has_many” association. Let’s go into the rails console to try it out:

  post = Post.create :title => 'First Post'
  => #<Post id: 1, title: "First Post", body: nil, created_at: "2010-10-23 16:56:13", updated_at: "2010-10-23 16:56:13"> 

  comment = post.comments.create :title => 'First Comment'
  => #<Comment id: 1, title: "First Comment", body: nil, commentable_id: 1, commentable_type: "Post", created_at: "2010-10-23 16:56:40", updated_at: "2010-10-23 16:56:40"> 

  reply = comment.comments.create :title => 'First Reply'
  => #<Comment id: 2, title: "First Reply", body: nil, commentable_id: 1, commentable_type: "Comment", created_at: "2010-10-23 16:59:28", updated_at: "2010-10-23 16:59:28"> 

We’re able to add a comment to our post, and add a comment to that comment! Rails does the work of filling in the commentable_id and commentable_type fields, just as it would have filled in the post_id field if comments could only belong to posts.
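
If you’re curious how two columns are enough to find the parent, here’s a simplified plain-Ruby sketch of the idea. The in-memory `TABLES` hash is a hypothetical stand-in for the database, and ActiveRecord’s real implementation is more involved; this only shows the “type name plus id” trick.

```ruby
# Plain-Ruby sketch of a polymorphic lookup. TABLES is a hypothetical stand-in
# for the database; this is not how ActiveRecord is actually implemented.
Post    = Struct.new(:id, :title)
Comment = Struct.new(:id, :title, :commentable_id, :commentable_type) do
  # Turn the stored type name into a class, then find the row with the stored id.
  def commentable
    klass = Object.const_get(commentable_type)
    TABLES.fetch(klass).fetch(commentable_id)
  end
end

TABLES = {
  Post    => { 1 => Post.new(1, 'First Post') },
  Comment => {}
}
TABLES[Comment][1] = Comment.new(1, 'First Comment', 1, 'Post')
TABLES[Comment][2] = Comment.new(2, 'First Reply', 1, 'Comment')

TABLES[Comment][2].commentable.title              # "First Comment"
TABLES[Comment][2].commentable.commentable.title  # "First Post"
```

Notice that both comments store a commentable_id of 1. It’s the commentable_type that tells them apart, which is why the index covers both columns.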

Please check out the nested comments example code on GitHub, which includes tests. Download it, play around with it, and see how it works. In the next part, I’ll be looking at how to use nested comments in your controllers and views.

Common Addresses Using Polymorphism and Nested Attributes in Rails

October 19, 2010

Have you ever wanted to take an object that is common to a lot of models – like addresses – and DRY up your code? If you’re really concerned about design, or if the object itself is complex, you certainly want to make changes in one place and have them apply to everywhere in your app.

You can use a combination of polymorphism, nested attributes, and shared views to accomplish this easily. If it seems complicated at first, try it a couple times and you’ll see it’s no big deal. Let’s get started.

Polymorphism

This is the concept that something like an address can belong to more than one type of thing. Maybe your app has customers, employees, and locations, which all need addresses and you’d like them to appear uniform from one form to the next. Let’s create a migration and model for a simple address.

In your migration, we need the address fields you plan to use, plus the two fields that make polymorphism possible in rails: `addressable_id` and `addressable_type`. This is how rails will know what object type and id each address belongs to:

    create_table :addresses do |t|
      t.string :line1
      t.string :line2
      t.string :city
      t.string :state
      t.string :zip
      t.integer :addressable_id
      t.string :addressable_type

      t.timestamps
    end
 
    add_index :addresses, [:addressable_type, :addressable_id], :unique => true

In your model, we need to tell `address` that it belongs to other things polymorphically:

class Address < ActiveRecord::Base
  belongs_to :addressable, :polymorphic => true
end

Nested Attributes

Now let’s make the customer model “contain” an address. In addition to the `has_one` association, we’re going to tell rails that customer forms might also contain fields for the customer’s address:

class Customer < ActiveRecord::Base
  has_one :address, :as => :addressable
  accepts_nested_attributes_for :address
end
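
Under the hood, `accepts_nested_attributes_for :address` gives the model an `address_attributes=` writer, which is what lets a single hash of form params fill in both objects. Here’s a rough plain-Ruby approximation of that behavior. These classes are simplified stand-ins rather than ActiveRecord models, and the customer name and address values are made up.

```ruby
# Plain-Ruby sketch of the writer that accepts_nested_attributes_for adds.
# These classes are simplified stand-ins, not ActiveRecord models.
Address = Struct.new(:line1, :city, :state, :zip)

class Customer
  attr_accessor :name, :address

  def initialize(attributes = {})
    attributes.each { |key, value| send("#{key}=", value) }
  end

  # Rails generates a method with this name; this body is only an approximation.
  def address_attributes=(attributes)
    self.address ||= Address.new
    attributes.each { |key, value| address[key] = value }
  end
end

# Roughly the shape of params[:customer] after the form is submitted.
params = {
  :name => 'Acme Hardware',
  :address_attributes => {
    :line1 => '123 Main St', :city => 'Kansas City', :state => 'MO', :zip => '64105'
  }
}

customer = Customer.new(params)
customer.address.city  # "Kansas City"
```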

Shared Views

Next, we’ll want to make a partial to store the address form, that can be called from our other views. Put this in `app/views/shared/_address.html.erb`:

<p>
  <%= f.label :line1, 'Address 1' %><br />
  <%= f.text_field :line1 %>
</p>

<p>
  <%= f.label :line2, 'Address 2' %><br />
  <%= f.text_field :line2 %>
</p>

<p>
  <%= f.label :city %><br />
  <%= f.text_field :city %>
</p>

<p>
  <%= f.label :state %><br />
  <%= f.text_field :state, :size => 2 %>
</p>

<p>
  <%= f.label :zip, 'Zip Code' %><br />
  <%= f.text_field :zip %>
</p>

Where does the `f` come from in the view above? It’s passed in by any view that wants to use it. Let’s use the new customer form as an example:

<% form_for(@customer) do |f| %>
  <%= f.error_messages %>

  <p>
    <%= f.label :name %><br />
    <%= f.text_field :name %>
  </p>
  
  <% f.fields_for :address do |address| %>
    <%= render :partial => 'shared/address', :locals => {:f => address} %>
  <% end %>
  
  <p>
    <%= f.submit 'Create' %>
  </p>
<% end %>

The `fields_for` call above is the secret sauce, calling our shared partial and passing the form object `address` into the partial as the variable `f`. When you view the new customer page, you’ll see the address fields included as if they were part of the customer form directly. The real power is that you can repeat these steps for all other models that have an address, and they’ll use the same database table, model, and shared view for addresses.

…And Other Junk

There’s one last step. In the `new` action of your `customers` controller, you need to build the address object onto the customer object, so rails knows it’s there:

  def new
    @customer = Customer.new
    @customer.build_address
  end

That’s all it takes to get polymorphism, nested attributes, and shared views working together to DRY up your code, data modeling, and views. And DRYer code equals happier coders! For more info on polymorphism and nested attributes, visit your friendly neighborhood documentation:

http://wiki.rubyonrails.org/howtos/db-relationships/polymorphic
http://api.rubyonrails.org/classes/ActiveRecord/NestedAttributes/ClassMethods.html

Review of the “Database is Your Friend” Workshop by Xavier Shay

August 23, 2010

This Saturday I had the privilege of attending Xavier Shay’s “Database is Your Friend” workshop right here in Kansas City. It explores the enterprise domain of high-traffic, mission-critical databases in Rails. At $350, the price tag trumps just about every regional Ruby or Rails conference I’ve heard of. It was worth the price, and more. It would have been well worth the expense of travel and lodging if it had been out of town.

I’ve never had such a thorough learning experience. Xavier had a git repository set up with several branches of a basic Rails app, which we all cloned. He would give a 10-15 minute overview of a difficult concept, then we would check out a given branch, and spend about 20 minutes completing whatever part of the code Xavier had omitted. We used test-driven development much of the time, which forced us to fully understand the cause and effect of each technique.

Xavier was in constant motion, visiting each of the six students to ensure everybody got it. After all, the exercises themselves were mission critical, because the following lessons built upon them. After experiencing this, I wonder why all teaching isn’t done this way. The “Long Lecture, Here’s Your Homework, Now Get Out” system I remember from college could benefit a lot from this.

It’s obvious a lot of love went into the design of the workshop. The flow was so natural that several times, one of us would ask a question and Xavier would answer that the next segment addresses it. The result was a natural progression of solving more and more complex problems.

If you struggle with (or wonder about) data integrity and high-traffic database issues, take the opportunity to learn from Xavier Shay on his current tour.

PS –

While this workshop is definitely worth travelling for, I didn’t have to. Wes Garrison of Databasically, who helps organize our monthly Ruby meetings, took the initiative. Xavier normally attaches his workshop to conferences, but Wes looked at the schedule and saw that Xavier had a small gap between his Chicago and Austin dates. Wes offered to arrange both travel and lodging for Xavier, who luckily agreed to squeeze another workshop into his busy schedule. Wes, thank you.

Basic many-to-many Associations in Rails

January 29, 2010

View the Source Code

Many-to-many relationships

Data modeling is the science (and art) of creating the database schema that most purely matches the real world objects involved in your project. Part of this is defining how the objects relate to one another. Let’s say your application tracks Items and Categories. If each item can only belong to one category, then you have a one-to-many relationship; categories have many items. But if an item can appear in more than one category, you have a many-to-many relationship.

There are two ways to handle many-to-many relationships in Ruby on Rails, and this article will cover both.

has_and_belongs_to_many

The simplest approach is if you don’t need to store any information about the relationship itself. You just want to know what items are in each category, and what categories each item belongs to. This is called “has_and_belongs_to_many”. We use has_and_belongs_to_many associations in our models, and create a join table in our database. Here are your models:

# app/models/category.rb
class Category < ActiveRecord::Base
  has_and_belongs_to_many :items
end

# app/models/item.rb
class Item < ActiveRecord::Base
  has_and_belongs_to_many :categories
end

Next, let’s create the join table by generating a new migration. From the command line:

script/generate migration AddCategoriesItemsJoinTable

Now we’ll edit the migration file it creates:

class AddCategoriesItemsJoinTable < ActiveRecord::Migration
  def self.up
    create_table :categories_items, :id => false do |t|
      t.integer :category_id
      t.integer :item_id
    end
  end

  def self.down
    drop_table :categories_items
  end
end

Notice the :id => false, which keeps the migration from generating a primary key. The name of the table is a combination of the two table names we’re joining, in alphabetical order. This is how Rails knows how to find the join table automatically.
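
As a quick illustration of the naming convention, here’s a tiny sketch that builds the join table name from two table names. `join_table_name` is a made-up helper, and it’s a simplification: I’m assuming plain alphabetical sorting, while Rails’ own rule has a few edge cases this ignores.

```ruby
# Simplified sketch of the convention: sort the two table names, join with "_".
# (Assumption: Rails' own rule has extra edge cases that this ignores.)
def join_table_name(first_table, second_table)
  [first_table, second_table].map(&:to_s).sort.join('_')
end

join_table_name(:items, :categories)  # "categories_items"
join_table_name(:categories, :items)  # "categories_items"
```

Either way you pass them in, you get the same table name, which is why neither model needs to be told where the join table lives.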

has_many :through

The other way to set up a many-to-many relationship between objects is used if you do, or think you will, need to track info on the relationship itself. When was item X added to category Y? That’s info you can’t store in the category or item tables, because it’s info about the relationship. In Rails, this is called a has_many :through association, and it’s really just as easy as the first way.

First, we’re going to create a new model that defines the relationship between items and categories. For lack of a better name, let’s call it a Categorization. Set up your models like this:

# app/models/category.rb
class Category < ActiveRecord::Base
  has_many :categorizations
  has_many :items, :through => :categorizations
end

# app/models/item.rb
class Item < ActiveRecord::Base
  has_many :categorizations
  has_many :categories, :through => :categorizations
end

# app/models/categorization.rb
class Categorization < ActiveRecord::Base
  belongs_to :category
  belongs_to :item
end

We’re connecting both original models to :categorizations, and then connecting them to each other via the intermediary Categorization model. Now, instead of a join table whose only function is connecting the others, we add a full-fledged table to manage our new model:

class CreateCategorizations < ActiveRecord::Migration
  def self.up
    create_table :categorizations do |t|
      t.integer :category_id
      t.integer :item_id

      t.timestamps
    end
  end

  def self.down
    drop_table :categorizations
  end
end

We still have the two foreign key integer columns, but we’ve removed :id => false so this table will have an id column of its own. We also added timestamps, so we’ll be able to tell when an item was added to a specific category. I also created a migration that removes the old categories_items table, but it’s not shown here.

Which is Better?

The simpler has_and_belongs_to_many approach has a small advantage when you *know* you’re not going to need to track info about the relationship itself. If this is the case, there’s a very slight performance gain because you’re not loading an extra model class at runtime.

More often than not, however, you’re going to eventually want to track relationship-specific data. We used the example of tracking when a relationship was created. Another would be if you want to track, over time, how many times a visitor clicks on an item under each category. That counter needs to be stored in the Categorization model, and that’s a reason not to use the simpler has_and_belongs_to_many approach.
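
To picture what “data about the relationship” looks like, here’s a small plain-Ruby sketch of a join record that carries its own timestamp and click counter. The `clicks` field is the hypothetical counter from the paragraph above (it isn’t in the example app’s schema), and the category and item names are made up.

```ruby
# Plain-Ruby sketch of a join model that carries its own data.
# The clicks counter is the hypothetical column discussed above.
Categorization = Struct.new(:category, :item, :created_at, :clicks)

categorizations = [
  Categorization.new('Tools', 'Hammer', Time.now, 0),
  Categorization.new('Gifts', 'Hammer', Time.now, 0)
]

# Record a click on the hammer while the visitor browses the Gifts category.
link = categorizations.find { |c| c.category == 'Gifts' && c.item == 'Hammer' }
link.clicks += 1

categorizations.map { |c| [c.category, c.clicks] }  # [["Tools", 0], ["Gifts", 1]]
```

The same hammer has a different click count in each category, and neither the item nor the category is the right place to keep that number.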

I’ve created an example application (get it here) with tags for each version – has_and_belongs_to_many, and has_many :through.

Nesting your has_many :through relationships

January 28, 2010

View the Source Code

Let’s say you’re creating a site where people can track their memberships in various store clubs – from grocery store loyalty cards, to memberships to Sam’s or Costco. People and Stores have a many-to-many relationship called a Membership. Stores also have sales, and you want people to be able to manage all sales at all of their stores easily.

The Problem

You’d like to be able to say @member.sales, but there’s a problem – Rails doesn’t support daisy-chaining associations the way we’d like. Here’s how we want to set up our associations:

# app/models/member.rb
class Member < ActiveRecord::Base
  has_many :memberships
  has_many :clubs, :through => :memberships
  has_many :sales, :through => :clubs
end

# app/models/club.rb
class Club < ActiveRecord::Base
  has_many :memberships
  has_many :members, :through => :memberships

  has_many :sales
end

# app/models/membership.rb
class Membership < ActiveRecord::Base
  belongs_to :member
  belongs_to :club
end

# app/models/sale.rb
class Sale < ActiveRecord::Base
  belongs_to :club
end

And for reference, here’s the full schema:

# db/schema.rb
ActiveRecord::Schema.define(:version => 20100129152803) do
  create_table "clubs", :force => true do |t|
    t.string   "name"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  create_table "members", :force => true do |t|
    t.string   "name"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  create_table "memberships", :force => true do |t|
    t.integer  "member_id"
    t.integer  "club_id"
    t.date     "expires"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  create_table "sales", :force => true do |t|
    t.string   "name"
    t.text     "description"
    t.date     "start"
    t.date     "end"
    t.integer  "club_id"
    t.datetime "created_at"
    t.datetime "updated_at"
  end
end

While everything looks good from a data modeling perspective, there’s one issue – our member model isn’t allowed to daisy-chain associations, so we can’t get to our sales easily. A call to @member.sales gives us this:

ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column: clubs.member_id: SELECT "sales".* FROM "sales"  INNER JOIN "clubs" ON "sales".club_id = "clubs".id    WHERE (("clubs".member_id = 1)) 

The Solution

In comes Ian White’s nested_has_many_through plugin, which does exactly what you’d think Rails does already. Without changing the way you create associations, this plugin “just works” out of the box. Nothing to include in models or config files. Here’s how you install it:

script/plugin install git://github.com/ianwhite/nested_has_many_through.git

Now run @member.sales and you get what you’d expect – a list of all sales that a member is entitled to attend. You can nest even deeper if you like, but I offer this word of caution. Nested has_many :through associations are like the most precious liquid on earth: Captain Morgan’s spiced rum. Enjoy in moderation :)
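
For the curious, here’s what that daisy chain boils down to, sketched with plain Ruby objects: go from the member to their memberships, from those to the clubs, and collect every club’s sales. These structs are stand-ins for the models and the names are made up. The plugin does the equivalent work with SQL joins rather than in Ruby.

```ruby
# Plain-Ruby sketch of what member.sales boils down to:
# member -> memberships -> clubs -> sales.
Sale       = Struct.new(:name)
Club       = Struct.new(:name, :sales)
Membership = Struct.new(:club)
Member     = Struct.new(:name, :memberships) do
  def clubs
    memberships.map(&:club)
  end

  # Collect every club's sales into one flat list, dropping duplicates.
  def sales
    clubs.flat_map(&:sales).uniq
  end
end

grocery   = Club.new('Corner Grocery', [Sale.new('Produce Week')])
warehouse = Club.new('Warehouse Club', [Sale.new('Tire Event'), Sale.new('TV Blowout')])
member    = Member.new('Pat', [Membership.new(grocery), Membership.new(warehouse)])

member.sales.map(&:name)  # ["Produce Week", "Tire Event", "TV Blowout"]
```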

Epilogue

I’ve created a full Rails app on GitHub (View the Source Code) so you can download and play around with it. There are “before” and “after” tags, so you can see how the app reacts with and without the plugin. It also has Shoulda tests.

