Archive for the ‘Rails’ Category

Code Retreat in Boulder, Colorado

February 26, 2011

I’m in beautiful downtown Boulder, getting ready to attend a code retreat with Ruby greats like Corey Haines, Chad Fowler, Dave Thomas*, Mike Clark, Michael Feathers and many more.

Last night I hung out with some KC friends and we set up our dev environments for the event. I got motivated and created a base environment on GitHub you can download. It runs your tests automatically using Watchr every time you save your code file, and if you’re on a Mac it even takes a screenshot at each save! Now you can go back and relive the magic. Maybe string them together into a video with a little commentary, and boom – easy post-retreat blog video.

Use the link above, and let me know if it was useful!

*not the Wendy’s guy, as my wife likes to ask. You’d think since the world is down to just one living, notable Dave Thomas that joke would get a little old. I think the people who grew up watching Wendy’s commercials will also have to die out first :)

Double-Blind Test-Driven Development in Rails 3: Part 3

February 2, 2011

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

This is the last article in this series describing the concept of double-blind test-driven development. This style of testing can add time to development, but this can be cut significantly using RSpec matchers.

If you’re not familiar with matchers, they’re the helpers that give RSpec its English-like syntax, and they can be a powerful tool for speeding up all of your test-driven development – whether you follow the double-blind method or not.

If you’re using RSpec, you’re already using their built-in matchers. Say we have a Site model, and its url method takes the host attribute and appends the ‘http://’ protocol. Here’s a likely test:

describe Site, 'url' do
  it "should begin with http://" do
    site = Site.new :host => 'example.com'
    site.url.should eql('http://example.com')
  end
end

The eql() method in the code above is the matcher. (I’m using eql rather than equal here, because equal checks object identity, and two separately built strings are never the same object.) You can pass it to any of RSpec’s should or should_not methods, and it will magically work.

But the magic isn’t that hard, and you can harness it yourself for custom matchers that conform to your application.

The Many Faces of Custom RSpec Matchers

While I don’t want this article to turn into a primer on custom RSpec matchers (it’s a little off-topic), I’ll give you the three styles of defining them, and explain my recommendations. There are simple matchers, the Matcher DSL, and full RSpec matcher classes.

Let’s start by writing a test we want to run:

it "should be at least 5" do
  6.should be_at_least(5)
end

This test should always pass, provided we’ve defined our matcher correctly. The first way to do this is the simple matcher:

def be_at_least(minimum)
  simple_matcher("at least #{minimum}"){|actual| actual >= minimum}
end

As you might guess, actual represents the object that “.should” is called on – in this case, the 6 in “6.should be_at_least(5)”. This version makes a lot of assumptions, including the auto-creation of generic pass and fail messages.

If you want a little more control, you can step up to RSpec’s Matcher DSL. This is the middle-of-the-road option for creating custom matchers:

RSpec::Matchers.define :be_at_least do |minimum|
  match do |actual|
    actual >= minimum
  end

  failure_message_for_should do |actual|
    "expected #{actual} to be at least #{minimum}"
  end

  failure_message_for_should_not do |actual|
    "expected #{actual} to be less than #{minimum}"
  end

  description do
    "be at least #{minimum}"
  end
end

Now we’re rocking custom failure messages, and test names. This is pretty cool, and honestly how I started out doing matchers. It’s also how I started out doing the matchers for double-blind testing.

The problem is that by skipping the creation of actual matcher classes, we lose the ability to do things like inheritance. Not a big deal if our matchers stay simple, but they won’t. Not if we use them as often as we should! I found myself re-defining the same helper methods in each matcher I defined this way.

So let’s see just how daunting a full-fledged custom matcher class really is:

module CustomMatcher  
  class BeAtLeast
    def initialize(minimum)  
      @minimum = minimum
    end  
  
    def matches?(actual)  
      @actual = actual
      @actual >= @minimum
    end  
  
    def failure_message_for_should  
      "expected #{@actual} to be at least #{@minimum}"  
    end  
  
    def failure_message_for_should_not  
      "expected #{@actual} to be less than #{@minimum}"  
    end  
  end  
  
  def be_at_least(expected)  
    BeAtLeast.new(expected)  
  end  
end  

This isn’t so bad! We’re defining a new class, but you can see it doesn’t have to inherit from anything, or use any unholy Ruby voodoo to work.

We just have to define four methods: initialize, matches? (which returns true or false), and the two failure message methods. Along the way, we set some instance variables so we can access the data when we need it. Finally, we define a method that creates a new instance of this class, and that’s what RSpec will rely on.
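
To show there really is no voodoo, here is a minimal sketch in plain Ruby of how a should-style call might drive a matcher object. The check_should and check_should_not helpers are hypothetical stand-ins I made up for illustration, not RSpec’s actual internals; they only show which of the four methods get called and when.

```ruby
# A matcher is just an object that answers matches? and the two
# failure message methods. No parent class required.
class BeAtLeast
  def initialize(minimum)
    @minimum = minimum
  end

  def matches?(actual)
    @actual = actual
    @actual >= @minimum
  end

  def failure_message_for_should
    "expected #{@actual} to be at least #{@minimum}"
  end

  def failure_message_for_should_not
    "expected #{@actual} to be less than #{@minimum}"
  end
end

# Hypothetical stand-in for "actual.should matcher":
# returns :pass, or the message to report on failure.
def check_should(actual, matcher)
  matcher.matches?(actual) ? :pass : matcher.failure_message_for_should
end

# Hypothetical stand-in for "actual.should_not matcher".
def check_should_not(actual, matcher)
  matcher.matches?(actual) ? matcher.failure_message_for_should_not : :pass
end

puts check_should(6, BeAtLeast.new(5)).inspect
puts check_should(3, BeAtLeast.new(5)).inspect
puts check_should_not(6, BeAtLeast.new(5)).inspect
```

The matcher stores actual in matches? precisely so the failure message methods can mention it afterwards.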

You can add as many other methods as these four will rely on. But you also get other benefits over the DSL. You can use inheritance, moving common methods up the chain so you only have to define them once, instead of in each matcher definition. You can also write setup/teardown code in your parent classes, make default arguments a breeze, and standardize any error handling. I do all of these in the matchers I created for the example app.
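
As a rough sketch of what that inheritance can look like, here is a hypothetical BaseMatcher in plain Ruby. It is not the ValidationMatcher from the example app; the class name, the :inclusive option and the match/description hooks are all assumptions made for illustration.

```ruby
# Hypothetical parent class: argument handling, default options and
# failure messages are written once and shared by every matcher.
class BaseMatcher
  def initialize(expected, options = {})
    @expected = expected
    @options  = default_options.merge(options)
  end

  # Subclasses override this to supply their own defaults.
  def default_options
    {}
  end

  def matches?(actual)
    @actual = actual
    match
  end

  def failure_message_for_should
    "expected #{@actual.inspect} to #{description}"
  end

  def failure_message_for_should_not
    "expected #{@actual.inspect} not to #{description}"
  end
end

# The subclass only supplies what is unique to it.
class BeAtLeast < BaseMatcher
  def default_options
    {:inclusive => true}
  end

  def match
    @options[:inclusive] ? @actual >= @expected : @actual > @expected
  end

  def description
    "be at least #{@expected}"
  end
end

def be_at_least(expected, options = {})
  BeAtLeast.new(expected, options)
end
```

With this shape, a second matcher costs three short methods instead of a full copy of the boilerplate.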

The bottom line is this: defining your own matcher classes directly really DRYs up your matchers, and that always makes life simpler. I think it’s the only way to go for serious and heavy RSpec users. It allows the class for my validate_presence_of matcher to be this short and sweet:

module DoubleBlindMatchers
  class ValidatePresenceOf < ValidationMatcher
    def default_options
      {:message => "can't be blank", :with => 'x'}
    end

    def match
      set_to @options[:with]
      @object.valid?
      check !@object.errors[@attribute].include?(@options[:message]), shouldnt_exist
      
      set_to nil
      check !@object.valid?, valid_when('nil')
      check @object.errors[@attribute].include?(@options[:message])
      
      set_to ""
      check !@object.valid?, valid_when("blank")
      check @object.errors[@attribute].include?(@options[:message])
    end
  end
  
  def validate_presence_of expected, options = {}
    ValidatePresenceOf.new expected, options
  end
end

And the Teacher spec, which grew considerably during our double-blind testing, now looks like this (in its entirety):

# spec/models/teacher_spec.rb

require 'spec_helper'

describe Teacher do
  it {should have_many :subjects}
  
  it {should validate_presence_of :name}
  it {should validate_length_of :name, :maximum => 50, :message => "must be 50 characters or less"}
  
  it {should validate_presence_of :salary}
  it {should validate_numericality_of :salary, :within => (20_000..100_000), :message => "must be between $20K and $100K"}
end

Summary

Now that you’ve seen my entire proposal for double-blind testing, let me know what you think. Be cruel if you must, it’s the only way I’ll learn. I’ll do my best to explain (not defend) my reasoning, and keep an open mind to changes.

I’ll also be publishing my double-blind matchers as a gem so you can add them to your project.

Double-Blind Test-Driven Development in Rails 3: Part 2

February 1, 2011

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

The last article in this series defined the concept of double-blind test-driven development, but didn’t get much into real-world examples. In this article, we’ll explore several such examples.

The Example Application

This article includes a sample app that you can download using the link above. Be sure to check out the tag “double_blind_tests” to see the code as it appears in this article. The next article will have a lot of refactoring. I limited my samples to the model layer, where 100% coverage is a very realistic goal, and where double-blind testing is likely to bring the greatest benefit.

I chose a simple high school scheduling app with teachers, the subjects they teach, students, and courses. In this case, I’m defining a course as a student’s participation in a subject. Teachers teach (i.e., have) many subjects. Students take (have) many subjects, via courses. The course record contains that student’s grade for the given subject.

The database constraints are intentionally strict, and most of the validations in the models ensure that these constraints are respected in the application layer. We don’t want the user seeing an error page because of bad data. Depending on the application, that can be worse than actually having bad data creep in.

Associations

Here’s an example of a has_many association:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  it "has many subjects" do
    teacher = Factory.create :teacher
    teacher.subjects.should be_empty

    subject = teacher.subjects.create Factory.attributes_for(:subject)
    teacher.subjects.should include(subject)
  end
end

In order to factor out our own assumptions, we have to ask what they are. The assumption is that the subject we add to the teacher’s subject list works because of the has_many relationship. So we’ll first test that teacher.subjects is, in fact, empty when we assume it would be. Then we’re free to test that adding a subject works as we expect.

Here’s a belongs_to association:

# excerpt from spec/models/subject_spec.rb

describe Subject do
  it "belongs_to a teacher" do
    teacher = Factory.create :teacher

    subject = Subject.new
    subject.teacher.should be_nil
    
    subject.teacher = teacher
    subject.teacher.should == teacher
  end
end

Again, we’re challenging the assumption that the association is nil by default, by testing against it before verifying that we can add a teacher. This tests that this is a true belongs_to association, and not simply an instance method. This is the kind of thing that can and will change over the life of an application.

Validations

Let’s test validates_presence_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "name" do
    it "is present" do
      error_message = "can't be blank"
      
      teacher = Teacher.new :name => 'Joe Example'
      teacher.valid?
      teacher.errors[:name].should_not include(error_message)

      teacher.name = nil
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)

      teacher.name = ''
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)
    end
  end
end

This example was actually explained in detail in the last article. Validate that the error doesn’t already exist before trying to trigger it. Don’t just test the default value when you create a blank object, test the likely possibilities. Refactor the error message to DRY up the test and add readability. And finally, test by modifying the object you already created (as little as possible) rather than creating a new object from scratch for each part of the test.

A more complex version is needed to validate the presence of an association:

# excerpt from spec/models/subject_spec.rb

describe Subject do
  describe "teacher" do
    it "is present" do
      error_message = "can't be blank"

      teacher = Factory.create(:teacher)
      subject = Factory.create(:subject, :teacher => teacher)
      subject.valid?
      subject.errors[:teacher].should_not include(error_message)
    
      subject.teacher = nil
      subject.should_not be_valid
      subject.errors[:teacher].should include(error_message)
    end
  end
end

While the test is more complex, the code to satisfy it is not:

# excerpt from app/models/subject.rb

validates_presence_of :teacher

Testing validates_length_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "name" do
    it "is at most 50 characters" do
      error_message = "must be 50 characters or less"
      
      teacher = Teacher.new :name => 'x' * 50
      teacher.valid?
      teacher.errors[:name].should_not include(error_message)
      
      teacher.name += 'x'
      teacher.should_not be_valid
      teacher.errors[:name].should include(error_message)
    end
  end
end

And here’s the model code that satisfies the test:

# excerpt from app/models/teacher.rb

validates_length_of :name, :maximum => 50, :message => "must be 50 characters or less"

While you can definitely start to see a pattern in validation testing, this introduces a new element. Instead of freshly setting the name attribute to be 51 characters long, we test the valid edge case first and then add *just* enough to make it invalid – one more character.

This does two things: it verifies that our edge case was as “edgy” as it could be, and it makes our test less brittle. If we wanted to change the test to allow up to 100 characters, we’d only have to modify the test name and the initial set value.
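
Here is the same edge-then-nudge idea stripped down to plain Ruby, outside of Rails. The name_errors method and the MAX_NAME_LENGTH constant are hypothetical stand-ins for the model’s validation; the point is only that the limit appears in one place and the invalid value is derived from the valid one.

```ruby
# The only value that has to change if the limit moves.
MAX_NAME_LENGTH = 50

# Hypothetical stand-in for the model's length validation:
# returns the list of error messages for a given name.
def name_errors(name)
  if name.length > MAX_NAME_LENGTH
    ["must be #{MAX_NAME_LENGTH} characters or less"]
  else
    []
  end
end

# Start at the longest value that should still be valid...
name = 'x' * MAX_NAME_LENGTH
errors_at_edge = name_errors(name)

# ...then add just enough to cross the line: one more character.
name += 'x'
errors_past_edge = name_errors(name)

puts errors_at_edge.inspect
puts errors_past_edge.inspect
```

If the first check ever starts failing, the edge wasn’t where we thought it was, which is exactly the assumption the double-blind style is meant to expose.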

Validating a number’s range using validates_numericality_of:

# excerpt from spec/models/teacher_spec.rb

describe Teacher do
  describe "salary" do
    it "is at or above $20K" do
      error_message = "must be between $20K and $100K"
      
      teacher = Teacher.new :salary => 20_000
      teacher.valid?
      teacher.errors[:salary].should_not include(error_message)

      teacher.salary -= 0.01
      teacher.should_not be_valid
      teacher.errors[:salary].should include(error_message)
    end

    it "is no more than $100K" do
      error_message = "must be between $20K and $100K"

      teacher = Teacher.new :salary => 100_000
      teacher.valid?
      teacher.errors[:salary].should_not include(error_message)
      
      teacher.salary += 0.01
      teacher.should_not be_valid
      teacher.errors[:salary].should include(error_message)
    end
  end
end

And here’s the code that satisfies the test:

# excerpt from app/models/teacher.rb

validates_numericality_of :salary, :message => "must be between $20K and $100K",
  :greater_than_or_equal_to => 20_000, :less_than_or_equal_to => 100_000

We’re doing the same here as in our testing of name’s length. We’re setting the edge value that’s *just* within the allowed range, then adding or subtracting a penny to make it invalid. I split up the top and bottom edge tests, because it’s better to test as atomically as possible – one limit per test.

Defaults

Another tricky database constraint to test for is a default value:

# excerpt from spec/models/course_spec.rb

describe Course do
  describe "grade_percentage" do
    it "defaults to 1.0" do
      course = Course.new :grade_percentage => nil
      course.grade_percentage.should be_nil
      
      course = Course.new :grade_percentage => ''
      course.grade_percentage.should be_blank
      
      course = Course.new :grade_percentage => 0.95
      course.grade_percentage.should == 0.95
      
      course = Course.new
      course.grade_percentage.should == 1.0
    end
  end
end

In this case, we can’t avoid having to recreate the model from scratch, because of the nature of the implementation. There’s no actual code in the model that makes this happen, it’s purely in the database schema. Why should we test it, then? Because we test any behavior we’re going to rely on in the application. The fact that this model behavior is implemented at the database level (and therefore, not purely TDD) is a small inconvenience.

What’s the assumption our double-blind test is verifying in this case? That the value is only set in the absence of other values being explicitly assigned. Testing with nil and blank values verifies that the default doesn’t override them – it only works in the complete absence of any assignment. I also test an arbitrary (but valid) value as the anti-assumption test before finally verifying that the default is set to the correct value.

Most default tests verify only that the correct default value is set – the double-blind version verifies that it’s acting only as a default value in all cases.
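
To make the “only in the complete absence of any assignment” distinction concrete, here is a small plain-Ruby sketch. This Course class is a hypothetical stand-in, not the ActiveRecord model from the sample app (where the default lives in the schema); it just mimics the behavior the test above expects.

```ruby
# Hypothetical plain-Ruby stand-in for a model whose column
# has a default of 1.0.
class Course
  DEFAULTS = {:grade_percentage => 1.0}

  attr_reader :grade_percentage

  def initialize(attributes = {})
    # fetch falls back to the default only when the key is absent,
    # so an explicitly assigned nil or '' is kept as given.
    @grade_percentage = attributes.fetch(:grade_percentage,
                                         DEFAULTS[:grade_percentage])
  end
end

puts Course.new(:grade_percentage => nil).grade_percentage.inspect
puts Course.new(:grade_percentage => 0.95).grade_percentage.inspect
puts Course.new.grade_percentage.inspect
```

Had the sketch used something like attributes[:grade_percentage] || 1.0 instead, an explicit nil would be silently replaced by the default. A test that only checks Course.new would never notice the difference.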

Summary

The point of double-blind testing is bullet-proof tests, that can’t be reasonably thwarted by antagonistic coding – whether that’s your anti-social pairing partner, or yourself several months down the road. The bottom line is this: test all assumptions.

That being said, this is very time consuming, and we can see a ton of repetition even in this small test suite. What we need is a way to get back to speedy testing before our boss/client notices it now takes an hour to implement one validation.*

*Even if you work for a government owned/regulated institution that actually digs that kind of non-agile perversion, you WILL eventually go insane. Even in this small sample app, the voices in my head had to talk me off a building ledge twice.

The answer lies in RSpec matchers, which are easy to implement, and can grow with your application. The benefit is not just speedier development – it’s also consistency across your application. We’ll explore that in the last article of this series.

Double-Blind Test-Driven Development in Rails 3: Part 1

January 31, 2011

This is a three-part series introducing the concept of double-blind test-driven development in Rails. This post defines the concept itself, and lays the groundwork by showing the way tests are more commonly written. The next couple of posts will show how to double-blind test various common Rails elements, and how to make this added layer of protection automatic and quick.

  1. Simple Tests
  2. Double-Blind Tests
  3. Making it Practical with RSpec Matchers

Looking at a Rails application that was built with test-driven development, you might expect to see something like this:

# spec/models/teacher_spec.rb

describe Teacher do
  it "has many subjects" do
    teacher = Factory.create :teacher
    subject = teacher.subjects.create Factory.attributes_for(:subject)

    teacher.subjects.should include(subject)
  end
  
  describe "name" do
    it "is present" do
      teacher = Teacher.new

      teacher.should_not be_valid
      teacher.errors[:name].should include("can't be blank")
    end
    
    it "is at most 50 characters" do
      teacher = Teacher.new :name => 'x' * 51
      
      teacher.should_not be_valid
      teacher.errors[:name].should include("must be 50 characters or less")
    end
  end
end

Truth be told, if you’re seeing this in the wild the app is probably doing pretty well. This level of testing works great during the early stages of an app, when things are simple. But as things grow and/or multiple developers become involved, you need more.

Consider models where the associations and validations stretch into the dozens of lines. The more careful and specific you are about validations, the easier it is to get conflicting or overlapping validations. I actually came up with the concept of double-blind testing while retro-testing models in a client app that previously had no validation specs.

What is Double-Blind Testing?

In the world of scientific studies, you always need a control group. One set of participants gets the latest and greatest new diet pill, while the other gets a placebo. Researchers used to think this was good enough, and probably pretty funny to watch the placebo users rave about their shrinking waistlines. But it turns out studies like this still allowed some bias – as researchers observed the effects, their *own* preconceived notions tainted results. Enter the double-blind study.

In a double-blind study, the researchers themselves are unaware of which participants are in the control group, and which are being tested. Both sides are “blind”. They may have lost funny patient anecdotes, but they gained research reliability.

Applying the Lessons of Double-Blind Studies to Test-Driven Development

As I said, in the early stages of an app the tests I showed above work great, as long as you’re using TDD and the red-green-refactor cycle. This means you write the test, run it, and it fails. Then you write the simplest code that will make the test pass, run the test again, and confirm that it passes. Most testing tools will literally show red or green as you do this. Then, as you start to amass tests, you’re free to refactor your code (abstracting common code into helper methods, changing for readability, etc) and run the tests again at any time. You will see failures if you broke anything. If not, you’ve more or less guaranteed your code refactoring works properly.

The problem comes in when you start changing old code, or adding tests to code that didn’t initially have any. What I’m calling double-blind testing is this:

each test needs to verify the object’s behavior before testing what changes.

As an example, let’s rewrite one of the tests from above:

# original test

describe "name" do
  it "is present" do
    teacher = Teacher.new

    teacher.should_not be_valid
    teacher.errors[:name].should include("can't be blank")
  end
end

# modified to be double-blind

describe "name" do
  it "is present" do
    error_message = "can't be blank"

    teacher = Teacher.new :name => 'Joe Example'
    teacher.valid?
    teacher.errors[:name].should_not include(error_message)

    teacher.name = nil
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)

    teacher.name = ""
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)
  end
end

This is the basic pattern for all double-blind testing. We’re not leaving anything to chance. In the original version, we expected our object to be invalid, we treated it as such, and we got the result we expected. Do you see the problem with this?

Here’s an exercise: can you make the original test pass, even though the object validation is not working correctly? There’s actually a style of pair programming that routinely does exactly this. One developer writes the test, and the other writes just enough code to make it pass, with the good-natured intention of tripping up the first developer whenever possible. If you wrote the original test, I could satisfy it by just adding the error message to every record on validation, regardless of whether it’s true! Your test would pass, but the app would fail.
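
Here is that sabotage as a runnable plain-Ruby sketch. SabotagedTeacher is a hypothetical stand-in for the model (no Rails involved), written the way an antagonistic pairing partner might write it: the error gets added on every validation, no matter what.

```ruby
# Hypothetical stand-in for a sabotaged model: it reports the error
# on every validation, whether or not the name is actually blank.
class SabotagedTeacher
  attr_accessor :name
  attr_reader :errors

  def initialize(attributes = {})
    @name   = attributes[:name]
    @errors = Hash.new { |hash, key| hash[key] = [] }
  end

  def valid?
    @errors.clear
    @errors[:name] << "can't be blank"   # always added, name or no name
    @errors.values.all? { |messages| messages.empty? }
  end
end

error_message = "can't be blank"

# Original-style check: the object is invalid and the error is there,
# so the test passes even though validation is broken.
blank = SabotagedTeacher.new
naive_passes = !blank.valid? && blank.errors[:name].include?(error_message)

# Double-blind check: the very first step expects the error to be
# absent for a named teacher, and that is where the sabotage shows up.
named = SabotagedTeacher.new(:name => 'Joe Example')
named.valid?
double_blind_passes = !named.errors[:name].include?(error_message)

puts "naive test passes: #{naive_passes}"
puts "double-blind first step passes: #{double_blind_passes}"
```

The naive check is satisfied by the broken code, while the double-blind version fails on its first assertion.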

The test is now “double-blind” in the sense that we as testers have factored out our own expectations from the test. In this case, we expect the error message to not be there until we initialize the object a certain way, and this can be bad. It may sound far-fetched or paranoid*, but in large codebases your original tests are often abused in this very way. The “you” that writes new code today is often at odds with the “you” from three months ago that wrote the older code with a different understanding of the problem at hand.

*Plus, everybody knows it’s not paranoia when the world really is out to get you. I’ve discussed this at length with the voices in my head, and they all agree. Except Javier. That guy’s a jerk.

Now that I’ve laid out the justification, let’s take a closer look at how the test changed. The first thing I did was create a version of the object that I believe should NOT trigger the error message. Then I run through two cases that should. You can see right away, I was forced to be more *specific* about what should trigger an error. Instead of just a blank object with no values set, I’ve proactively set the attribute in question to both nil and blank. A key element here is to try to work with the *same* object, modifying between tests, rather than creating a new object each time. My test wouldn’t have been as specific if I’d just recreated a blank Teacher object and run a single validation check.

Also, with the increased code comes the increased chance of typos. We don’t want to DRY test code up too much, because a good rule is to keep your tests as readable (non-abstract) as possible. But I’ve specified the error message at the top of the test, and reused that string over and over. I did this in a way that DRYs the code and adds readability. You can see at a glance that all three tests are checking for the same error.

Finally, the first time I run the object’s validation, notice I’m not asserting that it should be valid. If I had written teacher.should be_valid on line 8 of the double-blind test, I’d have to take the extra time to make sure every other part of the object was valid. Not only is this time-consuming, it’s very brittle. Any future validations would break this test.

If you use factories often, you may suggest setting it up that way since a factory-generated object should always be valid. Then you could assert validity. However, this only slows down your test suite. It’s enough just to run valid? on the object, which triggers all the validation checks to load up our errors hash.

Summary

I believe this is a new concept – I was already coding most of my tests this way, but it didn’t dawn on me how valuable it was until I started retro-testing previously testless code. The value showed itself right away.

I would love to hear feedback on this – if you think it’s unnecessary (I tend to be very rainman-ish about my testing code) or even detrimental. However, if you think it’s too much work, I ask you to hold your criticism until you’ve read part 3 of this article, where I show how to use your own RSpec matchers to greatly speed this process.

Legacy Database Column Names in Rails 3

January 28, 2011

If you work with legacy databases, you don’t always have the option of changing column names when something conflicts with Ruby or Rails. A very common example is having a column named “class” in one of your tables. Rails *really* doesn’t like this, and like the wife or girlfriend who really hates your new haircut, it will complain at every possible opportunity:

# trying to set the poorly named attribute
ruby-1.9.2-p0 > u = User.new :class => '1995'
NoMethodError: undefined method `columns_hash' for nil:NilClass
# trying to set a different attribute that is only guilty by association
ruby-1.9.2-p0 > u = User.new :name
NoMethodError: undefined method `has_key?' for nil:NilClass
# trying to set the attribute later in the game
ruby-1.9.2-p0 > u = User.new
 => #<User id: nil, name: nil, class: nil, created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class = '1995'
NoMethodError: undefined method `private_method_defined?' for nil:NilClass

Like the aforementioned wife/girlfriend, you’re not going anywhere until this issue is resolved. Luckily, Brian Jones has solved this problem for us with his gem safe_attributes. Rails automatically creates accessors (getter and setter methods) for every attribute in an ActiveRecord model’s table. Trying to override crucial methods like “class” is what gets us into trouble. The safe_attributes gem turns off the creation of any dangerously named attributes.

Just do this:

# app/models/user.rb
class User < ActiveRecord::Base
  bad_attribute_names :class
end

After adding the gem to your Gemfile, pass bad_attribute_names the list of offending column names, and it will keep Rails from trying to generate accessor methods for them. Now, this does come with a caveat: you don’t have those accessors. Let’s try to get/set our :class attribute:

ruby-1.9.2-p0 > u = User.new
 => #<User id: nil, name: nil, class: nil, created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class = '1995'
 => "1995" 
ruby-1.9.2-p0 > u
 => #<User id: nil, name: nil, class: "1995", created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > u.class
 => User(id: integer, name: string, class: string, created_at: datetime, updated_at: datetime) 

The setter still works (I’m guessing that it was still created because there wasn’t a pre-existing “class=” method) and we can verify that the object’s attribute has been properly set. But calling the getter defaults to…well, the default behavior.

The answer is to always use this attribute in the context of a hash. You can send the object a hash of attribute names/values, and that works. This means your controller creating and updating won’t have to change. Methods like new, create, update_attribute, update_attributes, etc will work fine.

If you want to just set the single value (to prevent an immediate save, for example) do it like this:

ruby-1.9.2-p0 > u[:class] = '1996'
 => "1996" 
ruby-1.9.2-p0 > u
 => #<User id: nil, name: nil, class: "1996", created_at: nil, updated_at: nil> 

Basically, you can still set the attribute directly, instead of going through the Rails-generated accessors. But we’re still one step away from a complete solution. We want to be able to treat this attribute like any other, and that requires giving it a benign set of accessors (getter and setter methods). One reason to do this is so we can use standard validations on this attribute.

Adding accessors to our model is this simple:

# add to app/models/user.rb

def class_name= value
  self[:class] = value
end
  
def class_name
  self[:class]
end

We’re calling the accessors “class_name”, and now we can use that everywhere instead of the original attribute name. We can use it in forms:

# example, not found in code

<%= f.text_field :class_name %>

Or in validations:

# add to app/models/user.rb

validates_presence_of :class_name

Or when creating a new object:

# example, not found in code

User.create :class_name => 'class of 1995'

If you download the code, these additions are test-driven, meaning I wrote the tests for those methods before writing the methods themselves, to be sure they worked properly. I encourage you to do the same.
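
If it helps to see the idea without Rails in the way, here is a plain-Ruby sketch of why hash-style access plus renamed accessors sidesteps the collision. LegacyUser is a hypothetical stand-in, not ActiveRecord and not the safe_attributes gem; it only mimics the attribute hash.

```ruby
# Hypothetical plain-Ruby stand-in for a record with a column
# named "class".
class LegacyUser
  def initialize(attributes = {})
    @attributes = {}
    attributes.each { |key, value| self[key] = value }
  end

  # Hash-style access never collides with Ruby's built-in methods,
  # because the column name is just a key, not a method name.
  def [](name)
    @attributes[name.to_s]
  end

  def []=(name, value)
    @attributes[name.to_s] = value
  end

  # Benign accessors that read and write the awkwardly named column.
  def class_name
    self[:class]
  end

  def class_name=(value)
    self[:class] = value
  end
end

user = LegacyUser.new(:class => '1995')
user.class_name = '1996'

puts user[:class].inspect     # the column's value
puts user.class.inspect       # Ruby's own Object#class, left untouched
```

The built-in class method keeps doing its normal job, and the column is still fully readable and writable under a name nothing else is using.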

Good luck!

Nested Comments in Ruby on Rails, part 2: Controllers and Views

January 26, 2011

  1. The Model Layer
  2. Controllers and Views
 

Part 1 of this series came out exactly 3 months and 3 days ago. Special thanks to a reader named Edward who prodded me to finally add the controllers and views to this.

Going beyond the model layer for nested comments introduces a new programming idiom: recursion. Some Ruby developers may not be familiar with it – especially if your experience is mostly web-related, where the need doesn’t come up as often. Recursion in a nutshell is the act of a method calling itself. If you’ve seen Inception, the ability to have dreams within dreams within dreams means those dreams are recursive. If you haven’t seen the movie, think of Russian matryoshka dolls. You won’t experience star-studded special effects with the dolls, but you’ll at least get the idea of recursion.

Unlike Russian dolls or most of Leo’s recent work, recursion in software is potentially infinite. Practically speaking though, it’s more like the doll thing. After all, a system only has so many resources, and recursion is expensive in this regard – every call gets its own stack frame, local variables and all. In our case, we’ll also be hitting the database at each layer. On the plus side, recursive code tends to be much shorter and easier to read than the equivalent loop when the data itself is nested. We’ll ignore the dangers in our simple app, though.

Routing

Let’s start with our routing file:

# config/routes.rb
NestedComments::Application.routes.draw do
  resources :comments do
    resources :comments
  end

  resources :posts do
    resources :comments
  end
  
  root :to => 'posts#index'
end

Working backward, we’re making our Posts controller’s index action our default route. That’s just to get the app functional. Next comes something interesting: nesting our comments inside of our posts. Interesting, but boring. Finally, the main event: nesting our comments within our comments!

Before you get too excited and start pulling out your Nana’s childhood Russian doll set for comparison, this isn’t true recursion. It’s well documented that nesting resources any more than two layers deep is painful and unnecessary, so think of this as the lamest Russian doll ever.

Controllers

First, our Posts controller, which is less exciting:

# app/controllers/posts_controller.rb
class PostsController < ApplicationController
  def index
    @posts = Post.all
  end

  def show
    @post = Post.find(params[:id])
  end

  def new
    @post = Post.new
  end
  
  def create
    @post = Post.new(params[:post])
    
    if @post.save
      redirect_to posts_path, :notice => "Your post was created successfully."
    else
      render :action => :new
    end
  end
end

We’re setting up a pretty standard RESTful resource here, with a couple of actions skipped for simplicity. Now the comments controller (get those dolls ready):

# app/controllers/comments_controller.rb
class CommentsController < ApplicationController
  before_filter :get_parent
  
  def new
    @comment = @parent.comments.build
  end

  def create
    @comment = @parent.comments.build(params[:comment])
    
    if @comment.save
      redirect_to post_path(@comment.post), :notice => 'Thank you for your comment!'
    else
      render :new
    end
  end

  protected
  
  def get_parent
    @parent = Post.find_by_id(params[:post_id]) if params[:post_id]
    @parent = Comment.find_by_id(params[:comment_id]) if params[:comment_id]
    
    # find_by_id returns nil when no record matches, so test the value itself
    redirect_to root_path unless @parent
  end
end

It’s not much bigger, but there’s a lot going on here! First, since comments are nested, we have to look for a parent. We’re only creating comments in this example, so we only have those related actions. Comments will always be shown on a post page.

The really exciting part is after a successful comment creation. How do we redirect back to the post page? For all we know, this comment could be buried down 12 layers of replies. All we really have access to so far is the parent of the object. This necessitates a new model method:

# excerpt from app/models/comment.rb
def post
  return @post if defined?(@post)
  @post = commentable.is_a?(Post) ? commentable : commentable.post
end

Recursive functions are often short and sweet for two reasons: they’re already complex by nature, and adding more code than necessary would make them unmanageable. Also, they’re getting a lot done in just a few lines. In this case, the second line is the key: if “commentable” (the parent object) is a post, return that. Otherwise, call this same method on the parent, which will in turn check if *it* is a Post, and so on.

I could have written it shorter, like this:

def post
  commentable.is_a?(Post) ? commentable : commentable.post
end

In fact, I did at first. But the extra code that checks and sets an instance variable is caching the result. This way, if we call the same method on an object more than once, it stores the result for future use. Remember, recursion can be expensive – especially when the database is involved.
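To see the caching at work without a database, here’s a rough sketch using plain Ruby objects. The Post and Reply classes and the lookups counter are hypothetical stand-ins I’m assuming for the demo, not the app’s real models.

```ruby
# Hypothetical plain-Ruby stand-ins for the Post and Comment models.
class Post
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Reply
  attr_reader :commentable, :lookups

  def initialize(commentable)
    @commentable = commentable
    @lookups = 0   # counts how often we actually walk up the chain
  end

  def post
    return @post if defined?(@post)   # cached result from an earlier call
    @lookups += 1
    @post = commentable.is_a?(Post) ? commentable : commentable.post
  end
end

post   = Post.new('Hello')
first  = Reply.new(post)
second = Reply.new(first)
third  = Reply.new(second)

third.post   # walks third -> second -> first -> post
third.post   # answered from the cache this time

puts third.post.title   # Hello
puts third.lookups      # 1
```

Every reply in the chain caches its own answer, so a later call on second or first is also free.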

Views

Finally, it’s view time, with one more bit of recursion for fun.

Our post views are standard scaffolding mostly, with the exception of the show view:

# app/views/posts/show.html.erb
<h1><%= @post.title %></h1>

<div class="body">
  <%= @post.body %>  
</div>

<h2>Comments</h2>

<p><%= link_to 'Add a Comment', new_post_comment_path(@post) %></p>

<ul class="comment_list">
  <%= render :partial => 'comments/comment', :collection => @post.comments %>
</ul>

Notice we have the partial app/views/comments/_comment.html.erb. We’re calling this for each of our post’s comments. Nothing too fancy here. Now, for the partial itself:

# app/views/comments/_comment.html.erb
<li class="comment">
  <h3><%= comment.title %></h3>

  <div class="body">
    <%= comment.body %>
  </div>
  
  <p><%= link_to 'Add a Reply', new_comment_comment_path(comment) %></p>
  
  <% unless comment.comments.empty? %>
    <ul class="comment_list">
      <%= render :partial => 'comments/comment', :collection => comment.comments %>
    </ul>
  <% end %>
</li>

This partial is recursive! The comments controller doesn’t have a show method, because we’re never going to view a comment by itself. Instead, the show-like code is in this partial, and at the end it checks to see if *this* comment has comments. If so, it calls the partial again on the whole collection. The end result is a nested, bulleted list of comments. This is not very sexy if you fire up the code yourself, but it’s a great starting point.
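The same shape can be sketched outside of ERB. This is only an illustration in plain Ruby, with a made-up Comment struct and render_comment method; the real app does this work in the partial.

```ruby
# Hypothetical stand-in for a comment with nested replies.
Comment = Struct.new(:title, :comments)

# Returns an array of text lines; each nesting level indents two more spaces.
def render_comment(comment, level = 0)
  lines = ["#{'  ' * level}* #{comment.title}"]
  comment.comments.each do |reply|
    lines.concat(render_comment(reply, level + 1))   # the recursive step
  end
  lines
end

thread = Comment.new('Great post', [
  Comment.new('Thanks!', [Comment.new('No problem', [])]),
  Comment.new('I disagree', [])
])

puts render_comment(thread)
# * Great post
#   * Thanks!
#     * No problem
#   * I disagree
```

Just like the partial, the method renders one comment and then hands each reply back to itself.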

Summary

Hopefully this article has done a good job of explaining both recursion, and how to use it to achieve nested comments in your applications. If you’re new to recursion as a concept, haven’t seen Inception, didn’t inherit Russian dolls from Nana or receive them as a snazzy graduation present, and my explanation somehow fell short, it’s a well documented programming idiom. There are tons of resources online, so take the time to learn this powerful tool, then learn not to overuse it :)

Please download the code and play with it if you want to learn more – the code is fully test-driven so you can see how that works, which is just as important.

On a final note, I’m tempted to do a follow-up article with ajax and some nicer formatting. Perhaps in 3 months and 3 days…

Dynamic Methods in Ruby with method_missing

December 21, 2010

Make it up as you go

One way Ruby is dynamic is that you can choose how to handle methods that are called, but don’t actually exist. If you have a lot of very similar methods, you can even use this to define them all at once! Ruby does this using the method_missing method, which you override in the classes where you need more dynamic method calling.

ActiveRecord’s dynamic find_all_by methods

Ruby on Rails uses method_missing with ActiveRecord’s find_all_by methods. There is no find_all_by_name method, but if your Person model has a name attribute, you can call Person.find_all_by_name('Bob') and it will return all the records that match that name.

Here’s a very simplified version of how Rails handles find_all_by requests:

class Person < ActiveRecord::Base
  def self.method_missing method_name, *args
    if method_name =~ /^find_all_by_(\w+)$/
      self.all(:conditions => {$1 => args[0]})
    end
  end
end

Using regular expressions, method_missing sees if the method name matches something we expect. It parses out the interesting parts, and uses them to look up the objects we’re searching for. This is a good use case, because the attributes of an ActiveRecord model aren’t known until runtime.
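If you want to try the pattern without a database, here’s a small sketch that runs in plain Ruby. PersonList and its in-memory array are assumptions for the demo; ActiveRecord’s real finders are far more involved.

```ruby
# Hypothetical in-memory "model"; real ActiveRecord finders hit the database.
class PersonList
  Person = Struct.new(:name, :age)

  def initialize(people)
    @people = people
  end

  def method_missing(method_name, *args, &block)
    if method_name.to_s =~ /\Afind_all_by_(\w+)\z/
      attribute = $1   # copy the capture before entering the block
      @people.select { |person| person.send(attribute) == args[0] }
    else
      super   # anything we don't recognise still raises NoMethodError
    end
  end
end

people = PersonList.new([
  PersonList::Person.new('Bob', 30),
  PersonList::Person.new('John', 29),
  PersonList::Person.new('Marsha', 30)
])

p people.find_all_by_age(30).map(&:name)   # ["Bob", "Marsha"]
```

Calling super in the else branch matters: without it, every misspelled method name would quietly return nil instead of raising an error.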

Dynamic methods for dynamic objects outside Rails

We can apply this same technique outside of Rails. Let’s create the world’s most dynamic Ruby class:

# lib/widget.rb
class Widget
  def method_missing sym, *args
    if sym =~ /^(\w+)=$/
      instance_variable_set "@#{$1}", args[0]
    else
      instance_variable_get "@#{sym}"
    end
  end
end

We’ve just created a Widget object that can have any attributes you want to give it. method_missing checks if the called method ends with an equal sign – if so, it assigns the value you passed to an instance variable with that name. If there’s no equal sign, it tries to get the value of an instance variable by that name:

ruby-1.9.2-p0 > widget = Widget.new
 => #<Widget:0x0000010383f618> 
ruby-1.9.2-p0 > widget.name = 'Bob'
 => "Bob" 
ruby-1.9.2-p0 > widget.age = 30
 => 30 
ruby-1.9.2-p0 > widget.name
 => "Bob" 
ruby-1.9.2-p0 > widget.age
 => 30 

Use method_missing with methods that use blocks

You can also pass blocks to method_missing. Say we have an ActiveRecord model called Person, with name and age attributes. Let’s create something similar to find_all_by that gets the list of matching people, and runs them through the map method. We’ll call it map_by:

# app/models/person.rb
class Person < ActiveRecord::Base
  def self.method_missing method_name, *args, &block
    if method_name =~ /^map_by_(\w+)$/
      list = self.all(:conditions => {$1 => args[0]})
      list.map(&block)
    end
  end
end

If a method is called that can’t be found, method_missing will check to see if it matches our map_by pattern, perform an ActiveRecord search, and push the results through map with the block we supplied.

Now let’s see if it works, by grabbing the names of all people in our database age 30:

ruby-1.9.2-p0 > Person.create :name => 'Bob', :age => 30
 => #<Person id: 2, name: "Bob", age: 30, created_at: "2010-12-21 02:23:57", updated_at: "2010-12-21 02:23:57"> 
ruby-1.9.2-p0 > Person.create :name => 'John', :age => 29
 => #<Person id: 3, name: "John", age: 29, created_at: "2010-12-21 02:24:11", updated_at: "2010-12-21 02:24:11"> 
ruby-1.9.2-p0 > Person.create :name => 'Marsha', :age => 30
 => #<Person id: 4, name: "Marsha", age: 30, created_at: "2010-12-21 02:24:22", updated_at: "2010-12-21 02:24:22"> 
ruby-1.9.2-p0 > Person.map_by_age(30){|person| person.name}
 => ["Bob", "Marsha"] 

It works! Now I’m going to refactor the Person class to make it easier to add more dynamic methods in the future. I’ll even add an each_by handler so we can see multiple dynamic methods in action:

# app/models/person.rb
class Person < ActiveRecord::Base
  class << self
    def method_missing method_name, *args, &block
      case method_name
      when /^map_by_(\w+)$/ then map_by $1, args[0], &block
      when /^each_by_(\w+)$/ then each_by $1, args[0], &block
      else super method_name, *args, &block
      end
    end
    
    def map_by attribute, value, &block
      list = self.all(:conditions => {attribute => value})
      list.map(&block)
    end

    def each_by attribute, value, &block
      list = self.all(:conditions => {attribute => value})
      list.each(&block)
    end
  end
end

I’ve done a few things. First, I changed our “if” conditional to a case statement, so that we can add to it in the future, and it will be clean and readable. I also moved the actual map_by code into its own method, for the same reason. And now, method_missing calls its parent method if it doesn’t find a match, to preserve inheritance.

You might also notice that instead of defining self.method_missing and self.map_by, I’ve wrapped these method definitions in a class << self block that essentially does the same thing. I think this is cleaner when you have several class methods.

method_missing can be used in any Ruby class, so long as you can anticipate dynamic methods that the users of your class might need, and preserve the chain of inheritance. This should be used sparingly, when you can cut down on method definitions by defining them dynamically. It’s easy to abuse this, and there is extra overhead involved. But for the right situations, method_missing can create shorter, more readable code.
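One companion worth knowing about is respond_to_missing?, which Ruby (1.9.2 and later) consults so that respond_to? stays honest about your dynamic methods. Here’s a minimal sketch; the Shouter class and its _loudly naming scheme are invented for the example.

```ruby
# Hypothetical class: any name ending in "_loudly" is handled dynamically.
class Shouter
  PATTERN = /\A(\w+)_loudly\z/

  def method_missing(method_name, *args, &block)
    if method_name.to_s =~ PATTERN
      "#{$1.upcase}!"
    else
      super   # preserve the normal NoMethodError behaviour
    end
  end

  # Keeps respond_to? truthful about the dynamic names.
  def respond_to_missing?(method_name, include_private = false)
    !!(method_name.to_s =~ PATTERN) || super
  end
end

shouter = Shouter.new
puts shouter.hello_loudly                  # HELLO!
puts shouter.respond_to?(:hello_loudly)    # true
puts shouter.respond_to?(:whisper)         # false
```

Both methods share one pattern, so what the object claims to respond to and what it actually handles can’t drift apart.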

Memoize Techniques in Ruby and Rails

December 13, 2010

The headline isn’t a typo. If you haven’t heard of “memoizing”, it’s the act of caching the result of a method so that when you call the method in the future, it doesn’t have to do all the processing again. I’ll show you a few different ways to do this, along with the pros and cons of each.

Setup

Let’s set up a sample Rails app to play with. I’m using Rails 3, and we’ll generate a simple model to work with:

memoize$ rails g model user first_name:string last_name:string
      invoke  active_record
      create    db/migrate/20101204003605_create_users.rb
      create    app/models/user.rb
      invoke    test_unit
      create      test/unit/user_test.rb
      create      test/fixtures/users.yml
memoize$ rake db:migrate
(in /Users/bellmyer/Desktop/bellmyer/blog/memoize)
==  CreateUsers: migrating ====================================================
-- create_table(:users)
   -> 0.0013s
==  CreateUsers: migrated (0.0014s) ===========================================

The Standard Idiom

You’ve probably seen the simplest form of memoization, whether you called it that or not. Our user model has a first_name and last_name. Let’s say we want to create a full_name method to combine the two:

class User < ActiveRecord::Base
  def full_name
    "#{first_name} #{last_name}"
  end
end

Let’s verify that it works as we expect:

memoize$ rails c
Loading development environment (Rails 3.0.1)
ruby-1.9.2-p0 > user = User.new :first_name => 'Bob', :last_name => 'Smith'
 => #<User id: nil, first_name: "Bob", last_name: "Smith", created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > user.full_name
 => "Bob Smith" 

Great! Now let’s try the simplest form of memoization:

class User < ActiveRecord::Base
  def full_name
    @full_name ||= "#{first_name} #{last_name}"
  end
end

The first time this method is called, @full_name doesn’t exist yet, so the code to the right of ||= is executed. The next time this method is called, the result of the method has already been stored in @full_name, so the method doesn’t need to recalculate it.
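You can watch this happen in plain Ruby, no Rails required. In this sketch the Customer class and the calculations counter are stand-ins I’ve added so we can count how often the string actually gets built.

```ruby
# Hypothetical plain-Ruby class standing in for the User model.
class Customer
  attr_reader :calculations

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
    @calculations = 0
  end

  def full_name
    @full_name ||= begin
      @calculations += 1   # only runs when nothing is cached yet
      "#{@first_name} #{@last_name}"
    end
  end
end

customer = Customer.new('Bob', 'Smith')
3.times { customer.full_name }

puts customer.full_name      # Bob Smith
puts customer.calculations   # 1
```

Four calls in total, but the work on the right of ||= only happened once.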

A Better Way

This quick-and-dirty method works well for a lot of stuff, but it doesn’t work in all cases. Let’s add another memoized method, and see if you can find the logical flaw:

  def has_full_name?
    @has_full_name ||= (!first_name.blank? && !last_name.blank?)
  end

This method checks to see if the user has both a first and last name. At first, it looks like it will work as well as our first method. But what if the result is false? @has_full_name will be set to false, and since ||= treats false the same as nil, the right side of the assignment will be run from scratch each time.

Instead of checking if @has_full_name equates to true or false, we need to check if @has_full_name has been defined, like so:

  def has_full_name?
    return @has_full_name if defined?(@has_full_name)
    @has_full_name = (!first_name.blank? && !last_name.blank?)
  end

Now we’re returning @has_full_name if it exists, and evaluating it otherwise. No more true/false gotchas. This method is more reliable, but it’s not as short and sweet, I’ll admit.
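Here’s a side-by-side sketch of the two approaches when the answer is false. The Flag class and its run counters are hypothetical, added only to make the difference measurable.

```ruby
# Hypothetical class comparing the two memoization styles on a false result.
class Flag
  attr_reader :naive_runs, :safe_runs

  def initialize
    @naive_runs = 0
    @safe_runs = 0
  end

  # ||= treats a cached false like "nothing cached" and recomputes.
  def naive?
    @naive ||= begin
      @naive_runs += 1
      false
    end
  end

  # defined? only asks whether the variable was ever assigned.
  def safe?
    return @safe if defined?(@safe)
    @safe_runs += 1
    @safe = false
  end
end

flag = Flag.new
3.times do
  flag.naive?
  flag.safe?
end

puts flag.naive_runs   # 3
puts flag.safe_runs    # 1
```

The same gotcha applies to any method that can legitimately return nil.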

Memoization with Method Arguments

What if we have a method with arguments? We usually want the same input to produce the same output. If we enter the same arguments to a method over and over, we’d expect the same return value. So why not take our memoization a step further, and add caching for methods with arguments?

  def formal_name(salutation='Mr.', suffix=nil)
    @formal_name ||= {}
    return @formal_name[[salutation, suffix]] if @formal_name.has_key?([salutation, suffix])

    @formal_name[[salutation, suffix]] = "#{salutation} #{full_name} #{suffix}"
  end

This is a bit more complicated. Since this method can be called with different arguments, we initialize a hash to store our cached results, keyed by the array of arguments. We use Hash’s has_key? method to check if we already have a value for the given arguments. Let’s try it out:

memoize$ rails c
Loading development environment (Rails 3.0.1)
ruby-1.9.2-p0 > user = User.new :first_name => 'Bob', :last_name => 'Smith'
 => #<User id: nil, first_name: "Bob", last_name: "Smith", created_at: nil, updated_at: nil> 
ruby-1.9.2-p0 > user.formal_name('Mr.', 'Jr.')
 => "Mr. Bob Smith Jr." 
ruby-1.9.2-p0 > user.first_name = 'John'
 => "John" 
ruby-1.9.2-p0 > user.formal_name('Mr.', 'Jr.')
 => "Mr. Bob Smith Jr." 

You know the memoization is working, because I changed the first name, and formal_name still gave us the same answer.
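A variation on the same idea is to let the hash do the bookkeeping with a default block, which Ruby only runs for keys it hasn’t seen yet. This is a sketch under my own assumptions: the Greeter class and the calculations counter are invented, and I join the parts with compact so a missing suffix doesn’t leave a trailing space.

```ruby
# Hypothetical class; mirrors formal_name but caches via Hash's default block.
class Greeter
  attr_reader :calculations

  def initialize(full_name)
    @full_name = full_name
    @calculations = 0
  end

  def formal_name(salutation = 'Mr.', suffix = nil)
    # The block only runs for argument combinations not yet in the hash.
    @formal_name ||= Hash.new do |cache, key|
      @calculations += 1
      sal, suf = key
      cache[key] = [sal, @full_name, suf].compact.join(' ')
    end
    @formal_name[[salutation, suffix]]
  end
end

greeter = Greeter.new('Bob Smith')
puts greeter.formal_name('Mr.', 'Jr.')   # Mr. Bob Smith Jr.
puts greeter.formal_name('Mr.', 'Jr.')   # Mr. Bob Smith Jr. (from the cache)
puts greeter.formal_name('Dr.')          # Dr. Bob Smith
puts greeter.calculations                # 2
```

Because the block stores its result under the key, even a nil or false result would be cached properly.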

Using Rails’ Memoize Module

Our “better way” is bulletproof, and we’ve even added the ability to handle method arguments. But it’s a lot of work to do this for every method, and it gunks up the readability of the method. We have to wade through the caching code to figure out what the method really does.

If you’re using Rails 2.2 or later, you can take advantage of its Memoizable module to clean this up. Let’s add a method that uses it:

class User < ActiveRecord::Base
  extend ActiveSupport::Memoizable

  # other methods...
  
  def initials(middle_initial)
    first_name[0] + middle_initial + last_name[0]
  end
  memoize :initials
end

It’s that easy! The memoize method takes care of things for you, but you have to extend your class with the module for it to work. And unlike the other memoization strategies discussed, this uses ActiveSupport, so it’s Rails-specific.

The good news is that if you’re using Rails for your project, you can use Memoizable anywhere. Let’s add a non-activerecord class in our lib folder:

# lib/my_number.rb
class MyNumber
  extend ActiveSupport::Memoizable
  
  attr_accessor :x
  
  def initialize(x)
    @x = x
  end
  
  def plus(y)
    @x + y
  end
  memoize :plus
end

And here’s the proof that it works:

memoize$ rails c
Loading development environment (Rails 3.0.1)
ruby-1.9.2-p0 > require 'my_number'
 => ["MyNumber"] 
ruby-1.9.2-p0 > num = MyNumber.new(5)
 => #<MyNumber:0x00000104434bf8 @x=5> 
ruby-1.9.2-p0 > num.plus 2
 => 7 
ruby-1.9.2-p0 > num.x = 0
 => 0 
ruby-1.9.2-p0 > num.plus 2
 => 7 

This is a much better solution if you’re using Rails, and I highly recommend it. The less code you type by hand, the less chance of adding a bug somewhere.
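If you’re curious how a memoize helper like this could work, here’s a rough sketch of the general technique in plain Ruby. To be clear, SimpleMemoize is a made-up module and this is not how ActiveSupport implements it; it just wraps the original method and caches results per argument list.

```ruby
# Hypothetical module, NOT ActiveSupport's implementation.
module SimpleMemoize
  def memoize(method_name)
    original = instance_method(method_name)   # grab the unwrapped method

    # Replace the method with a wrapper that checks a per-object cache first.
    define_method(method_name) do |*args|
      @_memo ||= {}
      cache = (@_memo[method_name] ||= {})
      cache[args] = original.bind(self).call(*args) unless cache.has_key?(args)
      cache[args]
    end
  end
end

class Adder
  extend SimpleMemoize

  attr_accessor :x

  def initialize(x)
    @x = x
  end

  def plus(y)
    @x + y
  end
  memoize :plus
end

adder = Adder.new(5)
puts adder.plus(2)   # 7
adder.x = 0
puts adder.plus(2)   # 7 (cached, even though x changed)
puts adder.plus(3)   # 3 (new argument, computed with the new x)
```

Using has_key? instead of ||= on the inner cache means false and nil results get cached too.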

Things You Should Know

First, as mentioned above, the Memoizable module is only available in ActiveSupport, part of Rails. But the previous examples are pure Ruby, and can be used in any Ruby project.

You don’t want to use memoization everywhere. Here are some situations where it’s not appropriate:

  • The method is simple, and memoizing it won’t save you much (there is some overhead involved).
  • The method’s output needs to change over the life of its object.
  • The method is unlikely to be called again with the same parameters.
  • There are so many combinations of parameters that will be called, it will eat up too much memory to store all the results.

Ruby Enumerable Magic: Aggregates

December 10, 2010

  1. The Basics
  2. Unary Ampersand Operator
  3. Booleans
  4. Filters
  5. New Collections
  6. Aggregates
 

This final article in my series about the Enumerable module details the methods that don’t quite fit elsewhere, but deal with the collection as a whole. As such, they each return just one object, not an array.

inject

If you haven’t used this method before, it’s loads of fun. It’s a cumulative method that uses each item in the collection to form a “final answer”. The classic example is finding the sum of the numbers in an array:

irb(main):001:0> [1, 2, 3].inject{|sum, num| sum += num}
=> 6

Another use might be finding the initials in a name:

irb(main):008:0> ['Jaime', 'Lee', 'Bellmyer'].inject(''){|initials, name| initials += name[0,1]}
=> "JLB"

In this example, I had to pass the starting string ('', an empty string) to inject. Otherwise, it takes the first element as a whole, and starts adding onto it. I don’t like this behavior, and I suspect it works this way because inject assumes you’re trying to sum elements in some way. You also need to pass an initial value to inject when you want the result to be a different class than the inputs:

irb(main):009:0> [1, 2, 3].inject(0.0){|sum, i| sum += i.to_f}
=> 6.0
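Two more things are worth a quick sketch. First, the block’s return value is what becomes the next accumulator, so sum + num works just as well as sum += num. Second, inject accepts a symbol naming the method to combine with, and the starting value can be any object, such as a hash. The variable names below are just for illustration.

```ruby
# Symbol form: combine the elements with the named method.
total = [1, 2, 3].inject(:+)

# Block form with an explicit starting value (an empty hash here).
# The block must return the accumulator for the next element.
lengths = ['joshua', 'gabriel', 'jacob'].inject({}) do |memo, name|
  memo[name] = name.length
  memo
end

puts total                # 6
puts lengths['gabriel']   # 7
```

Forgetting to return memo on the last line of the block is the classic inject mistake, since the assignment would return a number instead of the hash.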

min and max

These methods behave largely like you’d expect. The items in the collection have to have the <=> method defined, just like the sorting methods, since finding the min and max means comparing items against each other. So strings and numbers behave like you’d expect:

irb(main):010:0> ['joshua', 'gabriel', 'jacob'].min
=> "gabriel"
irb(main):011:0> ['joshua', 'gabriel', 'jacob'].max
=> "joshua"
irb(main):012:0> [1,2,3].min
=> 1
irb(main):013:0> [1,2,3].max
=> 3

And you can also pass your own custom comparison block, like you can with the sort method:

irb(main):014:0> ['joshua', 'gabriel', 'jacob'].min{|a,b| a.reverse <=> b.reverse}
=> "joshua"
irb(main):015:0> ['joshua', 'gabriel', 'jacob'].max{|a,b| a.reverse <=> b.reverse}
=> "gabriel"
irb(main):016:0> [1,2,3].min{|a,b| a*(-1) <=> b*(-1)}
=> 3
irb(main):017:0> [1,2,3].max{|a,b| a*(-1) <=> b*(-1)}
=> 1
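When the comparison boils down to a single value per item, Enumerable’s min_by and max_by are usually tidier than a two-argument block, and minmax fetches both ends in one call. A short sketch with made-up variable names:

```ruby
names = ['joshua', 'gabriel', 'jacob']

# Compare by one derived value instead of writing a <=> block.
shortest = names.min_by { |name| name.length }
longest  = names.max_by { |name| name.length }

# minmax returns both ends as a two-element array.
smallest, largest = [3, 1, 5].minmax

puts shortest   # jacob
puts longest    # gabriel
puts smallest   # 1
puts largest    # 5
```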

In conclusion

What a long journey it has been. I’d like to thank myself for completing my longest series of articles, in the most timely manner yet. And I’d like to thank you if you suffered through all of it for the sake of knowledge. Please feel free to ask any questions you might have, and I’ll do my best to answer them promptly.

And thus concludes our two-week look at the Enumerable module.

Ruby Enumerable Magic: New Collections

December 8, 2010

  1. The Basics
  2. Unary Ampersand Operator
  3. Booleans
  4. Filters
  5. New Collections
  6. Aggregates
 

The Enumerable module offers several methods to create a new collection out of an existing one, by applying code to each item in the original collection.

map

This is the most common and straightforward method in the Enumerable module. It takes the original collection, applies the given block to each item within, and returns an array of the results:

irb(main):001:0> ['joshua', 'gabriel', 'jacob'].map{|name| name.capitalize}
=> ["Joshua", "Gabriel", "Jacob"]
irb(main):002:0> ['joshua', 'gabriel', 'jacob'].map(&:capitalize)
=> ["Joshua", "Gabriel", "Jacob"]

As with all Ruby methods that require a block, we can pass a block itself, or use the unary ampersand operator to pass a symbol that will be converted to a block and run on the object.

sort

This method will return the same, unaltered items in the collection, but in a sorted order. It does this using each item’s <=> method. Objects like strings choose to be sorted alphabetically. Number classes choose to sort numerically. The important thing is that the items in the collection must have this method defined.

Now for an example:

irb(main):003:0> ['joshua', 'gabriel', 'jacob'].sort
=> ["gabriel", "jacob", "joshua"]
irb(main):004:0> [3, 1, 5].sort
=> [1, 3, 5]

You can also pass a block that will be used to sort items, if the default sort is not good enough. For instance, we could choose to sort strings alphabetically, but starting from last letter to first:

irb(main):005:0> ['joshua', 'gabriel', 'jacob'].sort{|a,b| a.reverse <=> b.reverse}
=> ["joshua", "jacob", "gabriel"]

sort_by

A shorter way to sort based on the reversed version of strings, like we did above, is to use the sort_by method, passing either a block or a symbol with the unary ampersand operator:

irb(main):007:0> ['joshua', 'gabriel', 'jacob'].sort_by{|name| name.reverse}
=> ["joshua", "jacob", "gabriel"]
irb(main):008:0> ['joshua', 'gabriel', 'jacob'].sort_by(&:reverse)
=> ["joshua", "jacob", "gabriel"]

This is like calling map before calling sort. It will sort the results of the map method, not the collection items themselves.
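To make that concrete, here’s a sketch that does by hand roughly what sort_by does for you: compute a key for each item, sort on the keys, then discard them. The variable names are mine, and this is an illustration of the idea rather than Ruby’s internal implementation.

```ruby
names = ['joshua', 'gabriel', 'jacob']

# 1. map: pair each name with its sort key
pairs = names.map { |name| [name.reverse, name] }
# 2. sort: arrays compare element by element, so the key decides the order
sorted_pairs = pairs.sort
# 3. map again: throw the keys away
by_hand = sorted_pairs.map { |key, name| name }

p by_hand                    # ["joshua", "jacob", "gabriel"]
p names.sort_by(&:reverse)   # ["joshua", "jacob", "gabriel"]
```

The benefit is that the key is computed once per item, instead of on every comparison as with a sort block.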

zip

Honestly, I haven’t found a good use case for this method, but I also didn’t know it existed until I did the research for this article. It has nothing to do with compression – a better name for it might have been “zipper”. Have you ever noticed on the highway when two lanes merge, a zipper effect is created? People instinctively take turns getting in line, much like the teeth of a zipper.

This method works in a similar fashion:

irb(main):010:0> [1, 2, 3].zip([4, 5, 6])
=> [[1, 4], [2, 5], [3, 6]]
irb(main):011:0> [1, 2, 3].zip([4, 5, 6], [7, 8, 9])
=> [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

It will take the original collection, and merge in the given array or arrays, as shown above. While I don’t yet know a great use case, I’ll keep my eyes peeled now that I understand how it works.
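For what it’s worth, one place zip tends to come in handy is pairing a list of keys with a list of values, or walking two lists in step. Here’s a small sketch with invented sample data:

```ruby
headers = ['name', 'age', 'city']
row     = ['Bob', 30, 'Boulder']

# Each [header, value] pair becomes a key/value entry in the hash.
record = Hash[headers.zip(row)]

puts record['name']   # Bob
puts record['age']    # 30

# Walking two lists in step.
headers.zip(row).each do |header, value|
  puts "#{header}: #{value}"
end
```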

