This is the first post in a three-part series introducing the concept of double-blind test-driven development in Rails. This post defines the concept itself and lays the groundwork by showing the way tests are more commonly written. The next couple of posts will show how to double-blind test various common Rails elements, and how to make this added layer of protection automatic and quick.
Looking at a Rails application that was built with test-driven development, you might expect to see something like this:
```ruby
# spec/models/teacher_spec.rb
describe Teacher do
  it "has many subjects" do
    teacher = Factory.create :teacher
    subject = teacher.subjects.create Factory.attributes_for(:subject)
    teacher.subjects.should include(subject)
  end

  describe "name" do
    it "is present" do
      teacher = Teacher.new
      teacher.should_not be_valid
      teacher.errors[:name].should include("can't be blank")
    end

    it "is at most 50 characters" do
      teacher = Teacher.new :name => 'x' * 51
      teacher.should_not be_valid
      teacher.errors[:name].should include("must be 50 characters or less")
    end
  end
end
```
Truth be told, if you’re seeing this in the wild the app is probably doing pretty well. This level of testing works great during the early stages of an app, when things are simple. But as the app grows or more developers become involved, you need more.
Consider models where the associations and validations stretch into the dozens of lines. The more careful and specific you are about validations, the easier it is to get conflicting or overlapping validations. I actually came up with the concept of double-blind testing while retro-testing models in a client app that previously had no validation specs.
What is Double-Blind Testing?
In the world of scientific studies, you always need a control group. One set of participants gets the latest and greatest new diet pill, while the other gets a placebo. Researchers used to think this was good enough, and probably pretty funny to watch the placebo users rave about their shrinking waistlines. But it turns out studies like this still allowed some bias – as researchers observed the effects, their *own* preconceived notions tainted results. Enter the double-blind study.
In a double-blind study, the researchers themselves are unaware of which participants are in the control group, and which are being tested. Both sides are “blind”. They may have lost funny patient anecdotes, but they gained research reliability.
Applying the Lessons of Double-Blind Studies to Test-Driven Development
As I said, in the early stages of an app the tests I showed above work great, as long as you’re using TDD and the red-green-refactor cycle. This means you write the test, run it, and it fails. Then you write the simplest code that will make the test pass, run the test again, and confirm that it passes. Most testing tools will literally show red or green as you do this. Then, as you start to amass tests, you’re free to refactor your code (abstracting common code into helper methods, changing for readability, etc.) and run the tests again at any time. You will see failures if you broke anything. If not, you’ve more or less guaranteed your refactored code works properly.
The problem comes in when you start changing old code, or adding tests to code that was not originally tested. What I’m calling double-blind testing is this:
each test needs to verify the object’s behavior before testing what changes.
As an example, let’s rewrite one of the tests from above:
```ruby
# original test
describe "name" do
  it "is present" do
    teacher = Teacher.new
    teacher.should_not be_valid
    teacher.errors[:name].should include("can't be blank")
  end
end
```
```ruby
# modified to be double-blind
describe "name" do
  it "is present" do
    error_message = "can't be blank"

    teacher = Teacher.new :name => 'Joe Example'
    teacher.valid?
    teacher.errors[:name].should_not include(error_message)

    teacher.name = nil
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)

    teacher.name = ""
    teacher.should_not be_valid
    teacher.errors[:name].should include(error_message)
  end
end
```
This is the basic pattern for all double-blind testing. We’re not leaving anything to chance. In the original version, we expected our object to be invalid, we treated it as such, and we got the result we expected. Do you see the problem with this?
Here’s an exercise: can you make the original test pass, even though the object validation is not working correctly? There’s actually a style of pair programming that routinely does exactly this. One developer writes the test, and the other writes just enough code to make it pass, with the good-natured intention of tripping up the first developer whenever possible. If you wrote the original test, I could satisfy it by just adding the error message to every record on validation, regardless of whether it’s true! Your test would pass, but the app would fail.
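To make the exercise concrete, here is a minimal plain-Ruby sketch of that sabotage. `SabotagedTeacher` is a hypothetical stand-in, not an ActiveRecord model: it hand-rolls `valid?` and `errors` only as far as needed to show that the original assertions pass against broken validation, while the double-blind control check catches it.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only #valid? and #errors
# are imitated, and only as far as this illustration needs.
class SabotagedTeacher
  attr_accessor :name
  attr_reader :errors

  def initialize(attributes = {})
    @name = attributes[:name]
    @errors = Hash.new { |hash, key| hash[key] = [] }
  end

  # Broken on purpose: the message is added no matter what the name is.
  def valid?
    @errors.clear
    @errors[:name] << "can't be blank"
    false
  end
end

error_message = "can't be blank"

# Original-style check: passes even though the validation is wrong.
teacher = SabotagedTeacher.new
original_passes = !teacher.valid? && teacher.errors[:name].include?(error_message)

# Double-blind control: a good name must NOT produce the error.
teacher = SabotagedTeacher.new(:name => 'Joe Example')
teacher.valid?
control_passes = !teacher.errors[:name].include?(error_message)

puts "original-style check passes: #{original_passes}"
puts "double-blind control passes: #{control_passes}"
```

The original-style check reports a pass, while the control reports a failure, which is exactly the signal the original test had no way to give.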
The test is now “double-blind” in the sense that we as testers have factored out our own expectations from the test. In the original version, we simply assumed the error message would not be there until we initialized the object a certain way, and an untested assumption like that can be bad. It may sound far-fetched or paranoid, but in large codebases your original tests are often abused in this very way. The “you” that writes new code today is often at odds with the “you” from three months ago that wrote the older code with a different understanding of the problem at hand.
Now that I’ve laid out the justification, let’s take a closer look at how the test changed. The first thing I did was create a version of the object that I believe should NOT trigger the error message. Then I ran through two cases that should. You can see right away that I was forced to be more *specific* about what should trigger an error. Instead of just a blank object with no values set, I’ve proactively set the attribute in question to both nil and an empty string. A key element here is to try to work with the *same* object, modifying it between checks, rather than creating a new object each time. My test wouldn’t have been as specific if I’d just recreated a blank Teacher object and run a single validation check.
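As a rough illustration of why the extra specificity matters, consider a validation that only guards against nil. The `NilOnlyTeacher` class below is a hypothetical plain-Ruby stand-in (not a real Rails model) with a deliberately incomplete check. A test that only builds a blank object would never notice the gap, but walking one object through each state does.

```ruby
# Hypothetical stand-in for a model with a too-narrow presence check.
class NilOnlyTeacher
  attr_accessor :name
  attr_reader :errors

  def initialize(attributes = {})
    @name = attributes[:name]
    @errors = Hash.new { |hash, key| hash[key] = [] }
  end

  # Incomplete on purpose: catches nil but lets an empty string through.
  def valid?
    @errors.clear
    @errors[:name] << "can't be blank" if name.nil?
    @errors.values.all?(&:empty?)
  end
end

error_message = "can't be blank"
teacher = NilOnlyTeacher.new(:name => 'Joe Example')

# Walk the same object through each state and record whether the error shows up.
results = {}
[['good name', 'Joe Example'], ['nil', nil], ['empty string', '']].each do |label, value|
  teacher.name = value
  teacher.valid?
  results[label] = teacher.errors[:name].include?(error_message)
end

results.each { |label, flagged| puts "#{label}: error present? #{flagged}" }
```

Only the nil case is flagged here. The empty-string case slips through, and that is the failure the third group of assertions in the double-blind test would surface.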
Also, with the increased code comes the increased chance of typos. We don’t want to DRY test code up too much, because a good rule is to keep your tests as readable (non-abstract) as possible. But I’ve specified the error message at the top of the test, and reused that string over and over. I did this in a way that DRYs the code and adds readability. You can see at a glance that all three checks are looking for the same error.
Finally, the first time I run the object’s validation, notice I’m not asserting that it should be valid. If I had written teacher.should be_valid in place of the bare teacher.valid? call in the double-blind test, I’d have to take the extra time to make sure every other part of the object was valid. Not only is this time-consuming, it’s very brittle. Any future validations would break this test.
If you use factories often, you may suggest setting it up that way, since a factory-generated object should always be valid. Then you could assert validity. However, this only slows down your test suite. It’s enough just to run valid? on the object, which runs all the validation checks and loads up our errors hash.
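The brittleness argument can be sketched in plain Ruby. `TeacherWithEmail` is a hypothetical stand-in, and its `email` attribute is invented purely for this illustration. It plays the role of a validation added later, unrelated to the name check.

```ruby
# Hypothetical stand-in with two required attributes; `email` is invented
# for this illustration and is not part of the article's Teacher model.
class TeacherWithEmail
  attr_accessor :name, :email
  attr_reader :errors

  def initialize(attributes = {})
    @name = attributes[:name]
    @email = attributes[:email]
    @errors = Hash.new { |hash, key| hash[key] = [] }
  end

  def valid?
    @errors.clear
    @errors[:name] << "can't be blank" if name.nil? || name.empty?
    @errors[:email] << "can't be blank" if email.nil? || email.empty?
    @errors.values.all?(&:empty?)
  end
end

teacher = TeacherWithEmail.new(:name => 'Joe Example')

# The object as a whole is invalid (no email), so asserting validity would fail...
overall_valid = teacher.valid?

# ...but the name-specific check is unaffected by the unrelated failure.
name_flagged = teacher.errors[:name].include?("can't be blank")

puts "valid? returned: #{overall_valid}"
puts "name error present: #{name_flagged}"
```

Calling `valid?` purely for its side effect keeps the name test focused on the name, so adding the email rule later does not force any change to it.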
I believe this is a new concept – I was already coding most of my tests this way, but it didn’t dawn on me how valuable it was until I started retro-testing previously testless code. The value showed itself right away.
I would love to hear feedback on this – if you think it’s unnecessary (I tend to be very rainman-ish about my testing code) or even detrimental. However, if you think it’s too much work, I ask you to hold your criticism until you’ve read part 3 of this article, where I show how to use your own RSpec matchers to greatly speed this process.