Wednesday, June 06, 2007

Testing: One assertion per test

Limiting your tests to one assertion each is a controversial topic. I originally stumbled upon the idea on Dave Astels' blog. I liked the style of development that Dave described and decided to give it a try. That was over 2 years ago. Since then I've worked on teams ranging from 4 developers to 16, on codebases in Ruby and C#, and on projects lasting from 3 to 8 months. I think it's fair to say I've given the concept plenty of chances to fall down. But, regardless of the variables, the guideline has always remained valuable.

For me, the main motivator for using one assertion per test is the resulting maintainability of the test. Tests that focus on one behavior of the system are almost always easier to write and to comprehend at a later date. I've always been better at understanding through examples, so let's take a look at some tests written to test the PhoneNumber class.

require 'test/unit'

class PhoneNumber
  attr_accessor :area_code, :exchange, :station

  def initialize(area_code, exchange, station)
    @area_code, @exchange, @station = area_code, exchange, station
  end
end

class PhoneNumberTest < Test::Unit::TestCase
  def test_initialize
    number = PhoneNumber.new "212", "555", "1212"
    assert_equal "212", number.area_code
    assert_equal "555", number.exchange
    assert_equal "1212", number.station
  end
end

The above code works, but if the PhoneNumber class contained a bug in the initialize method, only the first failing assertion would be reported.

require 'test/unit'

class PhoneNumber
  attr_accessor :area_code, :exchange, :station

  def initialize(area_code, exchange, station)
    area_code, exchange, station = area_code, exchange, station # bug: missing @, assigns to locals
  end
end

class PhoneNumberTest < Test::Unit::TestCase
  def test_initialize
    number = PhoneNumber.new "212", "555", "1212"
    assert_equal "212", number.area_code
    assert_equal "555", number.exchange
    assert_equal "1212", number.station
  end
end

# >> Loaded suite -
# >> Started
# >> F
# >> Finished in 0.006025 seconds.
# >>
# >> 1) Failure:
# >> test_initialize(PhoneNumberTest) [-:14]:
# >> <"212"> expected but was
# >> <nil>.
# >>
# >> 1 tests, 1 assertions, 1 failures, 0 errors

This is the first reason I dislike multiple asserts in one test. In this example it would be easy to notice that all three variables are set incorrectly; however, more often fixing the first failing assertion only leads to finding out what's wrong with the 2nd assertion. I'd rather know the first time I run the suite that 10 things are failing, not that 5 are failing and a few others may or may not be failing.
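Only one failure shows up because a failed assertion raises an exception, which ends the test method on the spot. Here is a minimal sketch of that mechanism. It is not Test::Unit's actual implementation, and `CheckFailed`, `check_equal`, and `run_multi_assert_test` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a test framework's failure exception.
class CheckFailed < StandardError; end

# Hypothetical stand-in for assert_equal: raises on a mismatch.
def check_equal(expected, actual)
  return if expected == actual
  raise CheckFailed, "<#{expected.inspect}> expected but was <#{actual.inspect}>."
end

# Runs three checks in one "test" against values that are all nil,
# recording which checks were actually evaluated before the first raise.
def run_multi_assert_test
  evaluated = []
  failure = nil
  begin
    evaluated << :area_code
    check_equal "212", nil
    evaluated << :exchange
    check_equal "555", nil
    evaluated << :station
    check_equal "1212", nil
  rescue CheckFailed => e
    failure = e.message
  end
  { evaluated: evaluated, failure: failure }
end

result = run_multi_assert_test
puts "Evaluated: #{result[:evaluated].inspect}"
puts "Reported:  #{result[:failure]}"
```

In this sketch the exchange and station checks never run, so whether they would pass or fail stays unknown until the first check is fixed.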

Another reason I dislike multiple assertions is that it's hard to give a descriptive name if you are testing various behaviors. For example, the error message test_initialize(PhoneNumberTest) [-:14]: <"212"> expected but was <nil> isn't the most descriptive in the world. You can argue that I didn't name my test correctly; however, the test_area_code_exchange_and_station_are_initialized_correctly test doesn't tell me much either. On the other hand, the test_area_code_is_initialized_correctly test tells me exactly what behavior I'm testing (or what behavior is currently wrong when a test fails).

require 'test/unit'

class PhoneNumber
  attr_accessor :area_code, :exchange, :station

  def initialize(area_code, exchange, station)
    area_code, exchange, station = area_code, exchange, station # bug: missing @, assigns to locals
  end
end

class PhoneNumberTest < Test::Unit::TestCase
  def test_area_code_is_initialized_correctly
    number = PhoneNumber.new "212", "555", "1212"
    assert_equal "212", number.area_code
  end

  def test_exchange_is_initialized_correctly
    number = PhoneNumber.new "212", "555", "1212"
    assert_equal "555", number.exchange
  end

  def test_station_is_initialized_correctly
    number = PhoneNumber.new "212", "555", "1212"
    assert_equal "1212", number.station
  end
end

# >> Loaded suite -
# >> Started
# >> FFF
# >> Finished in 0.01048 seconds.
# >>
# >> 1) Failure:
# >> test_area_code_is_initialized_correctly(PhoneNumberTest) [-:14]:
# >> <"212"> expected but was
# >> <nil>.
# >>
# >> 2) Failure:
# >> test_exchange_is_initialized_correctly(PhoneNumberTest) [-:19]:
# >> <"555"> expected but was
# >> <nil>.
# >>
# >> 3) Failure:
# >> test_station_is_initialized_correctly(PhoneNumberTest) [-:24]:
# >> <"1212"> expected but was
# >> <nil>.
# >>
# >> 3 tests, 3 assertions, 3 failures, 0 errors
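The reason all three failures appear is that each test method runs in isolation, so one failure cannot hide another. The sketch below imitates that with a tiny hand-rolled runner instead of Test::Unit. `BuggyPhoneNumber`, `ExpectationFailed`, `expect_equal`, `single_assertion_tests`, and `collect_failures` are all hypothetical names for illustration, and the runner is an assumption about the general idea, not how Test::Unit is implemented.

```ruby
# Hypothetical copy of the buggy class: the initializer assigns to
# local variables, so the instance variables are never set.
class BuggyPhoneNumber
  attr_accessor :area_code, :exchange, :station

  def initialize(area_code, exchange, station)
    area_code, exchange, station = area_code, exchange, station
  end
end

# Hypothetical stand-ins for a framework's failure class and assert_equal.
class ExpectationFailed < StandardError; end

def expect_equal(expected, actual)
  return if expected == actual
  raise ExpectationFailed, "<#{expected.inspect}> expected but was <#{actual.inspect}>."
end

# Each entry is one isolated test: a descriptive name and a single check.
def single_assertion_tests
  {
    "area_code is initialized correctly" => lambda {
      expect_equal("212", BuggyPhoneNumber.new("212", "555", "1212").area_code)
    },
    "exchange is initialized correctly" => lambda {
      expect_equal("555", BuggyPhoneNumber.new("212", "555", "1212").exchange)
    },
    "station is initialized correctly" => lambda {
      expect_equal("1212", BuggyPhoneNumber.new("212", "555", "1212").station)
    }
  }
end

# Runs every test and rescues failures one at a time,
# so a failing test never prevents the others from running.
def collect_failures(tests)
  tests.each_with_object([]) do |(name, body), failures|
    begin
      body.call
    rescue ExpectationFailed => e
      failures << "#{name}: #{e.message}"
    end
  end
end

collect_failures(single_assertion_tests).each { |line| puts line }
```

Because each check lives in its own test, the first run lists every broken behavior by name.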

Testing this way also helps me think critically about my domain model. If I aspire to write tests that contain only one assertion, often the methods of my domain model end up with a single responsibility.

11 comments:

  1. Convincing writeup. I might give this a shot in my next codebase. An added benefit is that you don't have to choose between unreadably long test names and thorough descriptions of the expected behavior. test_initialize is basically useless because it doesn't say anything more than the method that the test clearly invokes. Cf. test_get_show, etc.

  2. I trend toward one assertion per test. One exception is asserting the initial state of the target.

    If I want to assert that the initial state of the name and description are empty (or null) for an object, when created, I don't have much reservation putting that into a single test that asserts the initial state of the target.

  3. I like the way the unit testing library in Script.aculo.us works. Instead of failing on the first failed assertion, it continues through the rest of the test and reports all the assertions that failed. This would alleviate the first problem you mentioned.

  4. Sean - even though scriptaculous shows you all of the failed assertions in one example, each one is still bound to the state resulting from the previous one. The problem isn't present in Jay's phone number example, but if you have two assertions like this:

    assert_equal(1, obj.do_something)
    assert_equal(2, obj.do_something_else)

    And "do_something" does something that changes the internal state of obj, then the second assertion might be passing or failing because of that state change, in which case you'd be getting misleading feedback.

    I think it's more reliable to keep the assertions isolated from state changes caused by the others, just as xUnit tools (and, of course, RSpec) keep the test methods isolated from state changes resulting from other test methods.

  5. Any particular reason you aren't using the setup method to DRY up your tests?

    def setup
      @number = PhoneNumber.new "212", "555", "1212"
    end

    # Your tests here

    def teardown
      @number = nil
    end

  6. Yes, there are reasons. I'll make my next entry about setup.

    Cheers, Jay

  7. Anonymous, 10:11 AM

    Do you still use the one-assertion-per-test approach when testing functionals? Say you have something simple, like a test for a get to show:

    def test_show_view
      get :show, :id => posts(:basic).id

      assert_equal posts(:basic), assigns(:post)
      assert_template 'show'
    end

    So I took some posts fixtures and sent that in with my get request - asserted the right post was being set, asserted my view. Would you rather write two separate tests?

  8. Jay, why do you say "the test_area_code_exchange_and_station_are_initialized_correctly test doesn't tell me much either"? It seems like that tells you exactly what you want to do, which you can assert all at once with something like this:

    assert_equal %w[212 555 1212], [number.area_code, number.exchange, number.station]

    If I had to call out to different collaborators to get each of those three strings, I'd probably separate out the tests, but in this case it feels like one piece of work. (Hell, you do it in one line.)

    It seems like restricting yourself to one simple assertion per test throws the balance too heavily towards a huge number of tests with repeated setup that assert very little.

  9. Senor Humidor,

    The test_area_code_exchange_and_station_are_initialized_correctly test method doesn't tell me much because when it fails I don't know which of the 3 conditions I'm testing failed.

    Creating a test that asserts very little is the key to creating easily maintainable tests. The less the test contains the easier it is to fix when it breaks. And, since we spend a fair amount of time fixing broken tests, maintainability is a huge deal.

    I could have created a more complex example, but I wanted something that was easy to follow and could still demonstrate the concept.

    Vaya con Dios, Jay

  10. It's true that having tests with fewer assertions in them will make each individual test easier to fix, but I think this begs the question of whether it makes the overall task of getting to a "zero failure in the suite" state more manageable.

    Yes, each single test is easier to fix if it has one assertion so it gives you a certain checkoff-your-to-do-list satisfaction, but given any ten tests with half failing, fixing one test with five assertions or five tests with one assertion should be equally difficult.

    Remember the joke about the guy who ordered a pizza, and the pizza shop guy says "Do you want that cut into six pieces or eight", and the customer says "You'd better make it six -- I don't think I could eat eight!"

  11. I use multiple assertions per test, and here's why:

    1. I'm asserting behavior, not individual lines of code.

    2. I've found it rare that there are two different errors in one test (more likely is that one error causes multiple tests to fail). When there are two different errors in one test, I find the second one the next time I run the test. Since 99% of the time, the error(s) are related to some code I just wrote, when I fix that bug, generally I'm green.

    3. The more tests, the longer they take to run, especially if database access is involved. The longer tests take to run, the more my TDD cycle is interrupted.

    4. I haven't found naming tests to be either that onerous or even that important. For example, say I'm testing GET for a controller. Then that's the name of my test. I don't see the value of repeating the test's assertion in its name.

    5. When code scrolls off the top of the screen, it's harder to read and understand. With single assertions per test, there are at least four times as many lines of code as there would be if they were combined.

    However, experience is the best validation, and if the technique suggested works for you, then it works.

