Monday, May 12, 2008

Testing: Duplicate Code in Your Tests

Luke Kanies recently left a comment on Using Stubs to Capture Test Essence:
Now if you would only see the wisdom of applying DRY to your whole code base, rather than just the functional bit. :)
Sadly, DRY has become a philosophy that is dogmatically and blindly applied to every aspect of programming. I'm a big fan of firing silver bullets but if you go that path your number one priority needs to be finding where a solution does not fit. In fact, those who have applied DRY excessively have found that there are situations when DRY may not be advantageous.

Three of the points from the Wikipedia entry relate to test setup code.

In some small-scale contexts, the effort required to design around DRY may be far greater than the effort to maintain two separate copies of the data.
There's no question that designing around DRY for tests adds additional effort. First of all, there's no reason that tests need to run in the context of an object; however, the most common way to use a setup is to create instance variables and reuse those instance variables within a test. The necessity of instance variables creates the requirement that a test must be run within a new object instance for each test.

Ensuring that a setup method only creates the objects that are necessary for each test adds complexity. If all tests don't require the same data, but you create a setup that creates everything for every test, digesting that code is painful and unnecessary. Conversely, if you strictly adhere to only using setup for objects used by all methods you quickly limit the number of tests you can put within one test case. You can break the test case into multiple test cases with different setup methods, but that adds complexity. Consider the situation where you need the foo instance variable for test 1 and 2 and the bar instance variable for test 2 and 3. There's no straightforward way to create foo and bar in a DRY way using a setup method.

Of course, there's also readability impacts to consider. When a test breaks that relies on setup and teardown I have to look in 3 places to get the full picture of what's going on. When a test is broken I want to fix it as quickly as possible, so I prefer looking in one place.

The crux of the issue is whether a test case is one context or the individual tests are small-scale multiple contexts. While a test case is conceptually one object, we rarely concern ourselves with a test cases as a whole. Instead testing is very much about many individual executable pieces of code. In fact, it's a testing anti-pattern to have any tests depend on other tests. We write tests in isolation, fix tests in isolation and often run tests in isolation; therefore, I consider them to be many small independent entities that do completely isolated jobs. The result of those jobs is aggregated in the end, but ultimately the different entities never (or should never) interact. Each test is and runs in it's own context.

Imposing standards aimed at strict adherence to DRY could stifle community involvement in contexts where it is highly valued, such as a wiki.
Tests are often written by one, but maintained by many. Any barrier to readable, reliable, and performant tests should be removed as quickly as possible. As I've previously discussed, abstracting behavior to 3 different places reduces readability, which can reduce collaboration. Additionally, setup methods can cause unexpected behavior for tests that are written by an author other than the creator of the setup method. This results in less reliable tests, another obvious negative.

Human-readable documentation (from code comments to printed manuals) are typically a restatement of something in the code with elaboration and explanation for those who do not have the ability or time to read and internalize the code...While tests aren't code comments or printed manuals they are very much (or should be) Human-readable documentation. They are also restatements of something in the code with elaboration and explanation. In fact the setup and teardown logic is usually nothing more than elaboration and explanation. The idea of setup and teardown is to put the system into a known state before executing an assertion. While you should have the ability to understand setup and teardown, you don't have the time. Sure, you can make the time, but you shouldn't need to.

The problem is that it's much easier to understand how to apply DRY than it is to understand why.

If your goal is to create a readable, reliable, and performant test suite (which it should be) then there are better ways to achieve that goal than by blindly applying DRY.
Where there's smoke, pour gasoline --Scott Conley
Duplicate code is a smell. Setup and teardown are deodorant, but they don't fix the underlying issue. Using a setup method is basically like hiding your junk in the closet. There are better ways.

Pull the data and the behavior out of the setup and teardown methods and move it to the tests, then take a hard look.

Unimportant behavior specification
Setup methods often mock various interactions; however, the ultimate goal of the interactions is a single result. For example, you may have designed a Gateway class that takes a message, builds a request, sends the request and parses the results of the request. A poorly designed Gateway class will require you to stub the request building and sending the request. However, a well designed Gateway class will allow you to stub two methods at most and verify the behavior of the parse method in isolation. The poorly designed Gateway class may require a setup method, the well designed Gateway class will have a simple test that stubs a method or two in a straightforward manner and then focuses on the essence of the behavior under test. If your domain model requires you to specify behavior which is has nothing to do with what you are testing, don't specify that behavior in a setup method, remove the need to specify the behavior.

Loosely couple objects
I've always disliked the ObjectMother pattern. ObjectMother is designed to build complex hierarchies of objects. If you need an ObjectMother to easily test, you probably actually need to take another look at your domain model. Loosely coupled objects can be created without complicated graphs of dependencies. Loosely coupled objects don't require setting up multiple collaborators. Fewer collaborators (or dependencies) result in tests that don't rely on setup methods to create various collaborators.

The idea is to write tests that only specify what's necessary and verify one thing at a time. If you find that you need to specify things that are unrelated to what you are trying to verify, change the domain model so that the unnecessary objects are no longer required.

Solve problems globally or locally, but not between
Sometimes there are things that need to happen before and after every test. For example, running each test in a transaction and rolling back after the test is run is a good thing. You could add this behavior at a test case level, but it's obviously beneficial to apply this behavior to every test in the entire test suite.

There are other cases where it can make sense to solve issues on a global level. For example, I once worked on a codebase that relied heavily on localization. Some tests needed to verify strings that were specified in the localization files. For those tests we originally specified what localization file to use (in each test), but as we used this feature more often we decided that it made sense to set the localization file for the entire test suite.

Putting everything within a test is the best solution for increasing understanding. However, applying behavior to an entire test suite also aides in comprehension. I would expect any team member to understand that reference data is loaded once before the test suite is run, each test is run in a transaction, and localization uses the test localization file.

Understanding the behavior of an entire test suite is valuable and understanding the behavior of a test is valuable, but understanding a test case provides almost no benefit. Rarely do you run a single test case, rarely does an entire test case break, and rarely do you edit an entire test case. Since your primary concerns are the test suite and the tests themselves, I prefer to work on those levels.

I wouldn't go too far with the global behavior approach, and I definitely wouldn't start moving conditional logic into global behavior, but applied judiciously this can be a very valuable technique.

Bringing it all together
The core of the idea is that it should be easy to understand a test. Specifying everything within a test makes understanding it easy; however, that solution isn't always feasible. Setup and teardown are not the solution to this problem. Luckily the techniques above can be used to make specifying everything within a test a realistic and valuable solution.

8 comments:

  1. You beat me to my first point with the mention of transactions. I think thats a prime example of something that would be silly to have in each and every test, although it can definitely be unclear as to whats happening when code like that is defined in a method you don't know exists...

    That being said, while I agree with many of your points, I spent that last two weeks or so going over hundreds of tests across all of the layers of a huge rails app. Some places certainly would have benefitted from the lack of a setup/teardown method--your point of about tests depending on others tests is excellent and very real.

    However, I cringe to think of what some of these tests would look like without setup and teardown. Perhaps it wouldn't be as bad as I'm imagining. Maybe I just need to see it in action to really get it.

    Given that you've brought this up before, and will hopefully again, it might be valuable to take a real life test file and rework things as you'd prefer. For instance, the one assertion per test rule is something I like very much, but also can only wonder how it'd work with some of these older tests I've seen others write. Hard for me to imagine things not ballooning out of control.

    Anyway, thx for the great thoughts and writeup as always, and hope to see some of this in action.

    ReplyDelete
  2. Nice post. You should point out that this is not a new concept. The process originally laid out by Kent Beck in his TDD book was:

    * Red Bar
    * Green Bar
    * Remove duplication (between the test and the code)

    I care a lot about test autonomy though. So given the choice, I'll accept some level of duplication to achieve that -- but it is a balance for sure.

    Cheers!
    Clinton

    ReplyDelete
  3. Could you show an example of the poorly designed Gateway class and its tests that require a setup method, and compare it with a well-designed one whose tests require minimal stubbing?

    ReplyDelete
  4. Call me persuaded but unconvinced. Perhaps it is a lack of clarity about what you are advocating. I have been reading through the Merb-core test suite which uses rspec. The Merb team actually points people to there test suite as a form of documentation, so the think that the tests are readable and easy to follow. On the other hand the Routing tests in particular have a lot of DRY refactoring to them. Much of the common code is refactored into helper methods, which mean that there are several places to look to find why a test is failing.

    It is hard to talk in merely hypothetical terms about this, at least for me. If you have looked at the Merb-core tests, what do you think of them? Are they DRY to the point where they are not helpful, or does this kind of refactoring add to readability and brevity?

    Link to example spec

    ReplyDelete
  5. Interesting post and I have a couple of comments.

    *Loosely couple objects*
    I'm just not sure how practical it is to make it so that a domain model is always incredibly easy to use in tests particularly when often tests do not use the domain objects in the same way as the rest of the system does (e.g. give me a Completed order).

    As an example design wise it may make sense to pass a Customer to an Orders constructor but it makes it harder to use an Order in tests. Also unless I model each state as a seperate class (ActiveOrder entity) I may have to go through multiple steps to get an object ready for use in tests, if I don't use a builder/object mother then this could result in a heck of a lot of duplication.

    Test Fixtures
    I agree about the problems with small focussed test fixtures but you can keep them all in one file or otherwise organize them to make it easier to navigate and I think it also has a lot of advantages in making the tests easier to read and understand. You just look at setup methods (which are short and sweet for unit tests) and the test (if possible just doign the assert). Have to say I like both approaches, which is a total cop out.

    ReplyDelete
  6. Anonymous3:04 PM

    The only place duplication should have in computing is in redundancy. The three strike rule, as it applies to refactoring, and especially refactoring test code, trumps your argument here.


    The problem is that it's much easier to understand how to apply DRY than it is to understand why.


    I couldn't disagree more, this is pure nonsense. How you remove duplication depends wholly on the situation. Why you remove duplication is, in fact, a very easy thing to understand.

    Programming is about representing knowledge in a machine readable format. A computer can easily duplicate knowledge if you tell it to; copying is really cheap for a computer. No matter how sophisticated you make your algorithms, it's ultimately up to a human to tell the computer which piece of knowledge he/she needs. Until computers can read minds, this is the single most important reason to fight duplication.

    The fact that you think the DRY principle means make things more complicated is just surrounding it in FUD. All it says is that knowledge should have a single authoritative representation in a system. There are many ways to keep things simple while removing duplication, and any effort invested here will increase overall maintainability.

    The only difference when you're working with tests is that they don't stand on their own and will change with the code they test. You have to swallow this assumption as soon as you decide to write tests - it's not anything to be scared of. The better you become at writing tests, the more valuable they will become. And anyone will tell you, when working with tests, the hardest problem you're going to come across is maintenance. This is why the best advice I can give is to adhere to the DRY principle when writing tests.

    ReplyDelete
  7. Anonymous:

    I appreciate your opinions. Clearly you are so confident in your assertions that you've left your name... oh.. nevermind.

    Your comment is, to quote you: "pure nonsense"

    I wish testing were as simple as you make it sound; unfortunately, developer testing is highly dependent on context and your context is clearly significantly different than mine.

    I disagree with almost every claim you make, some of which I would also characterize as FUD. But, there's little value in continuing the debate with "unnamed sources" who have incredibly narrow visions of the world. So, I'll chalk it up to, agree to disagree.

    Cheers, Jay

    ReplyDelete
  8. @anonymous "Programming is about representing knowledge in a machine readable format." I this this is the point. Isn't testing supposed to be close to a human readable format? Isn't that why we all love the .should sweetness?

    I agree that maintaining a test suite requires a certain effort, regardless having DRY or explicit code. I think the ultimate question is do you want to maintain DRY test or explicit test? I'd rather have DRY code in my app, and explicit tests.

    Hope to get some feedback from you.

    ReplyDelete

Note: Only a member of this blog may post a comment.