Wednesday, February 04, 2009

Thoughts on Developer Testing

This morning I read Joel's "From Podcast 38" and it reminded me how immature developers are when it comes to testing. In the entry Joel says:
a lot of people write to me, after reading The Joel Test, to say, "You should have a 13th thing on here: Unit Testing, 100% unit tests of all your code."
At that point my interest is already piqued. Unit testing 100% of your code is a terrible goal and I'm wondering where Joel is going to go with the entry. Overall I like the entry (which is really a transcribed discussion), but two things in it left me feeling uneasy.
  • Joel doesn't come out and say it, but I got the impression he's ready to throw the baby out with the bath water. Unit testing 100% of your code is a terrible goal, but that doesn't mean unit testing is a bad idea. Unit testing is very helpful, when done in a way that provides a positive return on investment (ROI).
  • Jeff hits it dead on when he says:
    ...what matters is what you deliver to the customer...
    Unfortunately, I think he's missing one reality: Often, teams don't know what will make them more effective at delivering.
I think the underlying problem is: People don't know why they are doing the things they do.

A Painful Path
Say you read Unit Testing Tips: Write Maintainable Unit Tests That Will Save You Time And Tears and decide that Roy has shown you the light. You're going to write all your tests with Roy's suggestions in mind. You get the entire team to read Roy's article and everyone adopts the patterns.

All's well until you start accidentally breaking tests that someone else wrote and you can't figure out why. It turns out that some object created in the setup method is causing unexpected failures after your 'minor' change created an unexpected side-effect. So, now you've been burned by setup and you remember the blog entry by Jim Newkirk where he discussed Why you should not use SetUp and TearDown in NUnit. Shit.
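The failure mode above can be sketched in a few lines. The entry discusses NUnit; this is a hypothetical Python `unittest` equivalent (all class and method names invented) showing how tests come to silently depend on state built in a shared setup method:

```python
import unittest

# Hypothetical example: a shared setUp builds one Account that every
# test implicitly depends on. A "minor" change to setUp (or to the
# object it builds) can break tests that never mention that state.

class Account:
    def __init__(self, balance=0, overdraft_limit=0):
        self.balance = balance
        self.overdraft_limit = overdraft_limit

    def withdraw(self, amount):
        if self.balance - amount < -self.overdraft_limit:
            raise ValueError("insufficient funds")
        self.balance -= amount

class AccountTest(unittest.TestCase):
    def setUp(self):
        # Someone later "tweaks" this line (say, balance=100 -> 50)
        # and distant tests start failing for no obvious reason.
        self.account = Account(balance=100)

    def test_withdraw_reduces_balance(self):
        self.account.withdraw(30)
        self.assertEqual(self.account.balance, 70)

    def test_withdraw_beyond_balance_fails(self):
        # This test silently assumes the setUp balance is exactly 100.
        with self.assertRaises(ValueError):
            self.account.withdraw(101)
```

Both tests pass today, but neither states the balance it depends on, so a change to `setUp` breaks them at a distance.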

You do more research on setup and stumble upon Inline Setup. You can entirely relate and go on a mission to switch all the tests to inline setup, since it removes the concept of a setup method entirely.

Everything looks good initially, but then a few constructors start needing more dependencies. Every test creates its own instance of an object, since you moved object creation out of the setup and into each individual test. So now every test that creates that object needs to be updated. It becomes painful every time you add an argument to a constructor. Shit. Again.
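The ripple effect looks something like this minimal, invented sketch: every test inlines its own construction, so one new constructor argument touches every test that builds the object.

```python
# Hypothetical sketch of the inline-setup ripple: each test builds its
# own Order, so a new constructor argument touches every test.

class Order:
    def __init__(self, customer, items):  # later grows a third argument...
        self.customer = customer
        self.items = items

    def total(self):
        return sum(self.items)

def test_total_sums_items():
    order = Order("alice", [10, 20])   # ...which forces a change here,
    assert order.total() == 30

def test_empty_order_totals_zero():
    order = Order("bob", [])           # ...and here, and in every other test.
    assert order.total() == 0

test_total_sums_items()
test_empty_order_totals_zero()
```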

The Source of Your Pain
The problem is, you never asked yourself why. Why are you writing tests in the first place? Each testing practice you've chosen, what value is it providing you?

Your intentions were good. You want to write better software, so you followed some reasonable advice. But, now your life sucks. Your tests aren't providing a positive ROI, and if you keep going down this path you'll inevitably conclude that testing is stupid and it should be abandoned.

Industry Experts
Unfortunately, you can't write better software by blindly following the dogma of 'industry experts'.

First of all, I'm not even sure we have any industry experts on developer testing. Rarely do I find consistently valuable advice about testing. Relevance, who employs some of the best developers in the world, used to put 100% code coverage in their contracts. Today, that's gone, and you can find Stu discussing How To Fail With 100% Code Coverage. ObjectMother, which was once praised as brilliant, has now been widely replaced by Test Data Builders. I've definitely written my fair share of stupid ideas. And, the examples go on and on.
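For readers unfamiliar with the shift mentioned above: where an ObjectMother hands out fixed canned objects, a Test Data Builder gives every test sensible defaults and lets it override only what it cares about. A minimal sketch, with all names invented for illustration:

```python
# A minimal Test Data Builder sketch (the pattern that has largely
# replaced ObjectMother). Names here are invented for illustration.

class Invoice:
    def __init__(self, customer, amount, currency):
        self.customer = customer
        self.amount = amount
        self.currency = currency

class InvoiceBuilder:
    def __init__(self):
        # Sensible defaults; each test overrides only what it cares about.
        self._customer = "any customer"
        self._amount = 100
        self._currency = "USD"

    def with_customer(self, customer):
        self._customer = customer
        return self

    def with_amount(self, amount):
        self._amount = amount
        return self

    def build(self):
        return Invoice(self._customer, self._amount, self._currency)

# A test states only the detail it actually depends on:
invoice = InvoiceBuilder().with_amount(250).build()
assert invoice.amount == 250
```

Because the builder is the one place that knows the constructor, adding a field no longer ripples through every test.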

We're still figuring this stuff out. All of us.

There may not be experts on developer testing, but there are good ideas for specific contexts. Recognizing that there are smart people with contextually valuable ideas about testing is very liberating. Suddenly you don't need to look for a testing silver bullet; instead, you have various patterns available (some conflicting) that may or may not provide value depending on your working context.

Life would be a lot easier if someone could direct you to the patterns that will work best for you; unfortunately, we're not at that level of maturity. It's true that if you pick patterns that don't work well for your context, you definitely won't see positive ROI from testing in the short term. But you will have gained experience that you can use in the future to be more effective.

It's helpful to remember that there are no testing silver bullets; that way you won't get led down the wrong path when you see someone recommending 100% code coverage or other drastic, often dogmatic approaches to developer testing.

Today's Landscape
Today's testing patterns are like beta software. The patterns have been tested internally, but are rarely proven in the wild. As such, the patterns will sometimes work given the right context, and other times they will shit the bed.

I focus pretty heavily on testing and I've definitely seen my fair share of test pain. I once joined a team that spent 75% of their time writing tests and 25% of their time delivering features. No one on the team was happy with the situation, but the business demanded massively unmaintainable Fit tests.

Of course, we didn't start out spending 75% of our time writing Fit tests. As the project grew in size, so did the effort needed to maintain the Fit tests. That kind of problem creeps up on a team. You start by spending 30% of your time writing tests, but before you know it, the tests are an unmaintainable mess. This is where I think Jeff's comments, with regard to writing tests that enable delivery, fall a bit short. Early on, Fit provided positive ROI. However, eventually, Fit's ROI turned negative. Unfortunately, by then the business demanded a Fit test for every feature delivered. We dug ourselves a hole we couldn't get out of.

The problem wasn't the tool. It was how the process relied on the Fit tests. The developers were required to write and maintain their functional tests using Fit, simply because Fit provided a pretty, business readable output. We should have simply created a nice looking output for our NUnit tests instead. Using Fit hurt, because we were doing it wrong.

The current lack of maturity around developer testing makes it hard to make the right choice when picking testing tools and practices. However, the only way to improve is to keep innovating and maturing the current solutions.

If It Hurts, You're Doing It Wrong
Doing it right is hard. The first step is understanding why you use the patterns you've chosen. I've written before about the importance of context. I can explain, in detail, my reasons for every pattern I use while testing. I've found that having motivating factors for each testing pattern choice is critical for ensuring that testing doesn't hurt.

Being pragmatic about testing patterns also helps. Sometimes your favorite testing pattern won't fit your current project. You'll have to let it go and move on. For example, on my current Java project each test method has a descriptive name. I maintain that (like methods and classes) some tests are descriptive enough that a name is superfluous, but since JUnit doesn't allow me to create anonymous test methods I take the path of least resistance. I could write my own Java testing framework and convince the team to use it, but it would probably hurt. The most productive way to test Java applications is with JUnit, and if I did anything else, I'd be doing it wrong.

I can think of countless examples of people doing it wrong and dismissing the value of a contextually effective testing pattern. The biggest example is fragile mocking. If your mocks are constantly, unexpectedly failing, you're doing something wrong. It's likely that your tests suffer from High Implementation Specification. Your tests might be improved by replacing some mocks with stubs. Or, it's possible that your domain model could be written in a superior way that allowed more state based testing. There's no single right answer, because your context determines the best choice.
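As one hedged illustration of the mock-to-stub trade mentioned above (names invented, not from the entry): a stub returns canned data and the test asserts on resulting state, so refactoring the code under test's internal calls doesn't break it the way an interaction-checking mock would.

```python
# Sketch of trading an interaction-checking mock for a simple stub
# plus state-based assertions. All names are invented.

class StubGateway:
    """A stub: returns canned data, records nothing about call order."""
    def __init__(self, rate):
        self.rate = rate

    def exchange_rate(self, currency):
        return self.rate

class PriceConverter:
    def __init__(self, gateway):
        self.gateway = gateway

    def convert(self, amount, currency):
        return amount * self.gateway.exchange_rate(currency)

# State-based test: assert on the result, not on which methods were
# called how many times -- so an internal refactoring of convert()
# (say, caching the rate) won't break it.
converter = PriceConverter(StubGateway(rate=2.0))
assert converter.convert(10, "EUR") == 20.0
```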

Another common pain point in testing is duplicate code. People go to great lengths to remove duplication, often at the expense of readability. Setup methods, contexts, and helper methods are all band-aids for larger problems. The result of these band-aids is tests that are painful to maintain. However, there are other options. In the sensationally named entry Duplicate Code in Your Tests I list 3 techniques that I've found to be vastly superior to setup, contexts and helper methods. If those techniques work for you, that's great. If they don't, don't just shove your trash in setup and call it a day. Look for your own testing innovations that the rest of us may benefit from.
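One widely used middle ground between raw duplication and a setup method is a creation method with default arguments; whether it's among the entry's three techniques is my guess, so treat this sketch (with invented names) as illustrative only:

```python
# Removing duplication without a setup method: a small creation method
# with defaults, kept next to the tests. Illustrative sketch only.

class User:
    def __init__(self, name, age, admin):
        self.name = name
        self.age = age
        self.admin = admin

def create_user(name="anyone", age=30, admin=False):
    # Each test reads inline, but only this one place knows the ctor.
    return User(name, age, admin)

def test_admins_can_be_created():
    user = create_user(admin=True)
    assert user.admin

def test_default_user_is_not_admin():
    assert not create_user().admin

test_admins_can_be_created()
test_default_user_is_not_admin()
```

Each test still reads top-to-bottom, but a constructor change is absorbed in one place instead of in every test.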

If something hurts, don't look for a solution that hurts slightly less, find something that is a joy to work with. And, share it with the rest of us.

Tests Should Make You More Effective
What characterizes something as 'effective' can vary widely based on your context.

Some software must be correct or people die. This software obviously requires thorough testing. Other software systems are large and need to evolve at a fairly rapid pace. Delivering at a rapid pace while adding features almost always requires a fairly comprehensive test suite, to ensure that regression bugs don't slip in.

Conversely, some software is internal and not mission critical. In that case, unhandled exceptions aren't really a big deal and testing is clearly not as high a priority. Other systems are small and rewritten on a fairly consistent basis, thus spending time on thorough testing is likely a waste. If a system is small, short lived, or less-important, a few high level tests are probably all you'll really need.

All of the example environments and each other type of environment share one common trait: You should always look at your context and see what kind of tests and what level of testing will make you more effective.

Tests Are Tools
The tests are really nothing more than a means to an end. You don't need tests for the sake of having tests, you need malleable software, bullet-proof software, internal software, or some other type of software. Testing is simply another tool that you can use to decrease the amount of time it takes to get your job done.

Testing can help you:
  • Design
  • Protect against regression
  • Achieve sign-off
  • Increase customer interaction
  • Document the system
  • Refactor confidently
  • Ensure the system works correctly
  • ...
When asking how and what you should test, start by thinking about the goal of your project. Once you understand your goal, select the tests that will help you achieve it. Different goals will definitely warrant different testing patterns. If you start using a specific testing pattern and it hurts, you're probably using a pattern you don't need, or you've implemented the pattern incorrectly. Remember, we're all still figuring this out, so there aren't really patterns that are right; just patterns that are right in a given context.

[Thanks to Jack Bolles, Nat Pryce, Mike Mason, Dan Bodart, Carlos Villela, Martin Fowler, and Darren Hobbs for feedback on this entry]


  1. Anonymous, 1:07 AM

    Spot on!

    This past year has seen something approaching a cargo cult-like fixation on testing in the Ruby community. We test so that we can deliver a product that does what it is intended to do. Testing itself is a means to an end but not the only one nor necessarily even the best one.

  2. Anonymous, 5:13 AM

    It's refreshing to read such a measured article about developer testing. So many articles about testing take extreme views and encourage polarization within the community rather than mature debate.

    I think it's a very common pattern to "throw the baby out with the bath water" when testing starts causing pain. Your article contains a lot of good advice about how to avoid doing this. Thanks.

  3. "what matters is what you deliver to the customer"

    What Joel and Jeff missed, in their missionary zeal to be super-pragmatic (and let's not forget that both of them actually have shipped some pretty decent stuff), is that some of the stuff they're dismissing, or at least poking fun at, might actually have a positive influence on what gets delivered to the customer. It could be more reliable, it could arrive sooner, it could deliver a better experience. Or not, of course, but it could. What had me muttering (and getting curious looks on the commuter train where I listened to the episode) was that they appeared not to consider that they might not know everything there was to know about making customers happy.

    Neither appeared to understand the distinction between pre-code tests as a design activity and post-code testing as a verification exercise. The "sometimes I know the code will work so I don't write a test" excuse is a trap into which I've fallen uncountable times; mostly, the code does work.

    I'm no guru and my testing capacities are a long way short of what I'd like. On top of that I'm less than 100% SOLID; I'm distinctly shaky on I and D, for example. But I am at least still open to the notion that there are things I can still learn, even (especially!) after 30 years of software development.

    Maybe Joel and Jeff just aren't old enough to realise how much they don't know yet.

    On the plus side, theirs is the podcast that most often makes me want to join in the conversation, so they're doing something right, even if their development processes could be righter.

  4. If it hurts you may be doing it wrong...or right. You mentioned that testing is hard. I agree with that if you take a disciplined yet pragmatic approach. However, if you've been doing it wrong for years (e.g., it takes 8 weeks to push something to production) then you can bet your ass that doing it right (or even marginally better) is going to hurt. And in a mid-sized or large organization that pain would be enough to kill the organization. Most organizations choose to create more technical debt for themselves rather than accept that pain.

  5. I wonder why after.. how many? at least 10 years of junit.. testing patterns are still in "beta"?

  6. Travis, 10:39 PM

    Try thinking about testing as a means of designing software, rather than for ensuring quality. If you think of it that way (API first), your arguments for/against TDD will change.
