Jay Fields' Thoughts: testing immaturity

Wednesday, December 17, 2014

Working Effectively with Unit Tests Official Launch

Today marks the official release of Working Effectively with Unit Tests. The book is available in various formats:

DRM free pdf, epub, & mobi (Kindle) at http://leanpub.com/wewut

Softcover at http://amzn.com/1503242706

Kindle edition at http://amzn.com/B00QS2HXUO

I’m very happy with the final version. Michael Feathers wrote a great foreword. I incorporated feedback from dozens of people - some who have been friends for years, and some whom I’d never previously met. I can’t say enough great things about http://leanpub.com, and I highly recommend it for getting an idea out there and making it easy to get fast feedback.

As for the softcover edition, I had offers from a few major publishers, but in the end none of them would allow me to continue to sell on leanpub at the same time. I strongly considered caving to the demands of the major publishers, but ultimately the ability to create a high quality softcover and make it available on Amazon was too tempting to pass up.

The feedback has been almost universally positive - the reviews are quite solid on goodreads (http://review.wewut.com). I believe the book provides specific, concise direction for effective Unit Testing, and I hope it helps increase the quality of the unit tests found in the wild.

If you'd like to try before you buy, there's a sample available in pdf format or on the web.

Wednesday, February 04, 2009

Thoughts on Developer Testing

This morning I read Joel: From Podcast 38 and it reminded me how immature developers are when it comes to testing. In the entry Joel says:

a lot of people write to me, after reading The Joel Test, to say, "You should have a 13th thing on here: Unit Testing, 100% unit tests of all your code."

At that point my interest is already piqued. Unit Testing 100% of your code is a terrible goal and I'm wondering where Joel is going to go with the entry. Overall I like the entry (which is really a transcribed discussion), but two things in the entry left me feeling uneasy.

Joel doesn't come out and say it, but I got the impression he's ready to throw the baby out with the bath water. Unit testing 100% of your code is a terrible goal, but that doesn't mean unit testing is a bad idea. Unit testing is very helpful, when done in a way that provides a positive return on investment (ROI).
Jeff hits it dead on when he says:
...what matters is what you deliver to the customer...
Unfortunately, I think he's missing one reality: Often, teams don't know what will make them more effective at delivering.

I think the underlying problem is: People don't know why they are doing the things they do.

A Painful Path
Say you read Unit Testing Tips: Write Maintainable Unit Tests That Will Save You Time And Tears and decide that Roy has shown you the light. You're going to write all your tests with Roy's suggestions in mind. You get the entire team to read Roy's article and everyone adopts the patterns.

All's well until you start accidentally breaking tests that someone else wrote and you can't figure out why. It turns out that some object created in the setup method is causing unexpected failures after your 'minor' change created an unexpected side-effect. So, now you've been burned by setup and you remember the blog entry by Jim Newkirk where he discussed Why you should not use SetUp and TearDown in NUnit. Shit.

You do more research on setup and stumble upon Inline Setup. You can entirely relate and go on a mission to switch all the tests to xUnit.net, since xUnit.net removes the concept of setup entirely.

Everything looks good initially, but then a few constructors start needing more dependencies. Every test creates its own instance of an object; you moved the object creation out of the setup and into each individual test. So now every test that creates that object needs to be updated. It becomes painful every time you add an argument to a constructor. Shit. Again.
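Both halves of this story can be sketched in a few lines. The sketch below uses Python's unittest rather than NUnit or xUnit.net, and the Account class and its tests are hypothetical names made up purely for illustration. The first test class shares an object through setUp, so a change to that object can break tests far away from it. The second creates the object inline, which removes the hidden coupling but repeats the constructor call in every test.

```python
import unittest


class Account:
    """Hypothetical domain object, invented for this sketch."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


class SharedSetupTest(unittest.TestCase):
    def setUp(self):
        # Every test depends on this object; change it and distant tests can fail.
        self.account = Account("pat", 100)

    def test_withdraw_reduces_balance(self):
        self.account.withdraw(30)
        self.assertEqual(self.account.balance, 70)

    def test_owner_is_recorded(self):
        self.assertEqual(self.account.owner, "pat")


class InlineSetupTest(unittest.TestCase):
    def test_withdraw_reduces_balance(self):
        # Self-explanatory, but the constructor call is repeated in every test,
        # so a new constructor argument means editing all of them.
        account = Account("pat", 100)
        account.withdraw(30)
        self.assertEqual(account.balance, 70)

    def test_owner_is_recorded(self):
        account = Account("pat", 100)
        self.assertEqual(account.owner, "pat")
```

Neither shape is wrong in itself; each simply moves the cost to a different place.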

The Source of Your Pain
The problem is, you never asked yourself why. Why are you writing tests in the first place? Each testing practice you've chosen, what value is it providing you?

Your intentions were good. You want to write better software, so you followed some reasonable advice. But, now your life sucks. Your tests aren't providing a positive ROI, and if you keep going down this path you'll inevitably conclude that testing is stupid and it should be abandoned.

Industry Experts
Unfortunately, you can't write better software by blindly following the dogma of 'industry experts'.

First of all, I'm not even sure we have any industry experts on developer testing. Rarely do I find consistently valuable advice about testing. Relevance, who employs some of the best developers in the world, used to put 100% code coverage in their contracts. Today, that's gone, and you can find Stu discussing How To Fail With 100% Code Coverage. ObjectMother, which was once praised as brilliant, has now been widely replaced by Test Data Builders. I've definitely written my fair share of stupid ideas. And, the examples go on and on.
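For readers who haven't met the two patterns named above, here is a minimal sketch of the difference as it is commonly described. It is written in Python, and the Customer class, its fields, and the method names are all hypothetical, chosen only for illustration. An ObjectMother offers one factory method per canned scenario, while a Test Data Builder starts from defaults and lets each test override only the detail it cares about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical domain object, invented for this sketch."""
    name: str
    country: str
    credit_limit: int


class CustomerMother:
    """ObjectMother style: one named factory method per canned scenario."""

    @staticmethod
    def uk_customer_with_low_credit():
        return Customer(name="Pat", country="UK", credit_limit=10)


class CustomerBuilder:
    """Test Data Builder style: safe defaults, override only what matters."""

    def __init__(self):
        self._name = "Pat"
        self._country = "UK"
        self._credit_limit = 1000

    def with_country(self, country):
        self._country = country
        return self

    def with_credit_limit(self, credit_limit):
        self._credit_limit = credit_limit
        return self

    def build(self):
        return Customer(self._name, self._country, self._credit_limit)


# A test states only the detail it cares about; the rest stays at defaults.
low_credit = CustomerBuilder().with_credit_limit(10).build()
```

With a builder, a new field on Customer usually means one new default in the builder rather than an edit to every test. Whether that trade is worth it still depends on your context.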

We're still figuring this stuff out. All of us.

Enlightenment
There may not be experts on developer testing, but there are good ideas around specific contexts. Recognizing that there are smart people with contextually valuable ideas about testing is very liberating. Suddenly you don't need to look for the testing silver bullet; instead you have various patterns available (some conflicting) that may or may not provide you value based on your working context.

Life would be a lot easier if someone could direct you to the patterns that will work best for you. Unfortunately, we're not at that level of maturity. It's true that if you pick patterns that don't work well for your context, you definitely won't see positive ROI from testing in the short term. But, you will have gained experience that you can use in the future to be more effective.

It's helpful to remember that there aren't testing silver bullets. That way you won't get led down the wrong path when you see someone recommending 100% code coverage or other drastic and often dogmatic approaches to developer testing.

Today's Landscape
Today's testing patterns are like beta software. The patterns have been tested internally, but are rarely proven in the wild. As such, the patterns will sometimes work given the right context, and other times they will shit the bed.

I focus pretty heavily on testing and I've definitely seen my fair share of test pain. I once joined a team that spent 75% of their time writing tests and 25% of their time delivering features. Not a member of the team was happy with the situation, but the business demanded massively unmaintainable Fit tests.

Of course, we didn't start out spending 75% of our time writing Fit tests. As the project grew in size, so did the effort needed to maintain the Fit tests. That kind of problem creeps up on a team. You start by spending 30% of your time writing tests, but before you know it, the tests are an unmaintainable mess. This is where I think Jeff's comments, with regard to writing tests that enable delivery, fall a bit short. Early on, Fit provided positive ROI. However, eventually, Fit's ROI turned negative. Unfortunately, by then the business demanded a Fit test for every feature delivered. We dug ourselves a hole we couldn't get out of.

The problem wasn't the tool. It was how the process relied on the Fit tests. The developers were required to write and maintain their functional tests using Fit, simply because Fit provided a pretty, business readable output. We should have simply created a nice looking output for our NUnit tests instead. Using Fit hurt, because we were doing it wrong.

The current lack of maturity around developer testing makes it hard to make the right choice when picking testing tools and practices. However, the only way to improve is to keep innovating and maturing the current solutions.

If It Hurts, You're Doing It Wrong
Doing it right is hard. The first step is understanding why you use the patterns you've chosen. I've written before about the importance of context. I can explain, in detail, my reasons for every pattern I use while testing. I've found that having motivating factors for each testing pattern choice is critical for ensuring that testing doesn't hurt.

Being pragmatic about testing patterns also helps. Sometimes your favorite testing pattern won't fit your current project. You'll have to let it go and move on. For example, on my current Java project each test method has a descriptive name. I maintain that (like methods and classes) some tests are descriptive enough that a name is superfluous, but since JUnit doesn't allow me to create anonymous test methods I take the path of least resistance. I could write my own Java testing framework and convince the team to use it, but it would probably hurt. The most productive way to test Java applications is with JUnit, and if I did anything else, I'd be doing it wrong.

I can think of countless examples of people doing it wrong and dismissing the value of a contextually effective testing pattern. The biggest example is fragile mocking. If your mocks are constantly, unexpectedly failing, you're doing something wrong. It's likely that your tests suffer from High Implementation Specification. Your tests might be improved by replacing some mocks with stubs. Or, it's possible that your domain model could be written in a superior way that allowed more state based testing. There's no single right answer, because your context determines the best choice.
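A minimal sketch of the mock-versus-stub difference, assuming Python's unittest.mock rather than any library discussed in this entry. The total_with_tax function and the tax service with its rate_percent_for method are hypothetical, invented for illustration. The first test pins down how the collaborator is called, which is one way a test ends up specifying the implementation. The second supplies a canned answer and checks only the result.

```python
from unittest.mock import Mock


def total_with_tax(order_total, tax_service):
    """Hypothetical function under test, invented for this sketch."""
    rate_percent = tax_service.rate_percent_for("UK")
    return order_total * (100 + rate_percent) // 100


# Mock-heavy test: asserts on the interaction as well as the result.
tax_service = Mock()
tax_service.rate_percent_for.return_value = 20
assert total_with_tax(100, tax_service) == 120
# This line fails if the call shape changes, even when the total is still right.
tax_service.rate_percent_for.assert_called_once_with("UK")


# Stub-style test: a canned answer, and an assertion on the outcome only.
class StubTaxService:
    def rate_percent_for(self, country):
        return 20


assert total_with_tax(100, StubTaxService()) == 120
```

If the interaction itself is the behavior you care about, the mock assertion earns its keep. If it isn't, it is a likely source of fragile failures.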

Another common pain point in testing is duplicate code. People go to great lengths to remove duplication, often at the expense of readability. Setup methods, contexts, and helper methods are all band-aids for larger problems. The result of these band-aids is tests that are painful to maintain. However, there are other options. In the sensationally named entry Duplicate Code in Your Tests I list 3 techniques that I've found to be vastly superior to setup, contexts and helper methods. If those techniques work for you, that's great. If they don't, don't just shove your trash in setup and call it a day. Look for your own testing innovations that the rest of us may benefit from.

If something hurts, don't look for a solution that hurts slightly less, find something that is a joy to work with. And, share it with the rest of us.

Tests Should Make You More Effective
What characterizes something as 'effective' can vary widely based on your context.

Some software must be correct or people die. This software obviously requires thorough testing. Other software systems are large and need to evolve at a fairly rapid pace. Delivering at a rapid pace while adding features almost always requires a fairly comprehensive test suite, to ensure that regression bugs don't slip in.

Conversely, some software is internal and not mission critical. In that case, unhandled exceptions aren't really a big deal and testing is clearly not as high a priority. Other systems are small and rewritten on a fairly consistent basis, thus spending time on thorough testing is likely a waste. If a system is small, short lived, or less-important, a few high level tests are probably all you'll really need.

All of the example environments and each other type of environment share one common trait: You should always look at your context and see what kind of tests and what level of testing will make you more effective.

Tests Are Tools
The tests are really nothing more than a means to an end. You don't need tests for the sake of having tests, you need malleable software, bullet-proof software, internal software, or some other type of software. Testing is simply another tool that you can use to decrease the amount of time it takes to get your job done.

Testing can help you:

Design
Protect against regression
Achieve sign-off
Increase customer interaction
Document the system
Refactor confidently
Ensure the system works correctly
...

Conclusion
When asking how and what you should test, start by thinking about what the goal of your project is. Once you understand your goal, select the tests that will help you achieve your goal. Different goals will definitely warrant using different testing patterns. If you start using a specific testing pattern and it hurts, you're probably using a pattern you don't need, or you've implemented the pattern incorrectly. Remember, we're all still figuring this out, so there aren't really patterns that are right; just patterns that are right in a given context.

[Thanks to Jack Bolles, Nat Pryce, Mike Mason, Dan Bodart, Carlos Villela, Martin Fowler, and Darren Hobbs for feedback on this entry]

Thursday, July 03, 2008

The Immaturity of In Browser Testing

Designing applications that behave the same in several browsers is a miserable job. Unfortunately, it's often a business requirement. If your application needs to behave flawlessly in multiple browsers, In Browser testing is probably a necessary evil.

I tend to use Selenium for In Browser testing; therefore, this entry is written from the perspective of a Selenium user.

Selenium is terrible for several reasons.

There are several ways to drive Selenium, and none of them are particularly mature. Should you use SeleniumRC, Selenium on Rails, the in browser recorder, or some other half-baked solution? I don't have the answer. I've used all 3 of the named solutions and found them all to be problematic. Yes, the problems can be gotten around, but they are there and solving them costs time.
There are several languages for writing tests. Should you use Java, Ruby, Python, Perl, etc? I have no idea. Having the choice might seem like a good thing -- until the person who was writing the majority of tests leaves and the next person to take on the Selenium suite decides he wants to use another language. The languages are also fairly clunky. I can't help wondering if a better solution would have been to create a DSL specific to the in browser testing space.
I could have written this entire blog entry before most of the Selenium suites in the world would finish. In Browser testing is almost unacceptably slow. Selenium Grid sets out to solve this problem. So you should use that, right? Not exactly. It's not worth the effort unless you have a large suite, and it requires you to go down the SeleniumRC path, which may or may not be the right choice for you.
Selenium suites quickly reach the size where their value is not proportionate to the amount of effort the tests require to maintain. Thinking about throwing your suite away? If you do you'll be joining a very large club of developers who decided to dump their Selenium suites. It is very hard to design a large Selenium test suite that provides value. I've heard of several suites that were thrown out and only one suite that was large and the team believed it provided value. I guess there's hope: one team managed to get it right.
Browsers are buggy. While Selenium itself might justify its value, spending a week figuring out what the latest bug in IE is starts to call into question the value of the Selenium suite. Of course, you can stop testing in IE, since only IE breaks the build, but if you need to deploy to an environment where users will be on IE... you're in a bad spot.
Selenium is great for verifying that everything works as expected, but when a test breaks you get little information on what the problem is. Since the tests are running at such a high level, it's unlikely that you'll be able to easily identify the majority of defects based on the broken Selenium test. The broken test is a great tip that something is wrong, but you'll likely need to do some digging to figure out exactly what is wrong.

Of course, there is another point of view. There are several reasons that Selenium is a good tool.

The only real way to know that your application runs in all browsers is to test it in all browsers. Selenium makes it possible to run the same tests regardless of browser.
The only way to verify that all the pieces of your application integrate perfectly is to test against the entire application stack. Selenium provides a great tool for simulating user experience.
Once you make a decision on what version to use and what language to use, writing tests is easy. Getting started with Selenium takes very little time, including time for learning.
For those less than technical team members, the Selenium recorder can be a great tool for creating tests.
Selenium also represents a tool that is helpful for both developers and testers.

The trick to using Selenium is knowing who (for), what (for), when, and why it's useful. For those that desire concise descriptions -- Selenium is best used by developers or testers when testing the most valuable (to the business) happy paths of a Javascript heavy web application that must function in several browsers.

When you begin to deviate from the above context, things begin to get problematic.

Selenium is undoubtedly a tool that can be used by both developers and testers. The various ways to drive the tool ensure that less than technical users and very technical users both have options. Selenium is best used for happy path testing because large suites can be both hard to maintain and prohibitively slow. Selenium is an appropriate choice for Javascript heavy applications since the tests run directly in the browser, ensuring expected behavior. Selenium is also helpful for mitigating cross-browser compatibility risks. The write once, run in several browsers model is a powerful one. You should choose Selenium when it can improve your confidence that the highest business value features are working correctly.

Despite its problems, it would be misleading not to mention that Selenium is probably the best solution for In Browser testing, but there is surely room for improvement.

Monday, June 16, 2008

Immaturity of Developer Testing

The ThoughtWorks UK AwayDay was last Saturday. You could over-simplify it as an internal conference with some focus on technology, and extra emphasis on fun. At the last minute one of the presenters cancelled so George Malamidis, Danilo Sato, and I put together a quick session -- Immaturity of Developer Testing.

It's no secret that I'm passionate about testing. The same is true of Danilo and George, and several of our colleagues. We thought it would be fun to get everyone in a room, argue a bit about testing, and then bring it all together by pointing out that answers are contextual and the current solutions aren't quite as mature as they are often portrayed. To encourage everyone to speak up and increase the level of honesty we also brought a full bottle of scotch.

We put together 5 sections of content, but we only managed to make it through the first section in our time slot. I'll probably post the other 4 sections in subsequent blog posts, but this entry will focus on the high level topics from the talk and the ideas presented by the audience.

Everyone largely agreed that tests are generally treated as second class citizens. We also noted that test technical debt is rarely addressed as diligently as application technical debt is. In addition, problems with tests are often handled by creating band-aids such as your own test case subclass that hides an underlying problem, testing frameworks that run tests in parallel, etc. To be clear, running tests in parallel is a good thing. However, if you have a long running build because of underlying issues and you solve it by running the tests in parallel, that's a band-aid, not a solution. The poorly written tests may take 10 minutes right now. If you run the tests in parallel it might take 2 minutes today, but when you are back to 10 minutes you now have ~5 times as many problematic tests. That's not a good position to be in. Don't hide problems with abstractions or hardware; tests are as important as application code.
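The "~5 times" figure follows from simple arithmetic. The sketch below works it through in Python, under two assumptions that are mine rather than anything measured: the parallel speedup stays constant, and total test time grows linearly with the number of tests.

```python
serial_minutes = 10    # today's suite, run one test at a time
parallel_minutes = 2   # the same suite spread across workers
speedup = serial_minutes / parallel_minutes

# Assumption: the speedup stays constant as the suite grows.
# When the parallel build is back to 10 minutes, the serial cost is:
serial_minutes_when_slow_again = 10 * speedup

# Assumption: test time grows linearly with the number of tests.
growth_in_tests = serial_minutes_when_slow_again / serial_minutes

print(speedup, serial_minutes_when_slow_again, growth_in_tests)
```

In other words, a build that is slow again despite the extra hardware is carrying roughly 50 minutes of serial test time, not 10.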

Another (slightly controversial) topic was the goal of testing. George likes the goal of confidence. I love the confidence that tests give me, but I prefer to focus on Return On Investment (ROI). I think George and I agree in principle, but articulate it differently. We both think that as an industry we've lost a bit of focus. One hundred percent test coverage isn't a valuable goal. Instead it's important to test the code that provides the most business value. Test code must be maintained; therefore, you can't always afford to test everything. Even if you could, no automated test suite can ever replace exploratory testing. Often there are tests that are so problematic that it's not valuable to automate them.

The talk was built on the idea that context is king when talking about testing, but it quickly devolved into people advocating for their favorite frameworks or patterns. I ended up taking a side also, in an attempt to show that it's not as easy as right and wrong. I knew the point of view that some of the audience was taking, but I didn't get the impression that they were accepting the other point of view. We probably spent too much time on a few details. Of course, the scotch probably had something to do with that.

I wish we could have gotten back on track, but we ended up running out of time. After the talk several people said they enjoyed it quite a bit, and a few people said they began to see the opposing points of view. I think it was a good thing overall, but it's also clear to me that some people still think there are absolutely correct and incorrect answers... which is a shame.

Next up, pro and con lists for browser based testing tools, XUnit, anonymous tests, behavior driven development, and synthesized testing.