Wednesday, November 19, 2008
Ubiquitous Assertion Syntax
One thing that really bothers me about testing is having various different assertion syntaxes. Take a look at the following JUnit examples. (don't worry, I'll be picking on Ruby and .net frameworks as well)
In both tests I want to verify something; however the two tests use different mechanisms for verification. This adds pain for anyone reading or maintaining the test. When determining what a test is verifying you need to look for assertions as well as expected exceptions in the annotation.
When you start adding behavior based tests the situation gets even worse.
Testing 1.0
3 different tests, 3 different ways to verify your code does what you expect it to. To make matters worse the mock expectations live in the body of the test, so there's no guarantee that when looking for assertions you only need to look at the last few lines of the test. Every time you encounter a test you must spend time looking at the test, top to bottom, and determine what is actually being tested.
The tests above are simple, it's a much bigger problem when a test uses a few different forms of verification.
Okay, it would be terrible to run into a test that bad. You might work with people who are smart enough not to do such a thing, but it's very common to see tests that have both assertions and methods that are mocked. You may not consider mocked methods to be a form of verification, but if one of those methods isn't called as you specified it should be called, your test will fail. If it can cause your test to fail, it's a form of verification in my book.
In Java, there are at least 3 ways to define expected behavior, and often a test mixes more than one. This is not a good situation for a test reader or maintainer.
In the Ruby and .net worlds, the story isn't much better. The state based assertions are specified differently than the behavior based assertions; however, both Ruby and .net have a superior way of handling expected exceptions: assert_raises and Assert.Throws.
Below are similar tests in test/unit with mocha (Ruby), with an example of assert_raises for handling the expected exception.
And, for the .net crowd, here's the NUnit + NMock version of the 3 types of tests. As previously mentioned, NUnit provides an assertion for expecting exceptions, but there's still a mismatch between the state based and behavior based assertions. (full disclosure, I don't have Visual Studio running on my Mac, so if there's a typo, forgive me)
Testing 2.0
In each language there have been steps forward. Both Java and C# have moved in the right direction regarding placement of behavior based expectations. Ruby still suffers from setup placement of mock expectations, but at least the behavior based expectations follow the same assertion syntax that the state based assertions use.
In Java, the Mockito framework represents a step in the right direction, moving the mock expectations to the end of the test.
Likewise, in .net, Rhino Mocks allows you to use "Arrange, Act, Assert Syntax (AAA)". The result of AAA is (hopefully) all of your tests put their assertions at the end.
These steps are good for the big two (Java & C#); however, I'd say they are still a bit too far away from ubiquitous assertion syntax for my taste.
RSpec represents forward progress in the Ruby community.
RSpec does a good job of starting all their assertions with "should", (almost?) to a fault. It's not surprising that RSpec was able to somewhat unify the syntax, since the testing framework provides the ability to write both state based and behavior based tests. Looking at the tests you can focus on scanning for the word "should" when looking for what's being tested.
Unfortunately the unified "should" syntax results in possibly the ugliest assertion ever: lambda {…}.should raise_error(Exception). And, as I previously stated RSpec still suffers from setup placement of the mock expectation "should" methods. Perhaps RSpec could benefit from AAA or some type of test spy (example implementation available on pastie.org).
Next Generation Testing (Testing 3.0?)
Eventually I expect all testing frameworks will follow RSpec's lead and include their own mocking support, though I'm not holding my breath since it's been two years since I originally suggested that this should happen. Today's landscape looks largely the same as it did 2 years ago. When testing frameworks take that next step, the syntax should naturally converge.
In the Ruby world you do have another option, but one with serious risk. I've been looking for ubiquitous assertion syntax for so long that I rolled my own framework in Ruby: expectations.
Expectations standardizes on both the location of your assertions (expectations) and how you express them. The following code shows how you can expect a state based result, an exception, and a behavior based result.
As a result of expectations implementation you can always look at the first line of a test and know exactly what you are testing.
Full disclosure: Before you go and install expectations and start using it for your production application, you need to know one big problem: there's no support for expectations. I'm no longer doing Ruby full-time and no one has stepped up to maintain the project. It's out there, it works, and it's all yours, but it comes with no guarantees.
In the .net and Java worlds, the future of testing looks less... evolved. In the .net world, the ever focused on testing Jim Newkirk has teamed up with Brad Wilson to create xUnit.net. xUnit.net represents evolution in the .net space, but as far as I can tell they haven't done much in the way of addressing ubiquitous assertion syntax. In Java, I don't see any movement towards addressing the issue, but can it even happen without closures (anonymous methods, delegates, whatever)?
I'm surprised that more people aren't bothered by the lack of ubiquitous assertion syntax. Perhaps we have become satisfied with disparity in syntax and the required full test scan.
@Test public void simpleAdd() {
assertEquals(2, 1+1);
}
@Test(expected= IndexOutOfBoundsException.class) public void empty() {
new ArrayList<Object>().get(0);
}In both tests I want to verify something; however the two tests use different mechanisms for verification. This adds pain for anyone reading or maintaining the test. When determining what a test is verifying you need to look for assertions as well as expected exceptions in the annotation.
When you start adding behavior based tests the situation gets even worse.
Mockery context = new Mockery();
@Test public void simpleAdd() {
assertEquals(2, 1+1);
}
@Test(expected= IndexOutOfBoundsException.class) public void empty() {
new ArrayList<Object>().get(0);
}
@Test public void forWorksCorrectly() {
final List<Integer> list = context.mock(List.class);
context.checking(new Expectations() {{
one(list).add(1);
one(list).add(2);
}});
for (int i=1; i < 3; i++) {
list.add(i);
}
}Testing 1.0
3 different tests, 3 different ways to verify your code does what you expect it to. To make matters worse the mock expectations live in the body of the test, so there's no guarantee that when looking for assertions you only need to look at the last few lines of the test. Every time you encounter a test you must spend time looking at the test, top to bottom, and determine what is actually being tested.
The tests above are simple, it's a much bigger problem when a test uses a few different forms of verification.
Mockery context = new Mockery();
@Test(expected= IndexOutOfBoundsException.class) public void terribleTest() {
final List<Integer> original = new ArrayList<Integer>(asList(1, 2));
assertEquals(2, original.size());
final List<Integer> list = context.mock(List.class);
context.checking(new Expectations() {{
one(list).add(1);
one(list).add(2);
}});
for (Integer anOriginal : original) {
list.add(anOriginal);
}
original.clear();
original.get(0);
}Okay, it would be terrible to run into a test that bad. You might work with people who are smart enough not to do such a thing, but it's very common to see tests that have both assertions and methods that are mocked. You may not consider mocked methods to be a form of verification, but if one of those methods isn't called as you specified it should be called, your test will fail. If it can cause your test to fail, it's a form of verification in my book.
In Java, there are at least 3 ways to define expected behavior, and often a test mixes more than one. This is not a good situation for a test reader or maintainer.
In the Ruby and .net worlds, the story isn't much better. The state based assertions are specified differently than the behavior based assertions; however, both Ruby and .net have a superior way of handling expected exceptions: assert_raises and Assert.Throws.
Below are similar tests in test/unit with mocha (Ruby), with an example of assert_raises for handling the expected exception.
assert_equal 2, 1+1
end
assert_raises(NoMethodError) {[].not_a_method }
end
array = []
array.expects(:<<).with(1)
array.expects(:<<).with(2)
[1, 2].each do |number|
array << number
end
endAnd, for the .net crowd, here's the NUnit + NMock version of the 3 types of tests. As previously mentioned, NUnit provides an assertion for expecting exceptions, but there's still a mismatch between the state based and behavior based assertions. (full disclosure, I don't have Visual Studio running on my Mac, so if there's a typo, forgive me)
private Mockery mocks = new Mockery();
[Test]
{
Assert.AreEqual(2, 1+1);
}
[Test]
{
Assert.Throws<ArgumentException>(delegate { throw new ArgumentException() } );
}
[Test]
{
IList list = mocks.NewMock<IList>();
Expect.Once.On(list).Method("Add").With(1);
Expect.Once.On(list).Method("Add").With(2);
for (int i=1; i < 3; i++) {
list.Add(i);
}
mocks.VerifyAllExpectationsHaveBeenMet();
}Testing 2.0
In each language there have been steps forward. Both Java and C# have moved in the right direction regarding placement of behavior based expectations. Ruby still suffers from setup placement of mock expectations, but at least the behavior based expectations follow the same assertion syntax that the state based assertions use.
In Java, the Mockito framework represents a step in the right direction, moving the mock expectations to the end of the test.
@Test public void forRevisited() {
List list = mock(List.class);
for (int i=1; i < 3; i++) {
list.add(i);
}
verify(list).add(1);
verify(list).add(2);
}Likewise, in .net, Rhino Mocks allows you to use "Arrange, Act, Assert Syntax (AAA)". The result of AAA is (hopefully) all of your tests put their assertions at the end.
{
var list = MockRepository.GenerateStub<List>();
for (int i=1; i < 3; i++) {
list.Add(i);
}
list.AssertWasCalled( x => x.Add(1));
list.AssertWasCalled( x => x.Add(2));
}These steps are good for the big two (Java & C#); however, I'd say they are still a bit too far away from ubiquitous assertion syntax for my taste.
RSpec represents forward progress in the Ruby community.
it "should test state" do
2.should == 1+1
end
it "should test an error" do
lambda {[].not_a_method }.should raise_error(NoMethodError)
end
it "should test behavior" do
array = []
array.should_receive(:<<).with(1)
array.should_receive(:<<).with(2)
[1, 2].each do |number|
array << number
end
end
RSpec does a good job of starting all their assertions with "should", (almost?) to a fault. It's not surprising that RSpec was able to somewhat unify the syntax, since the testing framework provides the ability to write both state based and behavior based tests. Looking at the tests you can focus on scanning for the word "should" when looking for what's being tested.
Unfortunately the unified "should" syntax results in possibly the ugliest assertion ever: lambda {…}.should raise_error(Exception). And, as I previously stated RSpec still suffers from setup placement of the mock expectation "should" methods. Perhaps RSpec could benefit from AAA or some type of test spy (example implementation available on pastie.org).
# this doesn't work, but maybe it should....
it "should test behavior" do
array = Spy.on []
[1, 2].each do |number|
array << number
end
array.should have_received << 1
array.should have_received << 2
endNext Generation Testing (Testing 3.0?)
Eventually I expect all testing frameworks will follow RSpec's lead and include their own mocking support, though I'm not holding my breath since it's been two years since I originally suggested that this should happen. Today's landscape looks largely the same as it did 2 years ago. When testing frameworks take that next step, the syntax should naturally converge.
In the Ruby world you do have another option, but one with serious risk. I've been looking for ubiquitous assertion syntax for so long that I rolled my own framework in Ruby: expectations.
Expectations standardizes on both the location of your assertions (expectations) and how you express them. The following code shows how you can expect a state based result, an exception, and a behavior based result.
expect 2 do
1 + 1
end
expect NoMethodError do
[].no_method_error
end
expect Object.new.to.receive(:ping).with(:pong) do |obj|
obj.ping(:pong)
endAs a result of expectations implementation you can always look at the first line of a test and know exactly what you are testing.
Full disclosure: Before you go and install expectations and start using it for your production application, you need to know one big problem: there's no support for expectations. I'm no longer doing Ruby full-time and no one has stepped up to maintain the project. It's out there, it works, and it's all yours, but it comes with no guarantees.
In the .net and Java worlds, the future of testing looks less... evolved. In the .net world, the ever focused on testing Jim Newkirk has teamed up with Brad Wilson to create xUnit.net. xUnit.net represents evolution in the .net space, but as far as I can tell they haven't done much in the way of addressing ubiquitous assertion syntax. In Java, I don't see any movement towards addressing the issue, but can it even happen without closures (anonymous methods, delegates, whatever)?
I'm surprised that more people aren't bothered by the lack of ubiquitous assertion syntax. Perhaps we have become satisfied with disparity in syntax and the required full test scan.
Labels: testing
Tuesday, September 30, 2008
Testing Dynamic Web Applications
At RailsConf Europe, in the Q & A portion of my talk on Functional Testing someone asked what I recommend for testing Javascript.
Ugh. Testing Javascript. Is it possible to recommend something when everything you've seen is terrible? Usually I'm cool with picking the tool that sucks the least, but when it comes to Javascript testing the only words that come to mind are: epic fail.
In the past I've failed in two different ways.
A few days later it occurred to me what the correct answer was -- Don't use Javascript and you won't have to test it.
No, I'm not suggesting that we should all go back to mostly static websites. Static content is fine for some things, but GMail is an obvious example of a site that is better done dynamically.
However, Javascript isn't your only choice for highly dynamic websites.
These days, if I were writing a website that required any dynamic interaction I would absolutely use Flex or Silverlight. I've done Flex, and it was nice to work with, but I must admit that I'm lured to Silverlight because it's going to (or does already?) support Ruby.
I'm not sure what the Silverlight testing story is, but I found Flex (and ActionScript) to be quite testable. The single biggest win (as I've said before) is that I no longer need to do in-browser testing. Removing the browser from the equation is huge. No more IE bugs causing your tests to fail, no long start up times as the browser is run, etc.
Testing with FlexUnit (with it's drawbacks) is an order of magnitude better than any experience I've had testing Javascript.
I'm comfortable saying that I would still use Javascript for trivial features that provided so little business value that they did not warrant testing. However, any features that provide noticeable business value must be tested, and I would move to a RIA solution instead.
In my experience, the benefits of switching to a RIA solution are dramatic, one of the largest being: you no longer need to worry about testing Javascript.
Ugh. Testing Javascript. Is it possible to recommend something when everything you've seen is terrible? Usually I'm cool with picking the tool that sucks the least, but when it comes to Javascript testing the only words that come to mind are: epic fail.
In the past I've failed in two different ways.
- Selenium: Too slow and to brittle for any decent size test suite. There's more on that if you wish.
- Some javascript unit testing framework. I can't even remember the name. The syntax was ugly, the tests weren't easy to write, and the runner didn't integrate with any automated tools we used -- lame.
A few days later it occurred to me what the correct answer was -- Don't use Javascript and you won't have to test it.
No, I'm not suggesting that we should all go back to mostly static websites. Static content is fine for some things, but GMail is an obvious example of a site that is better done dynamically.
However, Javascript isn't your only choice for highly dynamic websites.
These days, if I were writing a website that required any dynamic interaction I would absolutely use Flex or Silverlight. I've done Flex, and it was nice to work with, but I must admit that I'm lured to Silverlight because it's going to (or does already?) support Ruby.
I'm not sure what the Silverlight testing story is, but I found Flex (and ActionScript) to be quite testable. The single biggest win (as I've said before) is that I no longer need to do in-browser testing. Removing the browser from the equation is huge. No more IE bugs causing your tests to fail, no long start up times as the browser is run, etc.
Testing with FlexUnit (with it's drawbacks) is an order of magnitude better than any experience I've had testing Javascript.
I'm comfortable saying that I would still use Javascript for trivial features that provided so little business value that they did not warrant testing. However, any features that provide noticeable business value must be tested, and I would move to a RIA solution instead.
In my experience, the benefits of switching to a RIA solution are dramatic, one of the largest being: you no longer need to worry about testing Javascript.
Labels: flex, javascript, ria, silverlight, testing
Thursday, July 03, 2008
The Immaturity of In Browser Testing
Designing applications that behave the same in several browsers is a miserable job. Unfortunately, it's often a business requirement. If your application needs to behave flawlessly in multiple browsers, In Browser testing is probably a necessary evil.
I tend to use Selenium for In Browser testing; therefore, this entry is written from the perspective of a Selenium user.
Selenium is terrible for several reasons.
When you begin to deviate from the above context, things begin to get problematic.
Selenium is undoubtably a tool that can be used by both developers and testers. The various ways to drive the tool ensure that both less than technical users and very technical users both have options. Selenium is best used for happy path testing because large suites can be both hard to maintain and prohibitively slow. Selenium is an appropriate choice for Javascript heavy applications since the tests run directly in the browser, ensuring expected behavior. Selenium is also helpful for mitigating cross-browser compatibility risks. The write once, run in several browsers model is a powerful one. You should chose Selenium when it can improve your confidence that the highest business value features are working correctly.
Despite it's problems it would be misleading not to mention that it probably is the best solution for In Browser testing, but there is surely room for improvement.
I tend to use Selenium for In Browser testing; therefore, this entry is written from the perspective of a Selenium user.
Selenium is terrible for several reasons.
- There are several ways to drive Selenium, and none of them are particularly mature. Should you use SeleniumRC, Selenium on Rails, the in browser recorder, or some other half baked solutions? I don't have the answer. I've used all 3 of the named solutions and found them all to be problematic. Yes, the problems can be gotten around, but they are there and solving them costs time.
- There are several languages for writing tests. Should you use Java, Ruby, Python, Perl, etc? I have no idea. Having the choice might seem like a good thing -- until the person who was writing the majority of tests leaves and the next person to take on the Selenium suite decides he wants to use another language. The languages are also fairly clunky. I can't help wondering if a better solution would have been to create a DSL specific to the in browser testing space.
- I could have written this entire blog entry before most of the Selenium suites in the world would finish. In Browser testing is almost unacceptably slow. Selenium Grid sets out to solve this problem. So you should use that, right? Not exactly, it's not worth the effort unless you have a large suite, and it requires you to go down the SeleniumRC path, which may or may not be the right choice for you.
- Selenium suites quickly reach the size where their value is not proportionate to the amount of effort the tests require to maintain. Thinking about throwing your suite away? If you do you'll be joining a very large club of developers who decided to dump their Selenium suites. It is very hard to design a large Selenium test suite that provides value. I've heard of several suites that were thrown out and only one suite that was large and the team believed it provided value. I guess there's hope, 1 team managed to get it right.
- Browsers are buggy. While Selenium itself might justify it's value, spending a week figuring out what the latest bug in IE is starts to call in question the value of the Selenium suite. Of course, you can stop testing in IE, since only IE breaks the build, but if you need to deploy to an environment where users will be on IE... you're in a bad spot.
- Selenium is great for verifying that everything works as expected, but when a test breaks you get little information on what the problem is. Since the tests are running at such a high level, it's unlikely that you'll be able to easily identify the majority of defects based on the broken Selenium test. The broken test is a great tip that something is wrong, but you'll likely need to do some digging to figure out exactly what is wrong.
- The only real way to know that your application runs in all browsers is to test it in all browsers. Selenium makes it possible to run the same tests regardless of browser.
- The only way to verify that all the pieces of your application integrate perfectly is to test against the entire application stack. Selenium provides a great tool for simulating user experience.
- Once you make a decision on what version to use and what language to use, writing tests is easy. Getting started with Selenium takes very little time, including time for learning.
- For those less than technical team members, the Selenium recorder can be a great tool for creating tests.
- Selenium also represents a tool that is helpful for both developers and testers.
When you begin to deviate from the above context, things begin to get problematic.
Selenium is undoubtably a tool that can be used by both developers and testers. The various ways to drive the tool ensure that both less than technical users and very technical users both have options. Selenium is best used for happy path testing because large suites can be both hard to maintain and prohibitively slow. Selenium is an appropriate choice for Javascript heavy applications since the tests run directly in the browser, ensuring expected behavior. Selenium is also helpful for mitigating cross-browser compatibility risks. The write once, run in several browsers model is a powerful one. You should chose Selenium when it can improve your confidence that the highest business value features are working correctly.
Despite it's problems it would be misleading not to mention that it probably is the best solution for In Browser testing, but there is surely room for improvement.
Labels: testing, testing immaturity
Monday, June 23, 2008
Flex: Anonymous Classes as Test Stubs
In my previous post, The State of Flex Testing, I stated that I use anonymous objects as stubs. This entry will show how to easily create stubs that stub methods and properties.
If you simply need a stub that has a property, the following code is all you'll need.
The above code creates an object that response with "ready" to the message
That's really it. You can define methods and properties on anonymous objects as you please.
Is it beautiful? Not really. I've considered a few other options would allow you to define methods without the anonymous function, but it's never been so annoying that it was worth spending the time to get the implementation correct. I suspect anyone who put their mind to it could come up with a nice implementation in a short amount of time.
If you simply need a stub that has a property, the following code is all you'll need.
{state: "ready"}The above code creates an object that response with "ready" to the message
state. Adding methods is also easy if you remember that you can create properties that store anonymous functions (just like Javascript). If you need the execute method to return true, the following stub should do the trick.{state: "ready", execute: function(){ return true }}That's really it. You can define methods and properties on anonymous objects as you please.
Is it beautiful? Not really. I've considered a few other options would allow you to define methods without the anonymous function, but it's never been so annoying that it was worth spending the time to get the implementation correct. I suspect anyone who put their mind to it could come up with a nice implementation in a short amount of time.
Flex: The State of Testing
Given my focus on testing, it's not surprising that the most common question I get about Flex is: Is it testable? Of course, the answer is yes otherwise I wouldn't be using it.
Out of the box, it's very similar to early JUnit. Like all XUnit ports, it's based on creating testing classes and defining methods that are tests. There's setup, teardown, and all the usual features you expect from XUnit ports.
There's a FlexUnit swc (library) that assists in creating a test swf file, which runs all the tests in the browser. There's a few blog entries that give details on how to generate an XML file from the test swf, so breaking a build or reporting results is possible. I haven't bothered to go that far, but others have successfully blazed that trail.
In general, testing Flex is not great, but I rarely find myself concerned or unhappy. Here's a few reasons why.
I'm not a fan of manually adding tests to my test suite, but just like Paul Gross, we dynamically create ours. So it's not a problem. We didn't stop there though, we also generate the package and class name; therefore, our test classes stay pretty clean. The code below is an example of an entire test file (the ones that we work with).
Those methods wouldn't do much in isolation, but if you generate the surrounding (tedious) code, they work quite well. Here's the rake task that I use.
As you can see, I generate test files and which are later used as source for the test swf. It's a few extra steps, but I never see those steps, I just write my test methods and run
There are no mocking frameworks that I've seen, so that's kind of a bummer. Though, in practice I haven't found myself reaching for a mock anyway, but that will probably depend on your style. Again, I don't have much behavior in my views, so I don't need rich testing tools. Of course, I'd love to have them if they were available, but I don't find their loss to be a deal breaker. I do reach for stubs fairly often, but anonymous classes seem to work fine for that scenario.
There is one huge win that I'll conclude with. When using HTML & Javascript you have to test in browser to ensure that it works. This isn't true of Flex since there aren't browser issues. So, I can test my application using FlexUnit, and be confident that it just works. Removing the need for a (often slow and fragile) Selenium suite is almost enough to make me ditch HTML & Javascript forever.
Out of the box, it's very similar to early JUnit. Like all XUnit ports, it's based on creating testing classes and defining methods that are tests. There's setup, teardown, and all the usual features you expect from XUnit ports.
There's a FlexUnit swc (library) that assists in creating a test swf file, which runs all the tests in the browser. There's a few blog entries that give details on how to generate an XML file from the test swf, so breaking a build or reporting results is possible. I haven't bothered to go that far, but others have successfully blazed that trail.
In general, testing Flex is not great, but I rarely find myself concerned or unhappy. Here's a few reasons why.
- There's little logic in my views anyway, the current app is Flex front end to Rails resources.
- Anything that's annoying, I move out of my way by generating with some ruby scripts (for example: creating the file that lists all the test cases). -- more on this later
- Anonymous objects in flex have been 'good enough' stubs
I'm not a fan of manually adding tests to my test suite, but just like Paul Gross, we dynamically create ours. So it's not a problem. We didn't stop there though, we also generate the package and class name; therefore, our test classes stay pretty clean. The code below is an example of an entire test file (the ones that we work with).
public :void {
var smartListCriteria:* = new SmartListCriteria();
smartListCriteria.addLimitCriteria({order:"an order"});
assertEquals("an order", smartListCriteria.toXml().sort.order);
}
public :void {
var smartListCriteria:* = new SmartListCriteria();
smartListCriteria.addLimitCriteria({field:"a field"});
assertEquals("a field", smartListCriteria.toXml().sort.field);
}Those methods wouldn't do much in isolation, but if you generate the surrounding (tedious) code, they work quite well. Here's the rake task that I use.
desc "Generate test files"
task :generate_tests do
tests = Dir[File.dirname(__FILE__) + "/../../test/flex/tests/**/*Test.as"].each do |file_path|
file_path = File.expand_path(file_path)
package = File.basename(File.dirname(file_path))
emit_file_path = (File.dirname(file_path) + "/" + File.basename(file_path, ".as") + "Emit.as").gsub(/tests/,"generated_tests")
File.open(emit_file_path, "w") do |file|
file << <<-eot
package generated_tests. {
import flexunit.framework.TestCase;
import flexunit.framework.TestSuite;
import uk.co.company.utils.*;
import uk.co.company.smartLists.*;
import mx.controls.*;
public class extends TestCase {
}
}
eot
end
end
endAs you can see, I generate test files and which are later used as source for the test swf. It's a few extra steps, but I never see those steps, I just write my test methods and run
rake test:flex when I want to execute the tests. The one catch is that I can't run individual tests, which is terrible.There are no mocking frameworks that I've seen, so that's kind of a bummer. Though, in practice I haven't found myself reaching for a mock anyway, but that will probably depend on your style. Again, I don't have much behavior in my views, so I don't need rich testing tools. Of course, I'd love to have them if they were available, but I don't find their loss to be a deal breaker. I do reach for stubs fairly often, but anonymous classes seem to work fine for that scenario.
There is one huge win that I'll conclude with. When using HTML & Javascript you have to test in browser to ensure that it works. This isn't true of Flex since there aren't browser issues. So, I can test my application using FlexUnit, and be confident that it just works. Removing the need for a (often slow and fragile) Selenium suite is almost enough to make me ditch HTML & Javascript forever.
Monday, June 16, 2008
Immaturity of Developer Testing
The ThoughtWorks UK AwayDay was last Saturday. You could over-simplify it as an internal conference with some focus on technology, and extra emphasis on fun. At the last minute one of the presenters cancelled so George Malamidis, Danilo Sato, and I put together a quick session -- Immaturity of Developer Testing.
It's no secret that I'm passionate about testing. The same is true of Danilo and Georege, and several of our colleagues. We thought it would be fun to get everyone in a room, argue a bit about testing, and then bring it all together by pointing out that answers are contextual and the current solutions aren't quite as mature as they are often portrayed. To encourage everyone to speak up and increase the level of honesty we also brought a full bottle of scotch.
We put together 5 sections of content, but we only managed to make it through the first section in our time slot. I'll probably post the other 4 sections in subsequent blog posts, but this entry will focus on the high level topics from the talk and the ideas presented by the audience.
Everyone largely agreed that tests are generally treated as second class citizens. We also noted that test technical debt is rarely addressed as diligently as application technical debt is. In addition, problems with tests are often handled by creating band-aids such as your own test case subclass that hides an underlying problem, testing frameworks that run tests in parallel, etc. To be clear, running tests in parallel is a good thing. However, if you have a long running build because of underlying issues and you solve it by running the tests in parallel.. that's a band-aid, not a solution. The poorly written tests may take 10 minutes right now. If you run the tests in parallel it might take 2 minutes today, but when you are back to 10 minutes you now have ~5 times as many problematic tests. That's not a good position to be in. Don't hide problems with abstractions or hardware, tests are as important as application code.
Another (slightly controversial) topic was the goal of testing. George likes the goal of confidence. I love the confidence that tests give me, but I prefer to focus on Return On Investment (ROI). I think George and I agree in principle, but articulate it differently. We both think that as an industry we've lost a bit of focus. One hundred percent test coverage isn't a valuable goal. Instead it's important to test the code that provides the most business value. Test code must be maintained; therefore, you can't always afford to test everything. Even if you could, no automated test suite can ever replace exploratory testing. Often there are tests that are so problematic that it's not valuable to automate them.
The talk was built on the idea that context is king when talking about testing, but it quickly devolved into people advocating for their favorite frameworks or patterns. I ended up taking a side also, in an attempt to show that it's not as easy as right and wrong. I knew the point of view that some of the audience was taking, but I didn't get the impression that they were accepting the other point of view. We probably spent too much time on a few details, of course, the scotch probably had something to do with that.
I wish we could have gotten back on track, but we ended up running out of time. After the talk several people said they enjoyed it quite a bit, and a few people said they began to see the opposing points of view. I think it was a good thing overall, but it's also clear to me that some people still think there are absolute correct and incorrect answers... which is a shame.
Next up, pro and con lists for browser based testing tools, XUnit, anonymous tests, behavior driven development, and synthesized testing.
It's no secret that I'm passionate about testing. The same is true of Danilo and Georege, and several of our colleagues. We thought it would be fun to get everyone in a room, argue a bit about testing, and then bring it all together by pointing out that answers are contextual and the current solutions aren't quite as mature as they are often portrayed. To encourage everyone to speak up and increase the level of honesty we also brought a full bottle of scotch.
We put together 5 sections of content, but we only managed to make it through the first section in our time slot. I'll probably post the other 4 sections in subsequent blog posts, but this entry will focus on the high level topics from the talk and the ideas presented by the audience.
Everyone largely agreed that tests are generally treated as second class citizens. We also noted that test technical debt is rarely addressed as diligently as application technical debt is. In addition, problems with tests are often handled by creating band-aids such as your own test case subclass that hides an underlying problem, testing frameworks that run tests in parallel, etc. To be clear, running tests in parallel is a good thing. However, if you have a long running build because of underlying issues and you solve it by running the tests in parallel.. that's a band-aid, not a solution. The poorly written tests may take 10 minutes right now. If you run the tests in parallel it might take 2 minutes today, but when you are back to 10 minutes you now have ~5 times as many problematic tests. That's not a good position to be in. Don't hide problems with abstractions or hardware, tests are as important as application code.
Another (slightly controversial) topic was the goal of testing. George likes the goal of confidence. I love the confidence that tests give me, but I prefer to focus on Return On Investment (ROI). I think George and I agree in principle, but articulate it differently. We both think that as an industry we've lost a bit of focus. One hundred percent test coverage isn't a valuable goal. Instead it's important to test the code that provides the most business value. Test code must be maintained; therefore, you can't always afford to test everything. Even if you could, no automated test suite can ever replace exploratory testing. Often there are tests that are so problematic that it's not valuable to automate them.
The talk was built on the idea that context is king when talking about testing, but it quickly devolved into people advocating for their favorite frameworks or patterns. I ended up taking a side also, in an attempt to show that it's not as easy as right and wrong. I knew the point of view that some of the audience was taking, but I didn't get the impression that they were accepting the other point of view. We probably spent too much time on a few details, of course, the scotch probably had something to do with that.
I wish we could have gotten back on track, but we ended up running out of time. After the talk several people said they enjoyed it quite a bit, and a few people said they began to see the opposing points of view. I think it was a good thing overall, but it's also clear to me that some people still think there are absolute correct and incorrect answers... which is a shame.
Next up, pro and con lists for browser based testing tools, XUnit, anonymous tests, behavior driven development, and synthesized testing.
Labels: testing, testing immaturity
Tuesday, May 27, 2008
Testing: The Value of Test Names
Test names are glorified comments.
No matter what form is used a test name is a description, a required comment.
I used to think comments were a really good thing. I always tried to think about why when I was writing a comment instead of how. Having your comments describe why was a best practice.
Then, Martin Fowler and Kent Beck showed me the light.
I used to think test names were a really good thing. I always tried to think about why when I was writing a test name instead of how... I think you see where this is going.
Once I recognized that test names were comments I began to look at them differently. I wondered if all the things I disliked about comments could also be said about test names.
Just like comments, their accuracy seemed to degrade over time. What once explained why is now a distracting bit of history that no longer applies. Sure, better developers should have updated them. Whatever. The problem with test names is that they aren't executable or required. Making a mistake doesn't stop the test suite from running.
While test names can be greatly important, they are never essential to execution. There's no guard to ensure accuracy other than conscience and diligence. One persons essential comment is another persons annoying noise, which complicates things even more. Stale test names make me uneasy, but I wonder if test names have bigger problems. For example, it would be poor form to put the why in the name instead of in the code. [C]omments are there because the code is bad, could the same be said of your test names and their tests?
In fact, the entire section in Refactoring on comments may apply to test names also.
In fact, writing tests without names was fairly easy. I focused very heavily on creating a domain model that allowed me to write tests that required no explanation. Eighty percent of the time I found I could remove a test name by writing better code. Sometimes coming back to a test required a bit of time to see what was going on, but I didn't take this as a failure, instead I took the opportunity convert the code to something more maintainable.
I started sharing the idea of making test names superfluous with friends and got very mixed reviews.
John Hume asks for a test suite with examples of how I put this idea into practice. The first draft of this entry didn't include examples because I think the idea isn't tied to a particular framework, and I didn't want to give the impression that the framework was the important bit. However, I think John is right, the entry is better if I point to examples for those interested in putting the idea into practice.
The expectations.rubyforge.org testing framework is designed with anonymous tests as a core concept. I have used expectations on my last 3 client projects and all the open source projects I still work on. You can browse the following tests for plenty of examples.
John also likes code folding at the test name level, which is obviously not possible without test names. Truthfully, I don't have an answer for this. If you use code folding, you'll need to take that into account.
Pat Farley asks: So when do method names get the ax?
Method names are actually an interesting comparison. The reality is that methods are already available nameless. Blocks, procs, lambdas, closures, anonymous functions, or whatever you prefer to call them are already essentially methods (or functions) that have no name.
Ola Bini probably had the best response when it came to method names.
Dan North and I stumbled onto the idea that test names may be more or less valuable based on your workflow. I find that I never care near as much about the test names as I do about individual tests. Dan seems to use test names to get a high level understanding of how a class works. I'll probably explore this idea in a later post after I do a bit more to mature the thoughts. For now it's sufficient to say, if you rely heavily on test names for understanding, you might not be at a point where dropping them makes sense.
Mike Mason likes test names because they can act as a boundary for what the test should do. The basic rules I always followed were that a test can never contain a "and" or an "or". I think Mike is correct that there is a benefit to requiring people to stick to tests that can be explained in a short sentence. However, I like to think the same thing can be achieved with other self imposed constraints such as limiting each test to One Assertion per Test or One Expectation per Test.
Of course, now I'm suggesting one assertion per test and no test names, so if you aren't ready to adopt both, it might not be time to adopt nameless tests.
Ben Butler-Cole provided some anonymous tests written in Haskell that were quite readable. He made the observation that [T]he terser your language the more likely you will be able to make your tests sufficiently comprehensible. I actually find this to be a key in advocating anonymous tests. The test must be easily readable and maintainable, otherwise the effort necessary to work with a test would warrant a description.
A common thought (that I agree with) is that using anonymous tests is something you'll want to try only if you have a talented team. If you're new to testing or have a fair amount of junior programmers on your team, you should probably stick to conventional thinking. However, if you have a team of mostly masters and journeyman, you may find that anonymous tests result in better tests and a better domain model, which is clearly a good thing.
Thanks to Ben Butler-Cole, Chris Stevenson, Dan North, John Hume, Mike Mason, Ola Bini, Prasanna Pendse, Ricky Lui and anyone else who provided feedback on the rough draft.
# Test::Unit
test "address service recognizes post code WC1H 0HT" # Dust
it "should recognize post code WC1H 0HT" # RSpec
should "recognize post code WC1H 0HT" # ShouldaNo matter what form is used a test name is a description, a required comment.
I used to think comments were a really good thing. I always tried to think about why when I was writing a comment instead of how. Having your comments describe why was a best practice.
Then, Martin Fowler and Kent Beck showed me the light.
Don't worry, we aren't saying that people shouldn't write comments. In our olfactory analogy, comments aren't a bad smell; indeed they are a sweet smell. The reason we mention comments here is that comments often are used as a deodorant. It's surprising how often you look at thickly commented code and notice that the comments are there because the code is bad. -- Fowler & Beck, RefactoringFast forward a few years...
I used to think test names were a really good thing. I always tried to think about why when I was writing a test name instead of how... I think you see where this is going.
Once I recognized that test names were comments I began to look at them differently. I wondered if all the things I disliked about comments could also be said about test names.
Just like comments, their accuracy seemed to degrade over time. What once explained why is now a distracting bit of history that no longer applies. Sure, better developers should have updated them. Whatever. The problem with test names is that they aren't executable or required. Making a mistake doesn't stop the test suite from running.
While test names can be greatly important, they are never essential to execution. There's no guard to ensure accuracy other than conscience and diligence. One persons essential comment is another persons annoying noise, which complicates things even more. Stale test names make me uneasy, but I wonder if test names have bigger problems. For example, it would be poor form to put the why in the name instead of in the code. [C]omments are there because the code is bad, could the same be said of your test names and their tests?
In fact, the entire section in Refactoring on comments may apply to test names also.
Comments lead us to bad code that has all the rotten whiffs we've discussed in [Chapter 3 of Refactoring]. Our first action is to remove the bad smells by refactoring. When we're finished, we often find that the comments are superfluous.Comments lead us to bad code...remove the bad smells by refactoring. When we're finished, we often find that the comments are superfluous. With this in mind I began to toy with the idea that I could write tests that made test names unnecessary.
In fact, writing tests without names was fairly easy. I focused very heavily on creating a domain model that allowed me to write tests that required no explanation. Eighty percent of the time I found I could remove a test name by writing better code. Sometimes coming back to a test required a bit of time to see what was going on, but I didn't take this as a failure, instead I took the opportunity convert the code to something more maintainable.
If you need a comment to explain what a block of code does, try Extract Method. If the method is already extracted but you still need a comment to explain what it does, use Rename Method. If you need to state some rules about the required state of the system, use Introduce Assertion.To be clear, I don't think you should be extracting or renaming methods in your tests, I think test names can indicate that those things are needed in the code you are testing. Conversely, Introduce Assertion could apply directly to your test. Instead of stating your assumptions in the test name, introduce an assertion that verifies an assumption. Or better yet, break the larger test up into several smaller tests.
A good time to use a comment is when you don't know what to do. In addition to describing what is going on, comments can indicate areas in which you aren't sure. A comment is a good place to say why you did something. This kind of information helps future modifiers, especially forgetful ones.I agree with Fowler and Beck, comments are valuable when used correctly. Wrapping legacy code with a few tests may result in tests where you know the results, but you don't quite understand how they are generated. Also, higher level tests often benefit from a brief description of why the test exists. These are situations where comments are truly valuable. However, I found I could handle those situations with a comment, they didn't require a test name.
I started sharing the idea of making test names superfluous with friends and got very mixed reviews.
John Hume asks for a test suite with examples of how I put this idea into practice. The first draft of this entry didn't include examples because I think the idea isn't tied to a particular framework, and I didn't want to give the impression that the framework was the important bit. However, I think John is right, the entry is better if I point to examples for those interested in putting the idea into practice.
The expectations.rubyforge.org testing framework is designed with anonymous tests as a core concept. I have used expectations on my last 3 client projects and all the open source projects I still work on. You can browse the following tests for plenty of examples.
- Expectations tests: http://expectations.rubyforge.org/svn/trunk/test/
- Validatable unit tests: http://validatable.rubyforge.org/svn/trunk/test/unit/
- Arbs tests: http://arbs.rubyforge.org/svn/trunk/test/
John also likes code folding at the test name level, which is obviously not possible without test names. Truthfully, I don't have an answer for this. If you use code folding, you'll need to take that into account.
Pat Farley asks: So when do method names get the ax?
Method names are actually an interesting comparison. The reality is that methods are already available nameless. Blocks, procs, lambdas, closures, anonymous functions, or whatever you prefer to call them are already essentially methods (or functions) that have no name.
Ola Bini probably had the best response when it came to method names.
A method selector is always necessary to refer to the method from other places. Not necessarily a name, but some kind of selector is needed. That's not true for test names, and to a degree it's interesting that the default of having test names comes from the implementation artifact of tests usually being represented as methods. Would we still have test names in all cases if we didn't start designing a test framework based around a language that cripples usage of abstractions other than methods for executable code?Prasanna Pendse pointed out that if you use test names for anything you'll lose that ability. The only usage we could think of was creating documentation from the names (RSpec specdoc).
Dan North and I stumbled onto the idea that test names may be more or less valuable based on your workflow. I find that I never care near as much about the test names as I do about individual tests. Dan seems to use test names to get a high level understanding of how a class works. I'll probably explore this idea in a later post after I do a bit more to mature the thoughts. For now it's sufficient to say, if you rely heavily on test names for understanding, you might not be at a point where dropping them makes sense.
Mike Mason likes test names because they can act as a boundary for what the test should do. The basic rules I always followed were that a test can never contain a "and" or an "or". I think Mike is correct that there is a benefit to requiring people to stick to tests that can be explained in a short sentence. However, I like to think the same thing can be achieved with other self imposed constraints such as limiting each test to One Assertion per Test or One Expectation per Test.
Of course, now I'm suggesting one assertion per test and no test names, so if you aren't ready to adopt both, it might not be time to adopt nameless tests.
Ben Butler-Cole provided some anonymous tests written in Haskell that were quite readable. He made the observation that [T]he terser your language the more likely you will be able to make your tests sufficiently comprehensible. I actually find this to be a key in advocating anonymous tests. The test must be easily readable and maintainable, otherwise the effort necessary to work with a test would warrant a description.
A common thought (that I agree with) is that using anonymous tests is something you'll want to try only if you have a talented team. If you're new to testing or have a fair amount of junior programmers on your team, you should probably stick to conventional thinking. However, if you have a team of mostly masters and journeyman, you may find that anonymous tests result in better tests and a better domain model, which is clearly a good thing.
Thanks to Ben Butler-Cole, Chris Stevenson, Dan North, John Hume, Mike Mason, Ola Bini, Prasanna Pendse, Ricky Lui and anyone else who provided feedback on the rough draft.
Labels: comments, testing, Testing Refactorings