Tuesday, May 14, 2013

Clojure: Testing The Creation Of A Partial Function

I recently refactored some code that takes longs from two different sources to compute one value. The code originally stored the longs and called a function when all of the data arrived. The refactored version partials the data while it's incomplete and executes the partial'd function when all of the data is available. Below is a contrived example of what I'm taking about.

Let's pretend we need a function that will allow us to check whether or not another drink would make us legally drunk in New York City.

The code below stores the current bac and uses the value when legally-drunk? is called.

The following (passing) tests demonstrate that everything works as expected.

This code works without issue, but can also be refactored to store a partial'd function instead of the bac value. Why you would want to do such a thing is outside of the scope of this post, so we'll just assume this is a good refactoring. The code below no longer stores the bac value, and instead stores the pure-legally-drunk? function partial'd with the bac value.

Two of the three of the tests don't change; however, the test that was verifying the state is now broken.

note: The test output has been trimmed and reformatted to avoid horizontal scrolling.

In the output you can see that the test is failing as you'd expect, due to the change in what we're storing. What's broken is obvious, but there's not an obvious solution. Assuming you still want this state based test, how do you verify that you've partial'd the right function with the right value?

The solution is simple, but a bit tricky. As long as you don't find the redef too magical, the following solution allows you to easily verify the function that's being partial'd as well as the arguments.

Those tests all pass, and should provide security that the legally-drunk? and update-bac functions are sufficiently tested. The pure-legally-drunk? function still needs to be tested, but that should be easy since it's a pure function.

Would you want this kind of test? I think that becomes a matter of context and personal preference. Given the various paths through the code the following tests should provide complete coverage.

The above tests make no assumptions about the implementation - they actually pass whether you :use the 'original namespace or the 'refactored namespace. Conversely, the following tests verify each function in isolation and a few of them are very much tied to the implementation.

Both sets of tests would give me confidence that the code works as expected, so choosing which tests to use would become a matter of maintenance cost. I don't think there's anything special about these examples; I think they offer the traditional trade-offs between higher and lower level tests. A specific trade-off that stands out to me is identifying defect localization versus having to update the test when you update the code.

As I mentioned previously, the high-level-expectations work for both the 'original and the 'refactored namespaces. Being able to change the implementation without having to change the test is obviously an advantage of the high level tests. However, when things go wrong, the lower level tests provide better feedback for targeting the issue.

The following code is exactly the same as the code in refactored.clj, except it has a 1 character typo. (it's not necessary to spot the typo, the test output below will show you want it is)

The high level tests give us the following feedback.
failure in (high_level_expectations.clj:14) : expectations.high-level-expectations
  [state (atom {})]
  (update-bac 0.01)
  (legally-drunk? 0.07)))

           expected: true 
                was: false
There's not much in that failure report to point us in the right direction. The unit-level-expectations provide significantly more information, and the details that should make it immediately obvious where the typo is.
failure in (unit_level_expectations.clj:8) : expectations.unit-level-expectations
 {:legally-drunk?* [pure-legally-drunk? 0.04]}
 (with-redefs [state (atom {}) partial vector] (update-bac 0.04)))

           expected: {:legally-drunk?* [# 0.04]} 
                was: {:legally-drunk?** [# 0.04]}
           :legally-drunk?** with val [# 0.04] 
                             is in actual, but not in expected
           :legally-drunk?* with val [# 0.04] 
                            is in expected, but not in actual
The above output points us directly to the extra asterisk in update-bac that caused the failure.

Still, I couldn't honestly tell you which of the above tests that I prefer. This specific example provides a situation where I think you could convincingly argue for either set of tests. However, as the code evolved I would likely choose one path or the other based on:
  • how much 'setup' is required for always using high-level tests?
  • how hard is it to guarantee integration using primarily unit-level tests?
In our examples the high level tests require redef'ing one bit of state. If that grew to a few pieces of state and/or a large increase in the complexity of the state, then I may be forced to move towards more unit-level tests. A rule of thumb I use: If a significant amount of the code within a test is setting up the test context, there's probably a smaller function and a set of associated tests waiting to be extracted.

By definition, the unit-level tests don't test the integration of the various functions. When I'm using unit-level tests, I'll often test the various code paths at the unit level and then have a happy-path high-level test that verifies integration of the various functions. My desire to have more high-level tests increases as the integration complexity increases, and at some point it makes sense to simply convert all of the tests to high-level tests.

If you constantly re-evaluate which tests will be more appropriate and switch when necessary, you'll definitely come out ahead in the long run.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.