Update at bottom
Update 2 for Saikuro reported cyclomatic complexity
Update 3 for Flog
I despise metaprogrammed tests. The problem with metaprogrammed tests is that they introduce more questions than answers. Tests are supposed to give confidence, but I don't feel very confident when I find myself asking: Which assertion failed? What part of the test is wrong? In which loop, at what value, is the problem?
Let's jump straight to an example. The following method on Fixnum will tell you what the letter grade is.
class Fixnum
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end
50.as_letter_grade # => "F"
60.as_letter_grade # => "D"
70.as_letter_grade # => "C"
80.as_letter_grade # => "B"
90.as_letter_grade # => "A"
For completeness you may wish to test every value between 0 and 100 to ensure that no mistakes are made. Doing this in the most straightforward way possible, you would define 101 tests and test every value individually.
require 'test/unit'

class GradeTests < Test::Unit::TestCase
  def test_zero_is_an_f
    assert_equal "F", 0.as_letter_grade
  end

  def test_one_is_an_f
    assert_equal "F", 1.as_letter_grade
  end

  def test_two_is_an_f
    assert_equal "F", 2.as_letter_grade
  end

  # ...and so on, one test per value, up to 100
end
While this would work it suffers from a few complications: it's too long to digest and it would be painfully tedious to write. You might jump to the conclusion that you ought to metaprogram the tests to resolve the previously mentioned issues.
require 'test/unit'

class GradeTests < Test::Unit::TestCase
  (0..100).each do |index|
    letter = case index
             when 0..59 then "F"
             when 60..69 then "D"
             when 70..79 then "C"
             when 80..89 then "B"
             when 90..100 then "A"
             end
    define_method "test_#{index}_is_#{letter}" do
      assert_equal letter, index.as_letter_grade
    end
  end
end
This solution isn't so bad at first glance. When a test fails, I can see what number I was working with, what letter I expected and what letter I actually got.
Loaded suite /Users/jay/Desktop/foo
Started
..........................................................
..........F................................
Finished in 0.024512 seconds.
1) Failure:
test_70_is_C:32
<"C"> expected but was
<"D">.
101 tests, 101 assertions, 1 failures, 0 errors
Then I have to actually figure out what is wrong, and this is where I begin to really dislike metaprogrammed tests. The line number is almost worthless. Yes, the loop is on or near that line, but that line isn't exclusive to the failure: it also produced about 100 successful assertions. Also, I always expect the problem to be in the class, but that's not always the case. Metaprogramming in tests is just as susceptible to mistakes as programming the domain. Yet by instinct we look there last, because we expect our tests to give us confidence, so they should be correct. The example code is easy enough to follow, but most metaprogrammed tests contain more complexity, leading to even more fragile and fear-instilling tests.
Also, if you find yourself wanting to defend metaprogrammed tests, ask yourself if you usually even provide as many clues as I have. Do you create test names that help you figure out what the problem was? Do you first get the letter and then compare it, or do you assert true and false, yielding even less information? If you don't give me at least as much information as I've given myself in my example, I can't even begin to imagine trying to find out what's wrong with a broken test.
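To make the "the bug might be in the test" point concrete, here is a minimal sketch that skips test/unit entirely and just compares a test-side lookup against the domain. It makes a few assumptions that are not in the original: the grade method is reopened on Integer instead of Fixnum (newer Rubies merged Fixnum into Integer), and the names expected_letter_with_typo and scores_failing_because_of_the_test are hypothetical, invented only for this illustration.

```ruby
# Domain code from the post, reopened on Integer (assumption: a Ruby where
# Fixnum has been merged into Integer).
class Integer
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end

# Hypothetical test-side lookup, shaped like the case statement inside the
# test-defining loop, but with a deliberate typo in the "D" range.
def expected_letter_with_typo(score)
  case score
  when 0..59 then "F"
  when 60..70 then "D" # typo: should be 60..69
  when 71..79 then "C"
  when 80..89 then "B"
  when 90..100 then "A"
  end
end

# Scores where the generated expectation disagrees with the (correct) domain.
def scores_failing_because_of_the_test
  (0..100).reject { |score| expected_letter_with_typo(score) == score.as_letter_grade }
end

puts scores_failing_because_of_the_test.inspect # prints [70]
```

In this sketch the domain is right and the test is wrong, yet a generated test for 70 would fail with a message that looks exactly like a domain bug.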
The single largest problem with metaprogrammed tests is that they've unnecessarily added complexity to your test suite. This complexity reduces the maintainability of tests, ensuring that they are less likely to be maintained.
There is a better way.
You can approach the problem differently and still provide a concise solution. Looking at our issue another way, we simply want to test that certain values return A, B, C, D, or F. To me, that looks like I need 5 different tests, not 101. Here's what I consider to be a more maintainable solution.
class GradeTests < Test::Unit::TestCase
  def test_numbers_that_are_As
    assert_equal ["A"], (90..100).collect { |int| int.as_letter_grade }.uniq
  end

  def test_numbers_that_are_Bs
    assert_equal ["B"], (80..89).collect { |int| int.as_letter_grade }.uniq
  end
end
The above tests should be readable to anyone very quickly. They correctly provide the line number of a failing test when a test fails. Also, each test verifies only one piece of logic, greatly reducing complexity. Lastly, I can easily see in the test that it's written correctly, so any errors must be resulting from a mistake in the domain.
These tests instill more confidence and they are easier to digest and therefore maintain. These are tests that are more likely to live on and provide value. These are tests I thank my teammates for.
Update
Tammer Saleh correctly points out that the failure message for my last example would actually be worse than the failure message from the metaprogrammed tests. I was aware of that fact when I wrote up the entry, but I was unsure how to address the issue. If I were on a project I would write a custom assertion for expectations that would give me a descriptive error message while also allowing me to easily test what I want. That custom assertion would be well tested and could be designed to be general enough to apply across my entire test suite, thus infinitely more valuable than metaprogramming that only solves a problem for a specific test.
But, this isn't a project, it's an example. Still, I failed, I didn't give the complete answer. This is my attempt to resolve that situation. As I said, on a project I would use expectations, but for the purpose of this entry, I'll provide a custom assertion that could be easily used with test/unit.
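For reference, here is a small sketch of why the collect/uniq comparison gives a weak failure message. It assumes the grade method lives on Integer rather than Fixnum, and distinct_letters is a hypothetical helper name used only here. The second call uses a deliberately wrong range.

```ruby
# Domain code from the post, reopened on Integer (assumption: a Ruby where
# Fixnum has been merged into Integer).
class Integer
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end

# Hypothetical helper: the value the collect/uniq tests compare against.
def distinct_letters(range)
  range.collect { |int| int.as_letter_grade }.uniq
end

puts distinct_letters(80..89).inspect # prints ["B"]
puts distinct_letters(78..89).inspect # prints ["C", "B"]
```

An assert_equal against ["B"] would report only that ["C", "B"] came back, with no hint of which numbers produced the "C".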
The general solution is that I have an enumerable object and I want to verify the result of calling a method on each element of the enumerable. Thus, I should be able to create a general assertion that takes my expected single result, the enumerable, and the block that should be executed on each element. If all elements return actual results that match the expected value then the test passes. However, if any element does not return the expected value, then the expected value, the actual value, and the element are all described in the error message. The error message will contain all failures, not just the first one that fails.
Below is the code in full, but the following code would not be enough if this were a real project. Instead, if this were a real project this custom assertion should be tested with the same amount of effort that you put into testing any domain concept.
class Fixnum
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end

require 'test/unit'

class GradeTests < Test::Unit::TestCase
  def test_numbers_that_are_As
    assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
  end

  def test_numbers_that_are_Bs
    assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
  end
end

class Test::Unit::TestCase
  def assert_enumerable_only_returns(expected, enumerable, &block)
    messages = enumerable.inject([]) do |result, element|
      actual = element.instance_eval(&block)
      result << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
      result
    end
    assert_block(messages.join("\n")) { messages.empty? }
  end
end
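Since the custom assertion deserves tests of its own, one possible approach is to pull the message-building half into a plain method that can be checked without any test framework. The sketch below is an assumption about how that might look, not the code from the post: enumerable_mismatch_messages is a hypothetical name, and it yields each element to the block instead of using instance_eval.

```ruby
# Hypothetical helper: collects one message per element whose block result
# differs from the expected value. An empty array means everything matched.
def enumerable_mismatch_messages(expected, enumerable)
  enumerable.each_with_object([]) do |element, messages|
    actual = yield(element)
    messages << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
  end
end

# Every word has length 3, so no messages are produced.
all_match = enumerable_mismatch_messages(3, %w[cat dog emu]) { |word| word.length }
puts all_match.inspect # prints []

# Two words have the wrong length; both are reported, not just the first.
two_fail = enumerable_mismatch_messages(3, %w[cat horse dog zebra]) { |word| word.length }
puts two_fail
```

Because the helper works on any enumerable and any block, the same checks apply whether it is used for grades or for anything else in a suite.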
Additionally, here are the results from a failing test.
class GradeTests < Test::Unit::TestCase
  def test_numbers_that_are_Bs
    assert_enumerable_only_returns("B", 78..89) { |int| int.as_letter_grade }
  end
end
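The message that assertion builds for the 78..89 range can be reproduced outside test/unit. The sketch below reuses the same inject logic from the custom assertion, under two assumptions: the grade method is on Integer rather than Fixnum, and failure_message_for is a hypothetical name. It shows only the message string, not the surrounding test runner output.

```ruby
# Domain code from the post, reopened on Integer (assumption: a Ruby where
# Fixnum has been merged into Integer).
class Integer
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end

# Hypothetical helper mirroring the message built by the custom assertion.
def failure_message_for(expected, range)
  messages = range.inject([]) do |result, element|
    actual = element.as_letter_grade
    result << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
    result
  end
  messages.join("\n")
end

puts failure_message_for("B", 78..89)
# prints:
# <B> expected but was <C> for 78
# <B> expected but was <C> for 79
```

Unlike the collect/uniq version, the message names both offending values and what each one returned.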
I would take this solution over any metaprogrammed solution I can think of.
Update 2
I decided to check out what the cyclomatic complexity would look like for defining tests in a loop compared to traditional definitions with custom assertions. I used Saikuro to give me cyclomatic complexity results.
Interestingly, the complexity of the looping test definition (8) is more than the complexity of the logic added to Fixnum (6). It's also double the complexity of the custom assertion version (4) of the tests. The custom assertion also registers a score of 4, but that doesn't concern me since I'll test the custom assertion.
For those interested in running the experiment, the code I used can be found below. I defined a class method and called it explicitly because Saikuro reports complexity on a per-method basis, so I needed a method for it to measure.
class Fixnum
  def as_letter_grade
    case self
    when 0..59 then "F"
    when 60..69 then "D"
    when 70..79 then "C"
    when 80..89 then "B"
    when 90..100 then "A"
    end
  end
end

require 'test/unit'

class LoopingGradeTests < Test::Unit::TestCase
  def self.define_tests
    (0..100).each do |index|
      letter = case index
               when 0..59 then "F"
               when 60..69 then "D"
               when 70..79 then "C"
               when 80..89 then "B"
               when 90..100 then "A"
               end
      define_method "test_#{index}_is_#{letter}" do
        assert_equal letter, index.as_letter_grade
      end
    end
  end
  define_tests
end

class CustomAssertionGradeTests < Test::Unit::TestCase
  def test_numbers_that_are_As
    assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
  end

  def test_numbers_that_are_Bs
    assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
  end
end

class Test::Unit::TestCase
  def assert_enumerable_only_returns(expected, enumerable, &block)
    messages = enumerable.inject([]) do |result, element|
      actual = element.instance_eval(&block)
      result << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
      result
    end
    assert_block(messages.join("\n")) { messages.empty? }
  end
end
Update 3
Since I ran Saikuro on the code, it only made sense to put it through Flog also.
The following code was flogged.
class LoopingGradeTests < Test::Unit::TestCase
  def self.define_tests
    (0..100).each do |index|
      letter = case index
               when 0..59 then "F"
               when 60..69 then "D"
               when 70..79 then "C"
               when 80..89 then "B"
               when 90..100 then "A"
               end
      define_method "test_#{index}_is_#{letter}" do
        assert_equal letter, index.as_letter_grade
      end
    end
  end
  define_tests
end

class CustomAssertionGradeTests < Test::Unit::TestCase
  def test_numbers_that_are_As
    assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
  end

  def test_numbers_that_are_Bs
    assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
  end
end
The Flog score of the looping version was 15.3; the score of the custom assertion version was 6.5.
Both Saikuro and Flog marked the looping test definition with warnings and as a potential problem.