Thursday, March 20, 2008

Testing Anti-Pattern: Metaprogrammed Tests

Update at bottom
Update 2 for Saikuro reported cyclomatic complexity
Update 3 for Flog

I despise metaprogrammed tests. The problem with metaprogrammed tests is that they introduce more questions than answers. Tests are supposed to give confidence, but I don't feel very confident when I find myself asking: which assertion failed? what part of the test is wrong? in which loop, at what value, do you think the problem is?

Let's jump straight to an example. The following method on Fixnum will tell you what the letter grade is.

class Fixnum
def as_letter_grade
case self
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end
end
end

50.as_letter_grade # => "F"
60.as_letter_grade # => "D"
70.as_letter_grade # => "C"
80.as_letter_grade # => "B"
90.as_letter_grade # => "A"

For completeness you may wish to test every value between 0 and 100 to ensure that no mistakes are made. Doing this the most straight forward way possible, you would define 101 tests and test every value individually.

require 'test/unit'

class GradeTests < Test::Unit::TestCase
def test_zero_is_an_f
assert_equal "F", 0.as_letter_grade
end

def test_one_is_an_f
assert_equal "F", 1.as_letter_grade
end

def test_two_is_an_f
assert_equal "F", 2.as_letter_grade
end
end

While this would work it suffers from a few complications: it's too long to digest and it would be painfully tedious to write. You might jump to the conclusion that you ought to metaprogram the tests to resolve the previously mentioned issues.

require 'test/unit'

class GradeTests < Test::Unit::TestCase
(0..100).each do |index|
letter = case index
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end

define_method "test_#{index}_is_#{letter}" do
assert_equal letter, index.as_letter_grade
end
end
end

This solution isn't so bad at first glance. When a test fails, I can see what number I was working with, what letter I expected and what letter I actually got.

Loaded suite /Users/jay/Desktop/foo
Started
..........................................................
..........F................................
Finished in 0.024512 seconds.

1) Failure:
test_70_is_C:32
<"C"> expected but was
<"D">.

101 tests, 101 assertions, 1 failures, 0 errors
Then I have to actually figure out what is wrong, and this is where I begin to really dislike metaprogrammed tests. The line number is almost worthless. Yes, the loop is on or near that line, but the actual failure isn't found exclusively on that line, it also contains about 100 successful assertions. Also, I always expect the problem to be in the class, but that's not always the case. Metaprogramming in tests is just as susceptable to mistakes as programming the domain. Yet, by instinct we always look there last, because we expect our tests to give us confidence, they should be correct. The example code is easy enough to follow, but most metaprogrammed tests contain more complexity, thus leading to even more fragile and fear instilling tests.

Also, if you find yourself wanting to defend metaprogrammed tests, ask yourself if you usually even provide as many clues as I have. Do you create test names that help you figure out what the problem was? Do you first get the letter and then compare it, or do you assert true and false, yielding even less information. If you don't give me at least as much information as I've given myself in my example, I can't even begin to imagine trying to find out what's wrong with a broken test.

The single largest problem with metaprogrammed tests is that they've unnecessarily added complexity to your test suite. This complexity reduces the maintainability of tests, ensuring that they are less likely to be maintained.

There is a better way.

You can approach the problem differently and still provide a concise solution. Looking at our issue another way, we simply want to test that certain values return A, B, C, D, or F. To me, that appears like I need 5 different tests, not 101. Here's what I consider to be a more maintainable solution.

class GradeTests < Test::Unit::TestCase
def test_numbers_that_are_As
assert_equal ["A"], (90..100).collect { |int| int.as_letter_grade }.uniq
end

def test_numbers_that_are_Bs
assert_equal ["B"], (80..89).collect { |int| int.as_letter_grade }.uniq
end

# ... test the other letters
end

The above tests should be readable to anyone very quickly. They correctly provide the line number of a failing test when a test fails. Also, each test verifies only one piece of logic, greatly reducing complexity. Lastly, I can easily see in the test that it's written correctly, so any errors must be resulting from a mistake in the domain.

These tests instill more confidence and they are easier to digest and therefore maintain. These are tests that are more likely to live on and provide value. These are tests I thank my teammates for.

Update
Tammer Saleh correctly points out that the failure message for my last example would actually be worse than the failure message from the metaprogrammed tests. I was aware of that fact when I wrote up the entry, but I was unsure how to address the issue. If I were on a project I would write a custom assertion for expectations that would give me a descriptive error message while also allowing me to easily test what I want. That custom assertion would be well tested and could be designed to be general enough to apply across my entire test suite, thus infinitely more valuable than metaprogramming that only solves a problem for a specific test.

But, this isn't a project, it's an example. Still, I failed, I didn't give the complete answer. This is my attempt to resolve that situation. As I said, on a project I would use expectations, but for the purpose of this entry, I'll provide a custom assertion that could be easily used with test/unit.

The general solution is that I have an enumerable object and I want to verify the result of calling a method on each element of the enumerable. Thus, I should be able to create a general assertion that takes my expected single result, the enumerable, and the block that should be executed on each element. If all elements return actual results that match the expected value then the test passes. However, if any element does not return the expected value, then the expected value, the actual value, and the element are all described in the error message. The error message will contain all failures, not just the first one that fails.

Below is the code in full, but the following code would not be enough if this were a real project. Instead, if this were a real project this custom assertion should be tested with the same amount of effort that you put into testing any domain concept.

class Fixnum
def as_letter_grade
case self
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end
end
end

require 'test/unit'

class GradeTests < Test::Unit::TestCase
def test_numbers_that_are_As
assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
end

def test_numbers_that_are_Bs
assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
end

# ... test the other letters
end

class Test::Unit::TestCase
def assert_enumerable_only_returns(expected, enumerable, &block)
messages = enumerable.inject([]) do |result, element|
actual = element.instance_eval(&block)
result << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
result
end
assert_block(messages.join("\n")) { messages.empty? }
end
end

Additionally, here's the results from a failing test.

class GradeTests < Test::Unit::TestCase
def test_numbers_that_are_Bs
assert_enumerable_only_returns("B", 78..89) { |int| int.as_letter_grade }
end
end

# >> Loaded suite -
# >> Started
# >> F
# >> Finished in 0.00063 seconds.
# >>
# >> 1) Failure:
# >> test_numbers_that_are_Bs(GradeTests)
# >> [-:28:in `assert_enumerable_only_returns'
# >> -:17:in `test_numbers_that_are_Bs']:
# >> <B> expected but was <C> for 78
# >> <B> expected but was <C> for 79
# >>
# >> 1 tests, 1 assertions, 1 failures, 0 errors

I would take this solution over any metaprogrammed solution I can think of.

Update 2
I decided to check out what the cyclomatic complexity would look like for defining tests in a loop compared to traditional definitions with custom assertions. I used Saikuro to give me cyclomatic complexity results.

Interestingly, the complexity of the looping test definition (8) is more than the complexity of the logic added to Fixnum (6). It's also double the complexity of the custom assertion version (4) of the tests. The custom assertion also registers a score of 4, but that doesn't concern me since I'll test the custom assertion.

For those interested in running the experiment the code I used can be found below. I defined a class method and called it explicitly because Saikuro reports complexity on a method basis, so I needed a method for it measure.

class Fixnum
def as_letter_grade
case self
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end
end
end

require 'test/unit'

class LoopingGradeTests < Test::Unit::TestCase
def self.define_tests
(0..100).each do |index|
letter = case index
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end

define_method "test_#{index}_is_#{letter}" do
assert_equal letter, index.as_letter_grade
end
end
end
define_tests
end

class CustomAssertionGradeTests < Test::Unit::TestCase
def test_numbers_that_are_As
assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
end

def test_numbers_that_are_Bs
assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
end
end

class Test::Unit::TestCase
def assert_enumerable_only_returns(expected, enumerable, &block)
messages = enumerable.inject([]) do |result, element|
actual = element.instance_eval(&block)
result << "<#{expected}> expected but was <#{actual}> for #{element}" if expected != actual
result
end
assert_block(messages.join("\n")) { messages.empty? }
end
end


Update 3
Since I ran Saikuro on the code, it only made sense to put it through Flog also.

The following code was flogged.

class LoopingGradeTests < Test::Unit::TestCase
def self.define_tests
(0..100).each do |index|
letter = case index
when 0..59 then "F"
when 60..69 then "D"
when 70..79 then "C"
when 80..89 then "B"
when 90..100 then "A"
end

define_method "test_#{index}_is_#{letter}" do
assert_equal letter, index.as_letter_grade
end
end
end
define_tests
end

class CustomAssertionGradeTests < Test::Unit::TestCase
def test_numbers_that_are_As
assert_enumerable_only_returns("A", 90..100) { |int| int.as_letter_grade }
end

def test_numbers_that_are_Bs
assert_enumerable_only_returns("B", 80..89) { |int| int.as_letter_grade }
end
end


The flog score of the looping version was 15.3, the score of the custom assertion version was 6.5.

Both Saikuro and Flog marked the looping test definition with warnings and as a potential problem.

26 comments:

  1. Anonymous3:56 PM

    There seems to be some redundency there.

    assert.equal ["A"], (90..100).grade
    assert.equal ["B"], (80..89).grade

    ...

    ReplyDelete
  2. Anonymous5:27 PM

    class GradeTests < Test::Unit::TestCase
    # ... other previous tests

    def test_all_grades
    assert_equal ["A"], (90..100).grade
    assert_equal ["B"], (80..89).grade
    # ... test the other letters
    end
    end

    class Range
    def grade
    self.collect { |int|
    int.as_letter_grade }.uniq
    end
    end

    ReplyDelete
  3. Anonymous8:48 PM

    When the last version of those tests fail, you get even less information than the metaprogramming version:

    ["A", "X"] is not equal to ["A"]

    You don't know what number caused the failure, or how many times the failure occurs.

    "The single largest problem with metaprogrammed tests is that they've unnecessarily added complexity to your test suite. This complexity reduces the maintainability of tests, ensuring that they are less likely to be maintained."

    I've worked on projects that had 2000 line test files. No one wanted to touch these tests, or the code that they tested. Using a single "metaprogramming" loop, were reduced one of these files to 96 lines.

    The term "metaprogramming" is almost misleading in this context. The examples here are simple loops in the class level - something every ruby programmer should be comfortable reading.

    ReplyDelete
  4. Anonymous9:12 PM

    "When the last version of those tests fail, you get even less information"

    Agreed. I wouldn't have used that test either, but it was the next logical example. I would use expectations.rubyforge.org instead, but I think if I go down that road with the example it will take away from the point.

    "I've worked on projects that had 2000 line test files. No one wanted to touch these tests, or the code that they tested."

    I'm pretty sure several of us have been in this situation. I've also been on projects where the 2000 line test turned into a 96 line test that everyone was afraid of.

    The best solution is to create tests that are as simple as possible. Tests that are complex may also in turn need to be tested. This is not a good scenario. In fact, I wonder what the cyclomatic complexity on a metaprogrammed test would come out to.

    'The term "metaprogramming" is almost misleading in this context.'

    I disagree, writing code that defines more code is pretty much exactly the definition of metaprogramming.

    I think 96 lines of metaprogrammed tests are better than 2000 lines of tests, but I think 96 lines of straightforward tests are even better.

    Thanks for the comment, but I simply don't agree.

    Cheers, Jay

    ReplyDelete
  5. Anonymous10:29 PM

    Tammer,
    Thanks again for the comment. I've updated the article to show the next logical step in creating more maintainable tests.

    Cheers, Jay

    ReplyDelete
  6. Metaprogramming + readable stacktrace = generate the test, write it to file, them require it for execution. Problem solved?

    ReplyDelete
  7. I'm extremely wary of writing similar-looking logic in the test and in the code -- because if I'm likely to screw one up, I'm likely to hose the other as well. Of the two examples you gave, I prefer the latter for the simple reason than that the code under test uses a case statement, and the test does not.

    That being said, in this case I might actually prefer writing a 100-line test method. I'm *much* more tolerant of duplication in tests than in code under test -- especially in a case like this where each test really is one line, and each line can be aligned neatly with the ones above and below it (column edit mode in TextMate is a wonderful thing).

    ReplyDelete
  8. Anonymous8:10 AM

    Evan, assuming the generated tests aren't so many that they become unmaintainable, then yes, problem solved. However, if you generate 2000 tests that I can't easily digest, then I think I'd still have a problem.

    Sam, I completely agree. Like I do with all my tests, I'd look for the solution that gave me the most reliable, readable, and maintainable solution possible. Sometimes that means 100 similar tests, sometimes it means custom assertions.

    Cheers, Jay

    ReplyDelete
  9. Anonymous9:29 AM

    I'm confused. If you name your test correctly, and use good messages, how is the metaprogrammed version different from the "helper" version? You've given the helper the benefit of nicely-written messages, but not the metaprogrammed version. I can understand not liking them, but purposely writing them poorly to attack them is disingenuous.

    Look at the information you get from the failed assertion from the metaprogrammed test. You know you were testing 70 and you expected it to be C, but it was D.

    Now look at the information from the helper-ized test. You know you were testing 78 and you expected it to be a B, but it was a C.

    They both give you the same information (and would be equally well-presented if you bothered to write the test name and message well). The difference is that the helper has way more (and IMHO harder to read) code involved. For what?

    You say the line number is worthless. If it's worthless for a metaprogrammed test, it's worthless for any test, as it tells you what assertion failed, and that's all it's ever told you. The message supplied by the test should tell you why it failed and the name of the test method should tell you what it was supposed to do.

    You say you have to figure out what's wrong. I ask what is so magical about the helper that makes it so obvious what's wrong. They're both giving you the same information about the state of the problem.

    You say that metaprogramming is just as susceptible to mistake as the code it's testing. This is, of course, true. But so is your helper. So are you basic everyday tests.

    Now, don't get me wrong, I'm as crazy about the maintainability and readability of tests as the next guy. It's of vital importance that tests be useful to everyone and so choosing the right tool for the job is paramount. But seeing someone discount a powerful technique out of hand is disheartening, especially when it looks like it's being discounted based on flaws assumptions and implementation.

    ReplyDelete
  10. Anonymous9:58 AM

    Hi jyurek,

    I could pass in a message to assert_equal that was nicer than the default one, but that would add even more code, which makes the test that much more painful to digest. Sure, it looks obvious in the example, but it's one more thing to understand when you come to a test and don't have the benefit of context. If I were to metaprogram the tests, I wouldn't create a nice message, I'd name it nicely, as the example shows.

    But, I agree with you, the test name and default message are enough to know which number failed, what was expected and what was actually returned. It's not the lack of information that is problematic with metaprogrammed tests.

    Line numbers are not worthless for any test. In fact they are valuable for all tests, but they are considerably less valuable when they point to a loop and I need to mentally construct the state of all the variables given the current loop. Again, this isn't so hard in the example, but it's a pain when you come to a loop where you don't have context. Messages and the names are enough to figure out what the problem is, but clicking a link that takes me to a line where a failing test exists is much better. Metaprogrammed tests give me the the link also, but that link is to a failing test and an undetermined number of passing tests. This is unquestionably more complicated.

    "You say you have to figure out what's wrong. I ask what is so magical about the helper that makes it so obvious what's wrong."

    Custom assertions are "magical" because they are tested. They instill confidence because they are proven to work as expected. The same is not true of metaprogrammed tests. Unless you are testing that your metaprogramming is correct via additional tests, then they do not provide the same level of confidence. Because they are tested they allow me to focus on finding the problem in the domain, not wonder if the problem is in the domain or the test itself.

    "You say that metaprogramming is just as susceptible to mistake as the code it's testing. This is, of course, true. But so is your helper."

    Absolutely, which is why my helper is tested, but metaprogrammed tests are not. Untested complexity is always a bad thing.

    There's no assumptions or missing implementation. I'm happy to hear of what you'd consider to be a better implementation, but I'm skeptical that you'll be able to create one that gives me more confidence than a well tested custom assertion.

    Of course, we never even touched on the fact that a well tested custom assertion not only provides more confidence, but can be reused anywhere within a test suite, while a metaprogrammed solution is tied strictly to the tests that it defines. So, with no context for the currently failing test, which do you think is easier to understand, a custom assertion used multiple times within your test suite that you may or may not have seen before, or a one time only loop defining a group of metaprogrammed tests? Even if you haven't seen the custom assertion before, if it's used in various places in the codebase, chances are your teammates will be intimately familiar with it.

    Sure, the answer assumes that the custom assertion is used multiple times in a codebase, but that's part of what designing a custom assertion is all about. Create something general that can be tested and resused for benefit in several places.

    ReplyDelete
  11. Anonymous9:59 AM

    I have to say that I find the helper version the hardest of the three to understand.

    ReplyDelete
  12. Anonymous10:14 AM

    Hi Tammer,
    That's the great thing about custom assertions, you can define your own that you understand or you can use one you don't understand with confidence that it works since it's been tested.

    Ignoring the implementation of assert_enumerable_only_returns, if I told you it was a method that took an expected value, an enumerable and a block and it executed the block once for each element of the enumerable and expected the return value to equal the expected value, I think you'd have no problem understanding how to use it. Since I wouldn't have a custom assertion without testing it well, you wouldn't need to understand the implementation, you could simply use it confidently, just as you do with assert_equal and friends.

    Or, if you wanted to dig into the guts of test/unit like I did, you could create a custom assertion with a better name, that takes better parameters, or whatever is more expressive to you. Then I could use that assertion without needing to understand the implementation.

    Either way, we'd both benefit from gaining functionality without needing to understand the implementation of the custom assertion or a group of metaprogrammed tests.

    ReplyDelete
  13. Anonymous11:51 AM

    "Also, if you find yourself wanting to defend metaprogrammed tests, ask yourself if you usually even provide as many clues as I have."

    More, actually.

    I think it's a bit extreme to call metaprogrammed tests an anti-pattern. I think you've gotten one principle correct, which is that tests should be understandable and maintainable by those who need to read them, but the jump to the title of this post is a non sequitur, IMO.

    Some instances of metaprogramming are more opaque than others, and some teams/developers are more adept with various programming constructs than others. For any non-trivial construct, I'd guess you can find some developers somewhere who find it to be difficult to comprehend.

    As with most such issues, the middle way is best. Understand your team -- which includes anyone who might need to read and understand the code (tests or otherwise) throughout the lifetime of the codebase. Strive to improve quality, readability, maintainability, etc. in that context.

    Sometimes that means using metaprogramming and sometimes it doesn't. I have to confess that in this particular post, I didn't see anything at all about the metaprogramming that would be considered non-trivial by MY current development team, although it couldn't certainly be viewed differently by other teams.

    ReplyDelete
  14. Anonymous11:54 AM

    "[...] although it couldn't certainly be viewed differently by other teams."

    Of course I meant "could certainly be viewed differently by other teams."

    ReplyDelete
  15. Anonymous12:22 PM

    Hi jeffmo,
    I think I have to agree, the title does need to be changed. I shouldn't say metaprogrammed at all, metaprogramming isn't the problem. Next time I talk about this it will be under the title: Tests defined in loops. And, yes, I will still call it an anti-pattern.

    I think your comment is well thought out and I appreciate it, but I think you missed two points.

    It's not a matter of whether or not I can understand the loop, it's whether or not I should need to understand the loop. If you define a group of tests in a loop, you've given me no choice but to dissect the test and figure out what the problem is. Conversely, if you give me a custom assertion that I know the intention of, I don't need to care what the implementation is, I can get straight to fixing the domain issue, that makes me more efficient.

    Also, even in the example where the loop and case statement are entirely trivial, it's possible to make a simple mistake. Even worse, it's possible to make the same mistake in both case statements and get a false positive (as Sam pointed out), but I digress. The point is, a loop, especially one where the values change, is much more likely to contain a mistake than a simple test that utilizes a custom assertion. Since I cannot easily test (nor would I want to test) the loop and variable changes, it's significantly better to use a custom assertion which can and should be thoroughly tested.

    Yes, I want maintainable and understandable tests, but understandable doesn't mean I need to see the implementation. If that were the case we would define everything in one long method. Often abstracting the complexity goes a long way towards readability. In this case, I abstract the complexity to a custom assertion and offer my teammates a assertion where they need to know the intention, but they don't need to know the implementation to be effective, thus increasing readability.

    Also, readable and maintainable aren't enough. Reliable is just as important, if not more. Of course, that's why I have tests for custom assertions.

    ReplyDelete
  16. Anonymous3:59 PM

    I suppose I don't see a dichotomy of picking either custom assertions OR looping/metaprogramming.

    (In fact, we use all of these techniques quite a bit, on my team.)

    Perhaps I see the difference between your team and my team illustrated in this quote:

    "In this case, I abstract the complexity to a custom assertion and offer my teammates a assertion where they need to know the intention, but they don't need to know the implementation to be effective, thus increasing readability."

    I wouldn't be able to assert that my teammates won't need to know the implementation. I would only assert that they won't need to know the implementation unless or until there is a problem in the custom assertion (or in the looping/metaprogramming, for that matter). Then, anyone on the team would be welcome (and expected) to address the issue.

    When making such a decision (between looping or local metaprogramming in a test vs. a custom assertion), I look at factors like:

    How difficult would each approach be to implement correctly?

    How difficult would each approach be for others to read and comprehend?

    Am I expecting re-use of the logic (leans toward custom assertion, even if that might take a little longer up-front) or is this truly a one-off bit of logic (leans toward looping or metaprogramming)?

    (There are certainly other factors, but I guess I'll save those for my own blog posts. :-)

    "Also, readable and maintainable aren't enough."

    Agreed. That's precisely why I wrote "quality, readability, maintainability, etc."

    Thanks for the comments, btw, and the original post. It's always nice to see how other smart devs are thinking about an issue.

    ReplyDelete
  17. Anonymous4:32 PM

    Hi Jeffmo,

    It's not a choice I make on a team by team basis, I would always write a simple test, or a few simple duplicated tests, or a custom assertion, which ever was the best fit, regardless of the team.

    I think the difference between us is that you don't mind having to understand a more complicated test because you can, and you don't mind writing a more complicated test because you expect your teammates to be able to understand it.

    I also expect my teammates to be able to understand it, I just don't want them to have to. I want them to focus on their next card, or another bug in the domain. I don't need them worrying about how I implemented the custom assertion, I want them to effortlessly see what the test is testing and then fix the domain. I also don't ever want them to find a test that is written incorrectly. False positives might be one of my single biggest dislikes. Testing my custom assertions guards against that.

    Sorry for the misquote, I totally missed your reference to quality.

    I also like your thought process. You've clearly put a lot of effort into considering the problem. I expect it wont be long before you completely agree with me. ;)

    And, thanks for your comments as well. They will help write it even better next time I put this content somewhere else.

    Cheers, Jay

    ReplyDelete
  18. Anonymous4:53 PM

    "I think the difference between us is that you don't mind having to understand a more complicated test because you can, and you don't mind writing a more complicated test because you expect your teammates to be able to understand it."

    Almost, but not quite.

    Some situations are simply not complicated, so I wouldn't add complexity for no reason.

    Some are complicated, and I can abstract away some of the complexity in multiple different ways.

    Looping over define_method calls is one of those ways, and yes, I would expect my team members to be able to read that kind of code.

    Writing a reusable custom assertion and calling it multiple times is another way. (I would also expect my team members to be able to read that kind of code.)

    From my POV, it seems that where we differ is that you see no (very few?) cases where looping in tests is useful. I'll give it some more thought; if I can give you a more concrete example (that I'm not restricted from sharing in a public forum) of where I find your disfavored approach to be superior, I may be back!

    Incidentally, I don't know if anyone else has experienced this, but my browser seems to have some struggles rendering SOME of the underscores in your code samples. I haven't spent a lot of time tracking that down, but it is repeatable. At the moment, I'm reading in FF/Win 2.0.0.12, and I don't see the same problem in Firefox running under Ubuntu.

    ReplyDelete
  19. Anonymous7:54 PM

    I understand many of your objections to metaprogrammed tests... still, I've encountered a couple of situations where they were a real productivity booster.

    In one case we were generating a whole mess of C++ classes from XML descriptions. The C++ code was compiled into a daemon process that we communicated with over XMLRPC. So I created some metaprogrammed tests from the same XML descriptions and we were up and testing within a matter of hours instead of days. Yes, there was always the risk that we wouldn't catch something that was defined incorrectly in the original XML description, but we had other checks for that. We were, however, very easily able to test that our C++ code generation was working correctly and that we were able to communicate over the XMLRPC channel. It was very valuable having this test as we made changes our C++ generation code.

    ReplyDelete
  20. Jay,
    You don't need to maintain the test: just the test generator. The test is just an artifact to let you diagnose the problems. For cases like this range-based construct that you're using, it should be trivial to generate methods that perform individual tests.

    ReplyDelete
  21. Anonymous10:23 AM

    Evan,
    I've never gone that road before. Thanks for the thought, I'll give it try in the future and let you know how it works out for me.
    Cheers, Jay

    ReplyDelete
  22. I know this is a bit off topic, but do you really think it is good practice to add the as_letter_grade method to the Fixnum class? Just because in ruby you CAN, doesnt mean you SHOULD.

    IMHO this is a really bad practice - the implication is that ALL integers are grades, which is clearly untrue. It also implies that you can do all the mathematical operations on grades that you can do on other integers - e.g. a grade F (45) + a grade F (45) = a grade A (90). What happens when the integer has a negative value, or a value > 100?

    It would be much better to either have a stand-alone function to do the conversion, or better still a dedicated Grade class, that only lets you do grade-related operations on it.

    ReplyDelete
  23. Anonymous9:14 AM

    Dave,
    I'd say every language I've ever worked with give me plenty of things I can do, but shouldn't. I just want to point out that ruby isn't special in any way when it comes to making bad technical decisions.

    As far as putting behavior on Fixnum, it would depend on the context. If I were writing a script that read an excel spreadsheet and converted all the numeric grades to alphabetic grades, I would absolutely put the behavior on Fixnum. If I were writing a more complicated application I would have to decide where the behavior fit.

    I think it's rarely the case that something is universally a good or bad idea. Even the topic of this entry, I consider it to be an anti-pattern, but that doesn't mean there isn't a case where it makes the most sense.

    Thanks for the comment.

    Cheers, Jay

    ReplyDelete
  24. "As far as putting behavior on Fixnum, it would depend on the context. If Iwere writing a script that read an excel spreadsheet and converted all thenumeric grades to alphabetic grades, I would absolutely put the behavior onFixnum. If I were writing a more complicated application I would have todecide where the behavior fit."

    Even for your excel script I don't see any advantage to adding a method to Fixnum (or any other class) over creating a stand-alone function that takes a number and turns it into a grade. For a start your method will break if your excel sheet has a grade score of 73.5 - not all grade scores are integers, any more than all integers are grade scores.

    I think that many Ruby programmers suffer from the Java/C# mindset of "all functionality has to be implemented as a method on a class". Some things are better expressed as pure functions.

    ReplyDelete
  25. Anonymous11:55 AM

    Dave,
    I think negative numbers, numbers with decimals and numbers greater than 100 are all straw man arguments since those cases will need to be handled whether you add them to an object or create a function. They are purposefully left out of the example because they don't convey anything important to the current discussion.

    As far as whether it should be on an object or simply be a function, again it depends on the context.

    As you already said, a Grade object might be necessary. If it's not, I would see where I need the letter grade and put it there. Whether that's an object I create, Fixnum, Integer, Numeric or even Object will depend on my need. And, yes, if it made sense, I'd just create a lambda and pass that around for use.

    ReplyDelete
  26. Jay,

    Giles talk at MWRC 2008 covers a similar topic: generating code via monkey-patching for various analytical purposes. I merely suggest using the same thing to support testing -- although to actually use the generated test for execution and reference person.

    Your "test" would actually (1) generate the test, (2) write it to file, and (3) run it.

    ReplyDelete

Note: Only a member of this blog may post a comment.