Monday, January 27, 2014

REPL Driven Development

When I describe my current workflow I use the TLA RDD, which is short for REPL Driven Development. I've been using REPL Driven Development for all of my production work for a while now, and I find it to be the most effective workflow I've ever used. RDD differs greatly from any workflow I've used in the past, and (despite my belief that it's superior) I've often had trouble concisely describing what makes it so productive. This entry is an attempt to describe what I consider RDD to be, and to demonstrate why I find it the most effective way to work.

RDD Cycle

First, I'd like to address the TLA RDD. I use the term RDD because I'm relying on the REPL to drive my development. More specifically, when I'm developing, I create an s-expression that I believe will solve the problem at hand. Once I'm satisfied with my s-expression, I send it to the REPL for immediate evaluation. The result can either be a value that I manually inspect, or a change to a running application. Either way, I'll look at the result, determine if the problem is solved, and repeat the process of crafting an s-expression, sending it to the REPL, and evaluating the result.
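To make the cycle concrete, here's a trivial sketch (the data is invented for illustration). I write a form in my source buffer, send it to the REPL, and inspect the value:

;; craft an s-expression in the source buffer...
(->> [{:name "a" :total 10} {:name "b" :total 25}]
     (filter #(> (:total %) 15))
     (map :name))
;; ...send it to the REPL for immediate evaluation => ("b")
;; not the result I wanted? tweak the s-expression and send it again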

If that isn't clear, hopefully the video below demonstrates what I'm talking about.

[video: a running application being changed and verified via the REPL]

If you're unfamiliar with RDD, the previous video might leave you wondering: What's so impressive about RDD? To answer that question, I think it's worth making explicit what the video shows: a running application that needs to change, a change taking place, and verification that the application runs as desired. The video demonstrates change and verification; what makes RDD so effective to me is what's missing: (a) restarting the application, (b) running something other than the application to verify behavior, and (c) moving out of the source to execute arbitrary code. Eliminating those 3 steps allows me to focus on what's important: writing and running code that will be executed in production.

Feedback

I've found that, while writing software, getting feedback is the single largest time thief. Specifically, there are two types of feedback that I want to get as quickly as possible: (1) Is my application doing what I believe it is? (2) What does this arbitrary code return when executed? I believe the above video demonstrates how RDD can significantly reduce the time needed to answer both of those questions.

In my career I've spent significant time writing applications in C#, Ruby, & Java. While working in C# and Java, if I wanted to make and verify (in the application) any non-trivial change, I would need to stop the application, rebuild/recompile, & restart it. I found the slowness of this feedback loop unacceptable, and wholeheartedly embraced tools such as NUnit and JUnit.

I've never been as enamored with TDD as some of my peers; regardless, I absolutely endorsed it. The Design aspect of TDD was never that enticing to me, but tests did allow me to get feedback at a significantly superior pace. Tests also provide another benefit while working with C# & Java: they're the poorest man's REPL. Need to execute some arbitrary code? Write a test that you know you're going to immediately delete, and execute away. Of course, tests have other pros and cons. At this moment I'm limiting my discussion of tests to the context of rapid feedback, but I'll address TDD & RDD later in this entry.

Ruby provided a more effective workflow (technically, Rails provided a more effective workflow). The Rails applications I worked on were similar to my RDD experience: I was able to make changes to a running application, refresh a webpage, and see the result of the new behavior. Ruby also provided a REPL, but I always ran the REPL external to my editor (I knew of no other option). This workflow was the closest, in terms of efficiency, I've ever come to what I have with RDD; however, there are some minor differences that add up to an inferior experience: (a) having to switch out of a source file to execute arbitrary code is an unnecessary nuisance, and (b) refreshing a webpage destroys any client-side state that you've built up. I have no idea if Ruby now has editor & REPL integration; if it does, then it's likely on par with the experience I have now.

Semantics

  • It's important to distinguish between two meanings of "REPL" - one is a window that you type forms into for immediate evaluation; the other is the process that sits behind it and which you can interact with from not only REPL windows but also from editor windows, debugger windows, the program's user interface, etc.
  • It's important to distinguish between REPL-based development and REPL-driven development:
    • REPL-based development doesn't impose an order on what you do. It can be used with TDD or without TDD. It can be used with top-down, bottom-up, outside-in and inside-out approaches, and mixtures of them.
    • REPL-driven development seems to be about "noodling in the REPL window" and later moving things across to editor buffers (and so source files) as and when you are happy with things. I think it's fair to say that this is REPL-based development using a series of mini-spikes. I think people are using this with a bottom-up approach, but I suspect it can be used with other approaches too.
-- Simon Katz
I like Simon's description, but I don't believe we need to break things down into two different TLAs. Quite simply (and sadly), I don't think enough people are developing in this way, and the additional specification causes a bit of confusion among people who aren't familiar with RDD. However, Simon's description is so spot on that I felt the need to explain why I'm choosing to ignore his classifications.

RDD & TDD

RDD and TDD are not in direct conflict with each other. As Simon notes above, you can do TDD backed by a REPL. Many popular testing frameworks have editor-specific libraries that provide immediate feedback through REPL interaction.

When working on a feature, the short-term goal is to have it working in the application as fast as possible. Arbitrary execution, live changes, and only writing what you need are 3 things that help you complete that short-term goal as fast as possible. The video above is the best example I have of how you go from a feature request to software that does what you want in the smallest amount of time. In the video, I only leave the buffer to verify that the application works as intended. If the short-term goal were the only goal, RDD without writing tests would likely be the solution. However, we all know that there are many other goals in software. Good design is obviously important. If you think tests give you better design, then you should probably mix TDD & RDD. Preventing regression is also important, and that can be accomplished by writing tests after you have a working feature that you're satisfied with. Regression tests are great for giving confidence that a feature works as intended and will continue to in the future.

REPL Driven Development doesn't need to replace your current workflow; it can also be used to extend your existing TDD workflow.

Tuesday, June 11, 2013

Coding: Increase Your Reading and Writing Speed

A teammate of mine recently expressed a desire for a shortcut for something we type often. I started looking into our shortcut options and came to a common determination: we can do this, but the number of two-key shortcuts available to us is finite, so we'd better use them wisely.

I wrote the following unix to give me a rough idea of what we type frequently.
find . -name "*.clj" | xargs cat | tr -s '[:space:]:#()[]{}\"' '\n' | sort | uniq -c | sort -n
note: If you're not writing clojure you'll want to look for something other than .clj files, and you might also want to tweak what you replace with a new line.

The above unix gave me an ordered list of the most typed 'words' across all of my codebases. At this point I had some science for setting up some shortcuts.

Writing

You'll want to look into whatever editor/ide you use and see if it offers key shortcuts and snippet expansion. My editor is emacs; I assigned some key-chords and some yasnippets. If you're not using emacs, whatever you are using should have something similar.

While I wanted to define some shortcuts, I also didn't want to create so many that I was constantly wasting time looking up what I'd created. Based on that desire I created:
  • 2 shortcuts (key-chords) for two of the most duplicated words. The shortcuts are concise by design, but that makes them a bit harder to remember. You can probably get started with more than 2, but I didn't see much harm in starting there.
  • a dozen snippets for the next most used words. These snippets are descriptive enough to easily remember, thus I felt comfortable defining several of them. e.g. pps expands to (println (pr-str )).
Having shortcuts and snippets will obviously make me more productive, and the unix helped me figure out which words were the most important to optimize for.
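For instance, the pps snippet and a couple of key-chords can be defined with emacs-lisp along these lines (the chord letters and chord expansions here are illustrative, not the ones I actually chose):

(require 'key-chord)
(require 'yasnippet)
(key-chord-mode 1)
;; key-chords: two of the most duplicated words, bound to quick two-key chords
(key-chord-define-global "qq" "defn ")      ; hypothetical chord
(key-chord-define-global "jj" "partial ")   ; hypothetical chord
;; snippets: descriptive, easy-to-remember expansions
(yas-define-snippets 'clojure-mode
  '(("pps" "(println (pr-str $0))" "println-pr-str")))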

Reading

Most editors/ides also give you a summary view for common code patterns. For example, IntelliJ displays lambdas even when the underlying code is actually an anonymous class. Emacs gives you font-lock for turning patterns into individual characters. Armed with my list of the most common words in my codebase, I created font-locks for the 11 most duplicated.

Sometimes a picture is worth a thousand words. Below is a function definition without any font locks applied.
[image: a function definition without any font locks applied]

The following image is what the same function looks like with custom font locks applied. It might be a bit jarring at first, but in the long run it should add up to many small victories, such as quickly identifying patterns in code and fewer line breaks.
[image: the same function definition with custom font locks applied]

Results

I've been working with these settings for a little over a week now. I haven't needed to look up anything I defined, and I get a little burst of satisfaction when I read or write something faster than I'd been able to in the past. I'd definitely recommend doing something similar with your codebase and ide/editor.

Thursday, May 16, 2013

Clojure: Combining Calls To Doseq And Let

If you've ever looked at the docs for clojure's for macro, then you probably know about the :let, :when, and :while modifiers. What you may not know is that those same modifiers are available in doseq.
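If you haven't seen the modifiers in doseq, here's a quick sketch:

;; :while, :when, and :let behave just as they do in for
(doseq [n (range 10)
        :while (< n 5)
        :when (odd? n)
        :let [square (* n n)]]
  (println n square))
;; prints: 1 1, then 3 9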

I was recently working with some code that had the following form.
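Roughly this shape, with stand-in names:

(def id->sub-ids {:id-1 [:sub-a :sub-b]
                  :id-2 [:sub-c]})

(doseq [[id sub-ids] id->sub-ids]
  (let [id-name (name id)]            ; name is called once per id
    (doseq [sub-id sub-ids]
      (println id-name sub-id))))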


Upon seeing this code, John Hume asked if I preferred it to a single doseq with multiple bindings. He sent over something similar to the following example.
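Along these lines, using the same stand-in names:

(doseq [[id sub-ids] id->sub-ids
        sub-id sub-ids]
  (println (name id) sub-id))         ; name is now called once per sub-id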


That was actually the first time I'd seen multiple bindings in a doseq, and my immediate reaction was that I preferred the explicit simplicity of having multiple doseqs. However, I always have a preference for concise code, and I forced myself to start using multiple bindings instead of multiple doseqs - and, unsurprisingly, I now prefer multiple bindings to multiple doseqs.

You might have noticed that the second version of the code slightly changes what's actually being done. In the original version the 'name' function is called once per 'id', and in the second version the 'name' function is called once per 'sub-id'. Calling name significantly more often isn't likely to have much impact on your program; however, if you were calling a more expensive function this change could have a negative impact. Luckily, (as I previously mentioned) doseq also provides support for :let.

The second example can be evolved to the following code - which also demonstrates that the let is only evaluated once per id, not once per sub-id.
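A sketch of that evolution; the println in the :let makes the evaluation count visible:

(doseq [[id sub-ids] id->sub-ids
        :let [id-name (do (println "calling name for" id)
                          (name id))]
        sub-id sub-ids]
  (println id-name sub-id))
;; "calling name for" prints once per id, not once per sub-id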


That's really the final version of the original code, but you can alter it slightly for experimentation purposes if you'd like. Let's assume we're calling an expensive function in an additional let; it would be nice if that call only occurred when an iteration was actually going to happen. It turns out that's exactly what happens.
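A sketch of that experiment - note the empty vector for :id-2, and that expensive is never called for it:

(def id->sub-ids {:id-1 [:sub-a :sub-b]
                  :id-2 []})                ; no sub-ids for :id-2

(defn expensive [sub-id]
  (println "expensive called for" sub-id)
  sub-id)

(doseq [[id sub-ids] id->sub-ids
        :let [id-name (name id)]
        sub-id sub-ids
        :let [result (expensive sub-id)]]   ; only evaluated when there's a sub-id
  (println id-name result))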


Whether you prefer multiple bindings or multiple doseqs, it's probably a good idea to get comfortable reading both.

Wednesday, May 15, 2013

Emacs Lisp: Font Lock for Clojure's Partial

I love using partial, but I dislike the length of the function name. There's a simple solution: define another function with a shorter name that simply calls (or is) partial. This is exactly what I did in the jry library.
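jry's shorter name was % (as discussed below), so the definition amounts to little more than:

(def % partial)   ; jry's actual definition may differ slightly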

I liked the use of %: partial feels similar to creating a function using #(), and % has a special meaning inside #(), so I thought they tied well together. Unfortunately, there's an obvious problem: things would be very broken if you tried to use the '%' function inside an anonymous function defined with #(). Somewhere along the way this issue caused me to stop using jry/%.

Using partial is great: it's part of the standard lib, and I don't need to explain it to anyone who joins my team or any future maintainers of the code I write. Still, I want something shorter, and I've always had a background thread looking for another shorter-than-partial solution. While recently contributing to emacs-live I found the solution I was looking for: clojure-mode font lock.

The following code can now be found in my emacs configuration.
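It's along these lines - a font-lock entry that composes the word partial into a single glyph (the regexp and the Ƥ glyph here are a reconstruction of the idea, not a verbatim copy):

(font-lock-add-keywords
 'clojure-mode
 '(("(\\(partial\\)[[:space:]]"
    (0 (progn (compose-region (match-beginning 1) (match-end 1) "Ƥ")
              nil)))))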

This solution feels like the best of both worlds. My code still uses the function from the standard library, my colleagues still see a function they already know, and 'partial' only takes up one character space in my buffer. The image below is what you'll see if you put the above emacs-lisp in your config.
[image: code using partial, displayed with the single-character font lock]

Tuesday, May 14, 2013

Clojure: Testing The Creation Of A Partial Function

I recently refactored some code that takes longs from two different sources to compute one value. The code originally stored the longs and called a function when all of the data had arrived. The refactored version partials the data while it's incomplete and executes the partial'd function when all of the data is available. Below is a contrived example of what I'm talking about.

Let's pretend we need a function that will allow us to check whether or not another drink would make us legally drunk in New York City.

The code below stores the current bac and uses the value when legally-drunk? is called.
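A sketch of that original version (update-bac, legally-drunk?, and state appear in the test output later; the 0.08 limit and the :bac key are illustrative):

(def state (atom {}))

(defn update-bac [bac]
  (swap! state assoc :bac bac))       ; store the raw bac value

(defn legally-drunk? [additional-drink]
  (>= (+ (:bac @state) additional-drink) 0.08))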

The following (passing) tests demonstrate that everything works as expected.
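Presumably something like the following, using the expectations library:

(expect true  (with-redefs [state (atom {})]
                (update-bac 0.01)
                (legally-drunk? 0.07)))

(expect false (with-redefs [state (atom {})]
                (update-bac 0.01)
                (legally-drunk? 0.06)))

;; the state-based test - the one the refactoring will break
(expect {:bac 0.04} (with-redefs [state (atom {})]
                      (update-bac 0.04)))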

This code works without issue, but can also be refactored to store a partial'd function instead of the bac value. Why you would want to do such a thing is outside of the scope of this post, so we'll just assume this is a good refactoring. The code below no longer stores the bac value, and instead stores the pure-legally-drunk? function partial'd with the bac value.
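A sketch of the refactored version (the :legally-drunk?* key is taken from the test output below):

(def state (atom {}))

(defn pure-legally-drunk? [bac additional-drink]
  (>= (+ bac additional-drink) 0.08))

(defn update-bac [bac]
  ;; store the partial'd function instead of the raw value
  (swap! state assoc :legally-drunk?* (partial pure-legally-drunk? bac)))

(defn legally-drunk? [additional-drink]
  ((:legally-drunk?* @state) additional-drink))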

Two of the three tests don't change; however, the test that was verifying the state is now broken.

note: The test output has been trimmed and reformatted to avoid horizontal scrolling.

In the output you can see that the test is failing as you'd expect, due to the change in what we're storing. What's broken is obvious, but there's not an obvious solution. Assuming you still want this state based test, how do you verify that you've partial'd the right function with the right value?

The solution is simple, but a bit tricky. As long as you don't find the redef too magical, the following solution allows you to easily verify the function that's being partial'd as well as the arguments.
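The trick, visible in the unit-level output later in this post: redef partial to vector, so update-bac stores plain data - the function and its argument - instead of an opaque partial'd function.

(expect {:legally-drunk?* [pure-legally-drunk? 0.04]}
        (with-redefs [state   (atom {})
                      partial vector]
          (update-bac 0.04)))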

Those tests all pass, and should provide confidence that the legally-drunk? and update-bac functions are sufficiently tested. The pure-legally-drunk? function still needs to be tested, but that should be easy since it's a pure function.

Would you want this kind of test? I think that becomes a matter of context and personal preference. Given the various paths through the code the following tests should provide complete coverage.
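Roughly the following; the true case matches the failure report shown later, and the false case is an illustrative companion:

(expect true  (with-redefs [state (atom {})]
                (update-bac 0.01)
                (legally-drunk? 0.07)))

(expect false (with-redefs [state (atom {})]
                (update-bac 0.01)
                (legally-drunk? 0.06)))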

The above tests make no assumptions about the implementation - they actually pass whether you :use the 'original namespace or the 'refactored namespace. Conversely, the following tests verify each function in isolation and a few of them are very much tied to the implementation.
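A sketch of those unit-level tests:

;; update-bac in isolation, with partial redef'd to vector
(expect {:legally-drunk?* [pure-legally-drunk? 0.04]}
        (with-redefs [state (atom {}) partial vector]
          (update-bac 0.04)))

;; legally-drunk? in isolation, against a stubbed stored function
(expect true (with-redefs [state (atom {:legally-drunk?* (constantly true)})]
               (legally-drunk? 0.07)))

;; pure-legally-drunk? is pure - no redefs needed
(expect true  (pure-legally-drunk? 0.01 0.07))
(expect false (pure-legally-drunk? 0.01 0.06))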

Both sets of tests would give me confidence that the code works as expected, so choosing which tests to use becomes a matter of maintenance cost. I don't think there's anything special about these examples; they offer the traditional trade-offs between higher- and lower-level tests. A specific trade-off that stands out to me: defect localization versus having to update the test whenever you update the code.

As I mentioned previously, the high-level-expectations work for both the 'original and the 'refactored namespaces. Being able to change the implementation without having to change the test is obviously an advantage of the high level tests. However, when things go wrong, the lower level tests provide better feedback for targeting the issue.

The following code is exactly the same as the code in refactored.clj, except it has a 1 character typo. (it's not necessary to spot the typo; the test output below will show you what it is)

The high level tests give us the following feedback.
failure in (high_level_expectations.clj:14) : expectations.high-level-expectations
(expect
 true
 (with-redefs
  [state (atom {})]
  (update-bac 0.01)
  (legally-drunk? 0.07)))

           expected: true 
                was: false
There's not much in that failure report to point us in the right direction. The unit-level-expectations provide significantly more information, including details that should make it immediately obvious where the typo is.
failure in (unit_level_expectations.clj:8) : expectations.unit-level-expectations
(expect
 {:legally-drunk?* [pure-legally-drunk? 0.04]}
 (with-redefs [state (atom {}) partial vector] (update-bac 0.04)))

           expected: {:legally-drunk?* [# 0.04]} 
                was: {:legally-drunk?** [# 0.04]}
 
           :legally-drunk?** with val [# 0.04] 
                             is in actual, but not in expected
           :legally-drunk?* with val [# 0.04] 
                            is in expected, but not in actual
The above output points us directly to the extra asterisk in update-bac that caused the failure.

Still, I couldn't honestly tell you which of the above tests I prefer. This specific example provides a situation where I think you could convincingly argue for either set of tests. However, as the code evolved I would likely choose one path or the other based on:
  • how much 'setup' is required for always using high-level tests?
  • how hard is it to guarantee integration using primarily unit-level tests?
In our examples the high level tests require redef'ing one bit of state. If that grew to a few pieces of state and/or a large increase in the complexity of the state, then I may be forced to move towards more unit-level tests. A rule of thumb I use: If a significant amount of the code within a test is setting up the test context, there's probably a smaller function and a set of associated tests waiting to be extracted.

By definition, the unit-level tests don't test the integration of the various functions. When I'm using unit-level tests, I'll often test the various code paths at the unit level and then have a happy-path high-level test that verifies integration of the various functions. My desire to have more high-level tests increases as the integration complexity increases, and at some point it makes sense to simply convert all of the tests to high-level tests.

If you constantly re-evaluate which tests will be more appropriate and switch when necessary, you'll definitely come out ahead in the long run.