Thursday, March 24, 2011

Readable Clojure Without a Java Equivalent?

I've recently joined a new team and we've been doing a bit of Clojure. If you've never done Lisp (and I hadn't before I found Clojure), it's natural to ask the question: will programming in Lisp, especially with prefix notation, ever feel natural? The question came up a few weeks ago, and I had two answers.

First of all, I've never been really upset about parentheses. In fact, I've been doing a bit of Java these days and I don't see much difference between verify(foo).bar(eq(100), any(Baz.class), eq("Cat")) and (-> foo verify (.bar (eq 100) (any Baz) (eq "Cat"))). By my count it's the same number of parentheses. Where they're located moves around a bit, but I don't consider that to be a good or a bad thing.

People also like to bring up examples like this:
(apply merge-with +
  (pmap count-lines
    (partition-all *batch-size*
      (line-seq (reader filename)))))
Stefan Tilkov addressed this in a previous blog entry, and (in the comments of Stefan's entry) Rich Hickey points out that you can use a few different versions if you prefer to lay your code out in a different manner. Below are two alternative solutions Rich provided.
; The same code in a pipelined Clojure style:

(->> (line-seq (reader filename))
     (partition-all *batch-size*)
     (pmap count-lines)
     (apply merge-with +))

; and in a step-wise Clojure style with labeled interim results (a la Adrian’s comment):

(let [lines (line-seq (reader filename))
      processed-data (pmap count-lines
                           (partition-all *batch-size* lines))]
  (apply merge-with + processed-data))
I've felt the same way for a while. You can write Cobol in any language. The real question is: are you up for learning how to solve problems elegantly and idiomatically in a new language? If you're willing to invest the time, you'll be able to find out for yourself whether it feels natural to write idiomatic Clojure code.

That was my answer, until today. While working on some Java code I stumbled on the following Java method.
public void onFill(int fillQty) {
  this.qty = Math.max(this.qty - fillQty, 0);
}
This is a simple Java method that decrements the outstanding quantity of an order by the amount that has just been filled. While reading the line I couldn't help but feel there should be a more elegant way to express the logic. You want to set the outstanding quantity to the current outstanding quantity minus what's just been filled, but you also never want the outstanding quantity to go below zero. I read from left to right, and I really wanted a way to express this logic in a way that followed that left-to-right pattern.

In Clojure, this is easy to do:
(swap! qty #(-> % (- fill-qty) (Math/max 0)))
For readers who are less familiar with Clojure's dispatch reader macro, the above example can also be written as:
(swap! qty (fn [current-qty] (-> current-qty (- fill-qty) (Math/max 0))))
In the example above, swap! sets the qty state to the return value of the function.
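Here's a minimal, runnable sketch of the full picture (the atom, the starting quantity, and the on-fill wrapper are my own illustration, not the original order code):
(def qty (atom 100)) ; outstanding quantity held in an atom

(defn on-fill [fill-qty]
  ;; subtract the fill from the current quantity, then clamp at zero
  (swap! qty #(-> % (- fill-qty) (Math/max 0))))

(on-fill 30) ; @qty is now 70
(on-fill 80) ; @qty is now 0, never negative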

If you're really new to Clojure, that might still be too much to take in, so we can reduce the example and remove the state setting. Here's the version in Java that ignores setting state.
Math.max(this.qty - fillQty, 0);
The example below is a logical equivalent in Clojure.
(-> qty (- fill-qty) (Math/max 0))
When reading the above Java example I'm forced to put Math.max on my mental stack, evaluate this.qty - fillQty, and then mentally evaluate the method I put on the stack with my new result and the additional args. This isn't rocket science, but it's also not how I naturally read (left to right). On the other hand, when I read the Clojure version I think: take the current quantity, subtract the fill quantity, then take the max of that and zero. The code reads in small, logical chunks that are easy for me to digest.
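If the -> (thread-first) macro is new to you, note that it's purely a syntactic rewrite; the threaded version expands to the same nested call you'd otherwise write by hand:
(-> qty (- fill-qty) (Math/max 0))
; expands to
(Math/max (- qty fill-qty) 0)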

Obviously, the Java code can also be rewritten in a few other ways. Here's an example of Java code that reads left to right and top to bottom.
public void onFill(int fillQty) {
  this.qty -= fillQty;
  this.qty = Math.max(this.qty, 0);
}
And, I can do something similar in Clojure, if I want.
(swap! qty #(- % fill-qty))
(swap! qty #(Math/max % 0))
While it's possible to write Clojure similar to the above example, it's much more likely that you'd use a let form if you wanted to break up the two operations.
(defn update-qty [current fill-qty]
  (let [total-qty (- current fill-qty)]
    (Math/max total-qty 0)))

(swap! qty update-qty fill-qty)
The above example is probably about equivalent to the following Java snippet.
public void onFill(int fillQty) {
  int newTotalQty = this.qty - fillQty;
  this.qty = Math.max(newTotalQty, 0);
}
So, I can write code that is similar to my options in Java, but I'm still left wanting a Java version that is similar to this Clojure example:
(swap! qty #(-> % (- fill-qty) (Math/max 0)))
The only thing that springs to mind is some type of fluent interface that allows me to say this.qty = this.qty.minus(fillQty).maxOfIdentityOrZero(), but I can't think of a realistic way to create that API without quite a bit of infrastructure code (including my own Integer class).

(note, you could extend Integer in a language with open classes, but that's outside the scope of this discussion)

The last Clojure example is definitely the version of the code I prefer. My preference is based on the way the code reads in concise, logical chunks from left to right. I don't have to solve inside out like the original Java version forces me to, and I don't have to split my work across two lines.

I'm sure there are situations where Java has allowed me to create an elegant solution that wouldn't have been possible in Clojure. This entry isn't designed to send a "Clojure is better than Java" message; I don't believe that. Before today, I held the opinion that you can write Clojure that logically breaks up problems in ways very similar to what you'd do in Java. Now I've expanded that opinion: in certain situations Clojure allows me to solve problems in a way that I find superior to my options in Java.

And, yes, after a bit of time getting used to Lisp syntax, it definitely does feel perfectly natural to me when I'm developing using Clojure.

Tuesday, March 15, 2011

Types of Technical Debt

As a developer at DRW, you often have technical debt on your mind. Our front office teams work directly with their traders (often sitting directly next to them), and are exposed in real-time to their software needs. While sitting with the traders you see the immediate impact of your recent feature additions. The feedback loop is extremely tight: in the past I've come in to work, received a feature request early in the trading session, coded the change, deployed it to production, and seen the profits as part of that same trading day. In that kind of environment, you tend to care more about adding features than you do about having the most elegantly designed software in the history of the world.

But, you can't ignore code quality, due to its impact on delivery speed. There's a constant internal struggle between delivering features and keeping quality at a level that enables high-speed delivery. Our technical debt is very visible; we keep reminders around to ensure we address it in the future. We also use those same reminders to prioritize which technical debt payoff will most benefit the team. The traders are also very aware of the technical debt, and it's often the case that they ask: do you guys need to spend some time paying down this technical debt?

Working in an environment where the traders understand technical debt and embrace its payoff is great, but it exposes other issues with technical debt.

Steve Freeman's blog entry, Bad code isn’t Technical Debt, it’s an unhedged Call Option, first got me thinking about different types of technical debt. That blog entry rang very true for me. I really liked the idea that selling a call option was a better metaphor; however, I wondered if it was something that could easily work in your average workplace.

I happen to work at a trading firm, so my customers know the implications of selling a call option. But the entire idea of a metaphor is to understand one thing in terms of another; if you don't already understand one of the two things, the metaphor doesn't really work. I don't think the average customer already understands options; therefore, I don't think the new metaphor will ever catch on.

But, I did believe it was a superior metaphor, so I started looking at my bad code as selling calls instead of debt. In some cases the metaphor worked perfectly. We created an analysis tool that was mildly important to the traders and didn't have strict reliability needs. This analysis tool didn't have a single test. If the tool had changed its role, we would have had to put in a fair bit of work to make it more resilient to bugs. However, the tool served its purpose, and at this point I don't believe it's used by anyone. We saved a bunch of time (collected the premium) without ever having to make significant changes to the code (deliver).

However, in other cases the 'selling calls' metaphor didn't fit as well. In general, developers tend to call anything that slows them down 'technical debt'; therefore, a slow-running test suite would be considered technical debt. If everything that used to be called technical debt is now selling a call, then a slow-running test suite is also selling a call. Except that doesn't fit: I pay the time it takes to run the test suite several times a day. In the case of a slow-running test suite, I think a high-interest loan is a better metaphor.

Therefore, if all things that slow a developer down are 'technical debt', then you have some debts where interest is paid on an ongoing basis (slow-running tests) and others where you'll only pay under specific circumstances (extending an application with no tests). I think most developers understand this intuitively, but I wonder if adding visual type indicators to identified technical debt would help with understanding and prioritization.

As a customer, I imagine I'd appreciate knowing what is costing me on a daily basis and what isn't costing me anything today but could cripple me tomorrow. And, as a team member, I'd be interested in highlighting ongoing pain and moving it higher on the priority list (assuming the other risks were reasonable). I imagine team members are already doing this subconsciously, but making it explicit could prove valuable to other members of the development team who have less context on specific situations.

In thinking about different types of technical debt, I wondered if there were other types of technical debt I was overlooking or incorrectly grouping with "high interest loans" or "selling calls". If any ideas spring to mind, please leave a comment.

Sunday, March 13, 2011

Random Thoughts on Good Programmers and Good Code

I've known some really, really smart guys who write code I hate. I love using their libraries, but I don't ever want to edit their source. To me, their source is too complicated. I used to think that they were smarter than I was, and that I benefited by having to write clearer code, as I couldn't easily digest their complicated code.

But, I've also run into code I've written that is complicated in its own ways. I like to find the dark corners of languages and exploit them. To me, 8 different classes being configured 10 different ways and used in 3 different situations is something I can never get comfortable with. However, metaprogramming & macros don't scare me at all, and I often (over) embrace them.

And, to me, that's the heart of the issue. Each of us has a comfort level with different styles of development. I like to write as little as possible, and embrace "magic" that allows me to write less. Other people are more comfortable with more code, as long as it follows patterns that are easy to trace through the system. We could debate which is better for "average" programmers, but I don't think there's a "better" approach for good programmers who have tried both and stuck with what they prefer; they know to play to their strengths.

This also helps explain why all code we haven't written "sucks" and all code we write is "awesome".

As an example, I think test names are worthless because I don't use them. Dan North loves test names and uses them to help him understand the responsibility of code. It's as simple as a different approach for digesting code, but the result is assigning a value to an artifact. I can write all I want about how test names are worthless, but it's not going to change the fact that they are helpful to Dan. He and I could (ignorantly) argue till the end of time about the value of test names, but the actual discussion is in how we digest code.

These days, the Java, Ruby, Clojure, patterns, metaprogramming, static, dynamic, etc. discussions all feel this way to me. You're either an idiot who doesn't understand what you're arguing against, or you're talented enough that you've found what makes you the most productive. Either way, your opinions don't necessarily apply to anyone else.

That said, I think there is value in presenting new ideas (or restating old ideas in new ways) as it can inspire people to reevaluate their assumptions. The ideas that tend to be most helpful for me are the ones presented as experience reports and discussions on the benefits of a particular solution. The ideas that are presented as an X vs Y smackdown always tend to make me wonder if someone is still searching for a silver bullet.

Wednesday, March 02, 2011

Clojure: Eval-ing a String in Clojure

I recently needed to eval a string in Clojure and found it was easy, but it wasn't accomplished in the way I expected. My recent experience with eval was in Ruby, so I naturally reached for Clojure's eval function. I quickly found that Clojure's eval was not what I was looking for; well, not exactly what I was looking for.
user=> (eval "(+ 1 1)")
"(+ 1 1)"
This quick trip to the REPL reminded me of the last time I needed to eval a string, and of my previous experiences with read-string and load-string. (eval operates on Clojure data structures; a string is a data structure that evaluates to itself, so it first needs to be read into a form.) It turns out load-string does the trick, and read-string + eval can also get the job done.
user=> (load-string "(+ 1 1)")
2
user=> (eval (read-string "(+ 1 1)"))
2
Looking at the documentation for read-string and load-string can help point you at which you might need for your given situation.
read-string: Reads one object from the string [argument]
load-string: Sequentially read[s] and evaluate[s] the set of forms contained in the string [argument]
Okay, it looks like you'd probably want load-string if you had multiple forms. Back to the REPL.
user=> (load-string "(println 1) (println 2)")                
1
2
user=> (eval (read-string "(println 1) (println 2)"))
1
As you can see, load-string evaluated the entire string and read-string only returned the first form to eval. The last examples show printing values (and I removed the nil return values), but it's more likely that you'll be concerned with returning something from read-string or load-string. The following example shows what's returned by both functions.
user=> (load-string "1 2")                           
2
user=> (eval (read-string "1 2"))
1
Given our previous results, I'd say load-string and read-string are returning the values you'd expect. It might seem easy to default to load-string, but that's not necessarily the right choice if evaluation time is important to you. Here's a quick snippet to get the point across.
user=> (time (dotimes [_ 100] (eval (read-string "1 2"))))
"Elapsed time: 12.277 msecs"
nil
user=> (time (dotimes [_ 100] (load-string "1 2")))
"Elapsed time: 33.024 msecs"
nil
One of the applications I previously worked on wrote a large number of events to a log each day; the format used to log an event was a Clojure map. If the application was restarted, the log would be read and each event replayed to get the internal state back to current. We originally started with load-string; however, start-up time was becoming a problem, and a switch to read-string completely removed the issue.
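As a rough sketch of that replay idea (the one-event-map-per-line format and the replay-event! function are hypothetical stand-ins, not the actual application code):
(require '[clojure.java.io :as io])

(defn replay-log [filename replay-event!]
  ;; read-string parses each line into a Clojure map without
  ;; compiling and evaluating it the way load-string would
  (with-open [rdr (io/reader filename)]
    (doseq [line (line-seq rdr)]
      (replay-event! (read-string line)))))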

Like so many examples in programming, there's a reason that two versions exist. It's worth taking the time to see which solution best fits your problem.

Tuesday, March 01, 2011

Clojure: if-let and when-let

I'm a fan of if-let and when-let. Both can be helpful in creating succinct, readable Clojure code. From the documentation:
If test is true, evaluates then with binding-form bound to the value of test, if not, yields else
A quick example should demonstrate the behavior.
user=> (if-let [a :anything] a "no")    
:anything
user=> (if-let [a nil] a "no")
"no"
user=> (if-let [a false] a "no")
"no"
The above example demonstrates a simple case where a truthy value causes the "then" form to be returned, and a falsey value (nil or false) causes the "else" form to be returned. The example also shows that "a" is bound to the value :anything and can be used within the "then".

The following example shows that you can put code in the "then" and "else" forms as well.
user=> (if-let [a 4] (+ a 4) (+ 10 10)) 
8
user=> (if-let [a nil] (+ a 4) (+ 10 10))
20
Just like above, when "a" is truthy the value is bound and it can be used as desired in the "then".

The when-let function behaves basically the same way, except there is no "else".
user=> (when-let [a 9] (+ a 4))  
13
user=> (when-let [a nil] (+ a 4))
nil
user=> (when-let [a 9] (println a) (+ a 4))
9
13
The example demonstrates that the "then" case is very similar to if-let; however, if the test fails, the when-let function simply returns nil. The last when-let example also shows how it slightly differs from if-let: you can pass as many forms to the "then" as you'd like. If the test is truthy, all of the additional forms will be evaluated.

There are a few things worth knowing about if-let and when-let with respect to the bindings. Both if-let and when-let require a vector for their bindings, and there must be exactly two forms in the binding vector. The following example shows Clojure's response if you choose not to follow those rules.
user=> (when-let (a 9) (println a))
java.lang.IllegalArgumentException: when-let requires a vector for its binding (NO_SOURCE_FILE:0)
user=> (when-let nil (println "hi"))
java.lang.IllegalArgumentException: when-let requires a vector for its binding (NO_SOURCE_FILE:0)
user=> (when-let [a 9 b nil] (println a b))
java.lang.IllegalArgumentException: when-let requires exactly 2 forms in binding vector (NO_SOURCE_FILE:0)
It's nice to know if-let and when-let, but they aren't exactly hard concepts to follow. Once someone points them out to you, I expect you'll be able to easily use them in your code without much effort.

However, what is the impact of destructuring on if-let and when-let? I knew what I expected, but I thought it was worth a quick trip to the REPL to make sure my assumptions were correct. The following code shows that as long as the value being bound is truthy, the "then" will be executed; destructured values have their normal behavior and do not affect execution flow.
user=> (when-let [{a :a} {:a 1}] [a])
[1]
user=> (when-let [{a :a} {}] [a])
[nil]
user=> (when-let [{a :a} {:a false}] [a])
[false]
user=> (when-let [{a :a} nil] [a])
nil
As you would expect, when-let is consistent, and the destructuring behavior is also consistent.
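To close with something slightly more realistic than literal bindings, here's a small sketch; find-user is a hypothetical lookup, stubbed out for illustration:
;; a hypothetical lookup that returns a user map, or nil when absent
(defn find-user [id]
  (get {1 {:name "Jay"}} id))

(defn greeting [id]
  (if-let [user (find-user id)] ; binds user only when the lookup succeeds
    (str "Hello, " (:name user))
    "Hello, guest"))

(greeting 1) ; "Hello, Jay"
(greeting 2) ; "Hello, guest"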

if-let and when-let are good, give them a shot.