Tuesday, July 20, 2010

Clojure: Composing Functions

Before Clojure, I had never used a Functional Programming language. My experience has primarily been in C#, Ruby, & Java. So, learning how to use Clojure effectively has been a fun and eye-opening experience.

I've noticed a few things:
  • The Clojure language has a lot of built in support for maps and hashes (literal syntax, destructuring, etc)
  • Rich Hickey is quoted as saying "It is better to have 100 functions operate on one data abstraction than 10 functions on 10 data structures." Clojure reflects this opinion and has loads of functions that work with Sequences.
  • I strongly dislike working with people who think Map<String, Map<String, String>> is generally a good idea in Java. However, all the support for maps and vectors in Clojure makes working with them easy and efficient.
  • Once you embrace functions, maps, vectors, etc, composing functions together to transform data becomes effortless.
Like I said, I'm still learning Clojure. Recently I refactored some code in a way that I thought was worth documenting for other people who are also learning Clojure.

Let's start with a simple example. Let's assume we want to track the score in the British Open (Golf). The data structure we will be working with will be a map of maps, keyed by country and then player.
{:England {"Paul Casey" -1}}
The above example shows that Paul Casey plays for England and currently has the score of -1. We need a function that allows you to update a players score. Let's also assume that scores for individual holes come in the following map form.
{:player "Paul Casey" :country :England :score -1}
Given the above code, you would expect the following test to pass.
(def current-score {:England {"Paul Casey" -1}})

(deftest update-score-test
(is (=
{:England {"Paul Casey" -2}}
(update-score current-score {:player "Paul Casey" :country :England :score -1}))))
(Let's ignore the fact that I'm def'ing current-score. It's only for simplicity's sake.)

There are a lot of ways to make that test pass. It took me a few tries to come up with something I liked. Below is the path I took.

Step one, get the test passing using the functions I know.
(defn update-score [current update]
(let [player (:player update)
country (:country update)
hole-score (:score update)]
(merge current {country {player (+ ((current country) player) hole-score)}})))
The above example code is a victory when you are learning Clojure. The tests pass, you've used a few features of the language. It's a great first pass. However, I'm sure the Clojure veterans are drooling over the refactoring possibilities.

Step two, learn the update-in function. Like I previously mentioned, Clojure has a lot of functions for working with sequences (maps, vectors, lists, etc). It's not always obvious which function you should be using. However, I've had loads of success learning from the mailing list, and getting immediate answers in the #clojure chat room on freenode. One way or another, I eventually found the update-in function, which is designed up update a nested key.
(defn update-score [current update]
(let [player (:player update)
country (:country update)
hole-score (:score update)]
(update-in current [country player] (fn [_] (+ ((current country) player) hole-score)))))
The switch to update-in is relatively easy and cleans up a bit of code, but now I've introduced an anonymous function. Anonymous functions are great, but if I can get away with not creating them, I always take that route.

However, before we eliminate the anonymous function, we're actually better off making sure we understand the new function (update-in). While the current version of the code works, it's much cleaner if you use the current value that is passed into the anonymous function instead of looking up the value yourself.
(defn update-score [current update]
(update-in current [(:country update) (:player update)] (fn [overall-score] (+ overall-score (:score update)))))
Since we no longer use country and player in multiple places, it seems like removing the let statement is a logical choice. As a result, the amount of code necessary to solve this problem seems to get smaller and smaller.

However, we can actually take this a step farther and use destructuring to provide an even cleaner implementation.
(defn update-score [current {:keys [country player score]}]
(update-in current [country player] (fn [overall-score] (+ overall-score score))))
If you aren't familiar with destructuring the syntax might be a bit confusing. I created a blog entry for destructuring that should help if you aren't familiar with it. There's also a good write up with destructuring examples available on clojure.org. I would also recommend writing a few simple functions and try out the examples for yourself.

There's one more step to take, but for me it was a big one. The documentation for update-in states:
Usage: (update-in m [k & ks] f & args)

'Updates' a value in a nested associative structure, where ks is a sequence of keys and f is a function that will take the old value and any supplied args and return the new value, and returns a new nested structure.
What I currently have works fine, but the additional args that update-in takes allows me to provide a more concise solution that I greatly prefer.
(defn update-score [current {:keys [country player score]}]
(update-in current [country player] + score))
This final version doesn't require any anonymous functions. Instead it relies on on update-in to apply the current value and any additional args to the function that you passed in. The final version is so concise you begin to wonder if you really need to define an update-score function.

The final snippet is an example of why I'm very intrigued by Clojure these days. The ability to compose functions and mix them in with the powerful functions of the language results in very concise code.

I consider what I would need to write to accomplish the same thing in Java or Ruby, and I have to wonder if the Functional Programming supporters might just be on to something.


  1. It was Alan Perlis who made the "...100 functions..." quote.

  2. Anonymous10:26 AM

    Alan wrote the original quote, Rich modified it to include the words "one data abstraction"

    cheers, Jay

  3. Ruby actually already has lots of support for this kind of thing. Python has most of it too, but also has a culture that discourages using it.

    Passing functions in Ruby is quite easy. You'd write "&:+" rather than just "+" to pass plus as the function there (assuming you're using ActiveSupport or Ruby 1.9 for Symbol::to_proc). But an update-in function like that is easy to write in Ruby, and most of the regular function stuff is already in the standard library.

    And you're right, it's very, very powerful.

  4. Not to be a hater, but I wouldn't consider this to be very much Ruby to accomplish the same thing.

    def update_score(current, update)
    current[update[:country]][update[:player]] += update[:score]

    current_score = {:England => {"Paul Casey" => -1}}
    update = {:player => "Paul Casey", :country => :England, :score => -1}

    def test_update_score
    assert_equal {:England => {"Paul Casey" => -2}}, update_score(current_score, update)

  5. Anonymous6:13 PM

    It's been a few years since I've done any Ruby. It seems Java's erased my memory. =)

    I agree, the Ruby version isn't that bad.

    Cheers, Jay

  6. thomas7:29 PM

    I am wondering about this one:
    "I strongly dislike working with people who think Map> is generally a good idea in Java."

    How would you solve that?
    class SomeClass extends
    class SomeOtherClass extends

  7. Java will do that to you. I look forward to more clojure posts. The [country player] hash lookup syntax was fun to see as was the destructuring.

    Would clojure allow you to do this (which is just super awesome)? http://olabini.com/blog/2010/03/destructuring-extravaganza/


Note: Only a member of this blog may post a comment.