Tuesday, August 23, 2011

Clojure: partition-by, split-with, group-by, and juxt

Today I ran into a common situation: I needed to split a list into two sublists, the elements that passed a predicate and the elements that failed it. I'm sure I've run into this problem several times, but it's been a while and I'd forgotten what options were available to me. A quick look at http://clojure.github.com/clojure/ reveals several potential functions: partition-by, split-with, and group-by.

partition-by
From the docs:
Usage: (partition-by f coll)

Applies f to each value in coll, splitting it each time f returns
a new value. Returns a lazy seq of partitions.
Let's assume we have a collection of ints and we want to split them into a list of evens and a list of odds. The following REPL session shows the result of calling partition-by with our list of ints.
user=> (partition-by even? [1 2 4 3 5 6])

((1) (2 4) (3 5) (6))
The partition-by function works as described; unfortunately, it's not exactly what I'm looking for. I need a function that returns ((1 3 5) (2 4 6)).
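To make the behavior concrete: partition-by only groups adjacent elements, starting a new partition each time the result of f changes. The sketch below is my own illustration rather than anything from the docs. It assumes the name ints for the sample collection and relies on sort-by being stable, with false ordering before true, so that sorting by the predicate first lines the elements up into two runs.

```clojure
;; `ints` is an illustrative name for the sample collection.
(def ints [1 2 4 3 5 6])

;; A new partition starts every time (even? x) changes.
(partition-by even? ints)
;; => ((1) (2 4) (3 5) (6))

;; Sorting by the predicate first puts all odds (false) before all
;; evens (true) while keeping their relative order, so partition-by
;; then sees exactly two runs.
(partition-by even? (sort-by even? ints))
;; => ((1 3 5) (2 4 6))
```

This works, but it pays for a sort just to split a list, so it's more of a curiosity than a recommendation.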

split-with
From the docs:
Usage: (split-with pred coll)

Returns a vector of [(take-while pred coll) (drop-while pred coll)]
The split-with function sounds promising, but a quick REPL session shows it's not what we're looking for.
user=> (split-with even? [1 2 4 3 5 6])

[() (1 2 4 3 5 6)]
As the docs state, the collection is split on the first item that fails the predicate - (even? 1).
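For contrast, here is a small sketch of an input where split-with does produce a meaningful split. The name evens-first is mine; the point is only that the split happens once, at the first failure, and that everything after it lands in the second part regardless of the predicate.

```clojure
;; `evens-first` is an illustrative name; the evens lead the collection.
(def evens-first [2 4 1 3 5 6])

;; The split happens at 1, the first element that fails even?.
;; The trailing 6 stays in the second part even though it is even.
(split-with even? evens-first)
;; => [(2 4) (1 3 5 6)]

;; This matches the definition given in the docs.
(= (split-with even? evens-first)
   [(take-while even? evens-first) (drop-while even? evens-first)])
;; => true
```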

group-by
From the docs:
Usage: (group-by f coll)

Returns a map of the elements of coll keyed by the result of f on each element. The value at each key will be a vector of the corresponding elements, in the order they appeared in coll.
The group-by function works, but it gives us a bit more than we're looking for.
user=> (group-by even? [1 2 4 3 5 6])

{false [1 3 5], true [2 4 6]}
The result as a map isn't exactly what we desire, but using a bit of destructuring allows us to grab the values we're looking for.
user=> (let [{evens true odds false} (group-by even? [1 2 4 3 5 6])]
         [evens odds])
[[2 4 6] [1 3 5]]
The group-by results mixed with destructuring do the trick, but there's another option.
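If you find yourself doing this often, the group-by approach can be wrapped in a small helper. The sketch below is one possible version, not a core function: the name split-by is hypothetical. It coerces the predicate's result with boolean so that predicates returning truthy values other than true still group correctly, and it returns an empty vector instead of nil when one side has no elements.

```clojure
(defn split-by
  "Hypothetical helper: returns [passed failed] for pred over coll,
  built on group-by. Missing groups come back as empty vectors."
  [pred coll]
  (let [groups (group-by (comp boolean pred) coll)]
    [(get groups true []) (get groups false [])]))

(split-by even? [1 2 4 3 5 6])
;; => [[2 4 6] [1 3 5]]

;; Plain destructuring of the group-by map yields nil for a missing key;
;; the helper returns [] instead.
(split-by even? [2 4])
;; => [[2 4] []]

;; A set used as a predicate returns the element or nil, not true/false.
(split-by #{1 2} [1 2 3])
;; => [[1 2] [3]]
```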

juxt
From the docs:
Usage: (juxt f)
       (juxt f g)
       (juxt f g h)
       (juxt f g h & fs)

Alpha - name subject to change.
Takes a set of functions and returns a fn that is the juxtaposition
of those fns. The returned fn takes a variable number of args, and
returns a vector containing the result of applying each fn to the
args (left-to-right).
((juxt a b c) x) => [(a x) (b x) (c x)]
The first time I ran into juxt I found it a bit intimidating. I couldn't tell you why, but if you feel the same way - don't feel bad. It turns out, juxt is exactly what we're looking for. The following REPL session shows how to combine juxt with filter and remove to produce the desired results.
user=> ((juxt filter remove) even? [1 2 4 3 5 6])

[(2 4 6) (1 3 5)]
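If the call above still looks opaque, it may help to see it expanded. The following sketch (ints is just an illustrative name) shows that the function returned by juxt passes all of its arguments to each function in turn, and that the resulting vector destructures nicely.

```clojure
;; `ints` is an illustrative name for the sample collection.
(def ints [1 2 4 3 5 6])

;; ((juxt filter remove) even? ints) is the same as calling filter and
;; remove separately with the same arguments and collecting the results.
(= ((juxt filter remove) even? ints)
   [(filter even? ints) (remove even? ints)])
;; => true

;; The returned vector can be destructured positionally.
(let [[evens odds] ((juxt filter remove) even? ints)]
  {:evens evens :odds odds})
;; => {:evens (2 4 6), :odds (1 3 5)}
```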
There's one catch to using juxt this way: the entire list is processed twice, once by filter and once by remove, so the predicate runs twice for every element. In general this is acceptable; however, it's worth considering when writing performance-sensitive code.
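When the double traversal does matter, one alternative is to build both results in a single pass with reduce. This is a sketch of my own, not something from the docs: split-once, calls, and counting-even? are hypothetical names, and unlike filter and remove the result is eager rather than lazy. The counter is there only to make the number of predicate calls visible.

```clojure
(defn split-once
  "Hypothetical helper: returns [passed failed], calling pred exactly
  once per element. Eager, unlike the lazy filter/remove pair."
  [pred coll]
  (reduce (fn [[passed failed] x]
            (if (pred x)
              [(conj passed x) failed]
              [passed (conj failed x)]))
          [[] []]
          coll))

;; A predicate that counts how often it is called (illustration only).
(def calls (atom 0))
(defn counting-even? [x]
  (swap! calls inc)
  (even? x))

(split-once counting-even? [1 2 4 3 5 6])
;; => [[2 4 6] [1 3 5]]
@calls
;; => 6

;; The juxt version calls the predicate twice per element once both
;; lazy seqs are fully realized.
(reset! calls 0)
(let [[evens odds] ((juxt filter remove) counting-even? [1 2 4 3 5 6])]
  (doall evens)
  (doall odds))
@calls
;; => 12
```

Note that group-by also walks the collection only once, so the destructuring approach shown earlier avoids the double traversal as well.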