Jay Fields' Thoughts: September 2012

Thursday, September 27, 2012

Clojure: Refactoring From Thread Last (->>) To Thread First (->)

I use ->> (thread-last) and -> (thread-first) very often. When I'm transforming data I find it easy to break things down mentally by taking small, specific steps, and I find that -> & ->> allow me to easily express my steps.

Let's begin with a (very contrived) example. Let's assume we have user data and we need a list of all users in "new york", grouped by their employer, and iff their employer is "drw.com" then we only want their name - otherwise we want all of the user's data. In terms of the input and the desired output, below is what we have and what we're looking for.

A solution that uses ->> can be found below.

The above example is very likely the first solution I would create. I go about solving the problem step by step, and if the first step takes my collection as the last argument then I will often begin by using ->>. However, after the solution is functional I will almost always refactor to -> if any of my "steps" do not take the result of the previous step as the last argument. I strongly dislike the above solution - using an anonymous function to make update-in usable with a thread-last feels wrong and is harder for me to parse (when compared with the alternatives found below).

The above solution could be refactored to the following solution

This solution is dry, but it also groups two of my three steps together, while leaving the other step at another level. I expect many people to prefer this solution, but it's not the one that I like the best.

The following solution is how I like to refactor from ->> to ->

My preferred solution has an "extra" thread-last, but it allows me to keep everything on the same level. By keeping everything on the same level, I'm able to easily look at the code and reason about what it's doing. I know that each step is an isolated transformation and I feel freed from keeping a mental stack of what's going on in the other steps.

Tuesday, September 25, 2012

Replacing Common Code With clojure.set Function Calls

If you've written a fair amount of Clojure code and aren't familiar with clojure.set, then chances are you've probably reinvented a few functions that are already available in the standard library. In this blog post I'll give a few examples of commonly written code, and I'll show the clojure.set functions that already do everything you need.

Removing elements from a collection is a very common programming task. Sometimes the collection will need to be a vector or a list, and removing an element from the collection will look similar to the example below.

user=> (remove #{1 2} [1 2 3 4 3 2 1])
(3 4 3)

In the cases where you're starting with a list and you want to return a seq, remove is a good solution. However, you may also find yourself starting with a set or looking to return a set.

If you're starting with sets, you'll probably get a performance gain by using clojure.set/difference, and if you're going to need a set returned it's less code and likely more performant to use clojure.set/difference rather than calling clojure.core/set on the results of clojure.core/remove.

clojure.set/difference is simple to use - from the docs

Usage: (difference s1)
       (difference s1 s2)
       (difference s1 s2 & sets)
Return a set that is the first set without elements of the remaining sets

A simple example of using clojure.set/difference can be found below.

user=> (clojure.set/difference #{1 2 3 4 5} #{1 2} #{3})
#{4 5}

Transforming data in clojure is something I do very often. On many occasions I've had a list of maps and I wanted them indexed by 1 or more values. This is fairly easy to do with reduce and update-in, as the example below demonstrates.

user=> (def jay {:name "jay fields" :employer "drw"})
#'user/jay
user=> (def mike {:name "mike jones" :employer "forward"})
#'user/mike
user=> (def john {:name "john dydo" :employer "drw"})
#'user/john
user=> (reduce #(update-in %1 [{:employer (:employer %2)}] conj %2) {} [jay mike john])
{{:employer "forward"} ({:name "mike jones", :employer "forward"}), 
 {:employer "drw"} ({:name "john dydo", :employer "drw"} 
                    {:name "jay fields", :employer "drw"})}

The reduce + update-in combo is a good one, but clojure.set/index is even better - since it's both more concise and doesn't require you to define an anonymous function. clojure.set/index is also very straightforward to use - from the docs

Usage: (index xrel ks)
Returns a map of the distinct values of ks in the xrel mapped to a set 
        of the maps in xrel with the corresponding values of ks.

The example below demonstrates how you can get very similar results to what is above by using clojure.set/index.

user=> (clojure.set/index [jay mike john] [:employer])
{{:employer "forward"} #{{:name "mike jones", :employer "forward"}}, 
 {:employer "drw"} #{{:name "john dydo", :employer "drw"} 
                     {:name "jay fields", :employer "drw"}}}

It is worth noting that the reduce + update-in example has seqs as values and can contain duplicates, and the clojure.set/index example has sets as values and will not contain duplicates. In practice, this has never been an issue for me.

Another common case while working with collections is finding the elements that are in both collections. Since sets are functions (and can be used a predicates), finding common elements is as simple as the following clojure.

user=> (filter (set [1 2 3]) [2 3 4])
(2 3)

Similar to the clojure.set/difference example, if you have lists or vectors in and you want a seq out, you may want to stick to using filter. However, if you are already working with sets or you can easily convert to sets, you'll probably want to take a look at clojure.set/intersection.

Usage: (intersection s1)
       (intersection s1 s2)
       (intersection s1 s2 & sets)
Return a set that is the intersection of the input sets

To get results similar to the above example, simply call clojure.set/intersection in a similar way to the example below.

user=> (clojure.set/intersection #{1 2 3} #{2 3 4})
#{2 3}

In a codebase I was once working on I stumbled upon the following code, which inverts a map.

user=> (reduce #(assoc %1 (val %2) (key %2)) {} {1 :one 2 :two 3 :three})
{:three 3, :two 2, :one 1}

The code is simple enough, but a single function call is always preferable.

Usage: (map-invert m)
Returns the map with the vals mapped to the keys.

The name of the function should be self-explanatory; however, an example is presented below for completeness.

user=> (clojure.set/map-invert {1 :one 2 :two 3 :three})
{:three 3, :two 2, :one 1}

Another common task I find myself doing while working with clojure is trimming data sets. The following code maps over a list of employees and filters out the employer information.

user=> (def jay {:fname "jay" :lname "fields" :employer "drw"})
#'user/jay
user=> (def mike {:fname "mike" :lname "jones" :employer "forward"})
#'user/mike
user=> (def john {:fname "john" :lname "dydo" :employer "drw"})
#'user/john
user=> (map #(select-keys %1 [:fname :lname]) [jay mike john])
({:lname "fields", :fname "jay"} 
 {:lname "jones", :fname "mike"} 
 {:lname "dydo", :fname "john"})

The combination of map + select-keys gets the job done, but clojure.set gives us with one function, clojure.set/project, that provides us with virtually the same result - using less code.

Usage: (project xrel ks)
Returns a rel of the elements of xrel with only the keys in ks

The example below demonstrates the similarity in functionality.

user=> (clojure.set/project [jay mike john] [:fname :lname])
#{{:lname "fields", :fname "jay"} 
  {:lname "dydo", :fname "john"} 
  {:lname "jones", :fname "mike"}}

Similar to clojure.set/index, you'll want to take note of the result being a set and not a list, and just like clojure.set/index, this isn't something that ends up causing a problem in practice.

The rename and rename-keys functions of clojure.set are very similar, and they can both be helpful when you're passing around data-structures that are similar and simply require a few renames to play nicely with existing code.

Below are a few simple examples of how to get things done without rename and rename-keys.

user=> (def jay {:fname "jay" :lname "fields" :employer "drw"})
#'user/jay
user=> (def mike {:fname "mike" :lname "jones" :employer "forward"})
#'user/mike
user=> (def john {:fname "john" :lname "dydo" :employer "drw"})
#'user/john
user=> (map 
         (fn [{:keys [fname lname] :as m}] 
             (-> m 
                 (assoc :first-name fname :last-name lname) 
                 (dissoc :fname :lname))) 
         [jay mike john])
({:last-name "fields", :first-name "jay", :employer "drw"} 
 {:last-name "jones", :first-name "mike", :employer "forward"} 
 {:last-name "dydo", :first-name "john", :employer "drw"})

user=> (reduce #(assoc %1 ({1 "one" 2 "two"} (key %2)) (val %2)) {} {1 :one 2 :two})
{"two" :two, "one" :one}

The rename & rename-keys functions are very straightforward, and you can find their documentation and example usages below.

Usage: (rename xrel kmap)
Returns a rel of the maps in xrel with the keys in kmap renamed to the vals in kmap

Usage: (rename-keys map kmap)
Returns the map with the keys in kmap renamed to the vals in kmap

user=> (clojure.set/rename [jay mike john] {:fname :first-name :lname :last-name})
#{{:last-name "jones", :first-name "mike", :employer "forward"} 
  {:last-name "dydo", :first-name "john", :employer "drw"} 
  {:last-name "fields", :first-name "jay", :employer "drw"}}

user=> (clojure.set/rename-keys {1 :one 2 :two} {1 "one" 2 "two"})
{"two" :two, "one" :one}

If you've gotten this far, I'll assume you already understand how to use filter. The clojure.set namespace has a function that's very similar to filter, but it returns a set. If you don't need a set, you're better off sticking with filter; however, if you're working with sets, you might save yourself a few keystrokes and microseconds by using clojure.set/select instead.

Below is a the documentation and an example.

Usage: (select pred xset)
Returns a set of the elements for which pred is true

user=> (clojure.set/select odd? #{1 2 3 4})
#{1 3}

The clojure.set/subset? and clojure.set/superset? functions are also functions that are straightforward to use, and probably don't benefit from an example of how to create the same results on your own. However, I will provide the docs and 2 brief examples of their usage.

Usage: (subset? set1 set2)
Is set1 a subset of set2?

Usage: (superset? set1 set2)
Is set1 a superset of set2?

user=> (clojure.set/superset? #{1 2 3} #{2 3})
true
user=> (clojure.set/subset? #{1 2} #{1 2 3})
true

The final function I will document is clojure.set/union. If you needed a list of the unique elements resulting from combining 2 or more lists, you could get the job done with a combination of concat, reduce, and/or set. The example below shows how to do things without using the set function or a set data-structure. note: Using a set would likely be both more efficient and more readable. This example is designed to show that you could do things without sets, but I do not recommend that you code in this way.

(reduce 
  #(if (some (partial = %2) %1) %1 (conj %1 %2)) 
  [] 
  (concat [1 2 1] [2 4 3 1])) 
[1 2 4 3]

Truthfully, I don't tend to think about 'union' unless I'm already thinking about sets. In Clojure, clojure.set/union is defined to take multiple sets and return the union of each of those sets (as you'd expect).

Usage: (union)
       (union s1)
       (union s1 s2)
       (union s1 s2 & sets)
Return a set that is the union of the input sets

Finally, the example below shows the union function in action.

user=> (clojure.set/union #{1 2} #{2 4 3 1})
#{1 2 3 4}

The clojure.set namespace does define one additional function, clojure.set/join. To be honest, I haven't used join in production and I don't believe that I'm writing my own inferior versions within my codebases. So, I don't have an example for you, but I do like the examples on clojuredocs.org and I would encourage you to go check them out: http://clojuredocs.org/clojure_core/1.2.0/clojure.set/join

Monday, September 17, 2012

emacs lisp: removing a lamba hook

disclaimer: I know almost nothing about emacs lisp, so please forgive any mistakes or incorrect assumptions.

I've been using Clojure for over 4 years at this point, but it's generally been on projects that are mostly Java with a few small components in Clojure. Given that context my teammates preferred that we stick with IntelliJ; however, things have recently changed and the emacs journey has begun.

Taking emacs from 'factory settings' to 'impressive' for me was as easy as getting started with emacs-live.

Emacs Live can be summarized as:
a nice structured approach to organising your Emacs config

modular in that functionality is organised by discrete packs

Indeed, Emacs Live does structure your config nicely; however, in it's own words: Emacs Live is also an opinionated set of defaults. I like all of the defaults... except one: rainbow-delimiters

I get that people like rainbow-delimeters, and I would never try to convince you not to use them - they're just not for me, and I needed to find out how to get rid of them.

Emacs Live adds rainbow-delimiters in two different ways. The following code shows how Emacs Live uses add-hook to enable rainbow delimiters for scheme, emacs-lisp, & lisp.

(dolist (x '(scheme emacs-lisp lisp))
  (add-hook 
    (intern (concat (symbol-name x) "-mode-hook"))
    'rainbow-delimiters-mode))

The code required to remove rainbow-delimiters from scheme, emacs-lisp, & lisp is very straightforward, and can be found below.

(dolist (x '(scheme emacs-lisp lisp))
  (remove-hook 
    (intern (concat (symbol-name x) "-mode-hook"))
    'rainbow-delimiters-mode))

As you can see, replacing add-hook with remove-hook will remove the hook that Emacs Live added for me. Since Emacs Live loads my personal settings last, my remove should successfully work every time. It's a best practice that you create hooks that can be run in any order - and this change is obviously order specific; however, I can't think of a way to follow the hook best practice without hacking the emacs-live checkout. Therefore, it seems like this solution is the most pragmatic.

The next snippet of code is how Emacs Live adds rainbow delimiters (& a few other things) to clojure-mode.

(add-hook 'clojure-mode-hook
          (lambda ()
            (enable-paredit-mode)
            (rainbow-delimiters-mode)
            (add-to-list 'ac-sources 'ac-source-yasnippet)
            (setq buffer-save-without-query t)))

The previous hook was easy to remove, since I knew exactly what function I needed to remove. This hook is more of a problem, since it's anonymous. Additionally, this function is defined within Emacs Live code, so simply changing it to a named function isn't an option (since I don't want to modify the emacs-live checkout).

Finding a solution wasn't very hard. The first thing I did was try to get a list of the functions that will be fired by the hook. This is as simple as printing, as the following code shows.

(print clojure-mode-hook)

I put that simple print statement in my .emacs-live.el, restarted emacs, went to the *Messages* buffer, and found the following line printed out.

((lambda nil 
         (enable-paredit-mode)
         (rainbow-delimiters-mode)
         (add-to-list (quote ac-sources) (quote ac-source-yasnippet)) 
         (setq buffer-save-without-query t)))

As you can see, clojure-mode-hook has a list of the functions that have been added via add-hook. With this information, I added the following code, which first removes the existing hook and then adds a new anonymous function with everything previously specified - sans rainbow-delimiters.

(remove-hook 'clojure-mode-hook (first clojure-mode-hook))

(add-hook 'clojure-mode-hook
          (lambda ()
            (enable-paredit-mode)
            (add-to-list 'ac-sources 'ac-source-yasnippet)
            (setq buffer-save-without-query t)))

The above snippet removes my unwanted lambda hook and adds a new hook with everything I do want.

Is this fragile? You bet. If things change in Emacs Live, I'll need to mirror those changes in my .emacs-live.el - which is why it's recommended that you don't rely on ordering when adding and removing hooks. However, given the situation, this seems like a pragmatic solution.

Feedback definitely welcome.