Jay Fields' Thoughts: clojure


Monday, January 27, 2014

REPL Driven Development

When I describe my current workflow I use the TLA RDD, which is short for REPL Driven Development. I've been using REPL Driven Development for all of my production work for a while now, and I find it to be the most effective workflow I've ever used. RDD differs greatly from any workflow I've used in the past, and (despite my belief that it's superior) I've often had trouble concisely describing what makes the workflow so productive. This entry is an attempt to describe what I consider RDD to be, and to demonstrate why I find it the most effective way to work.

RDD Cycle

First, I'd like to address the TLA RDD. I use the term RDD because I'm relying on the REPL to drive my development. More specifically, when I'm developing, I create an s-expression that I believe will solve my problem at hand. Once I'm satisfied with my s-expression, I send that s-expression to the REPL for immediate evaluation. The result of sending an s-expression can either be a value that I manually inspect, or it can be a change to a running application. Either way, I'll look at the result, determine if the problem is solved, and repeat the process of crafting an s-expression, sending it to the REPL, and evaluating the result.
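As a rough illustration of that cycle (the data and function names below are hypothetical and are not taken from the video), each form is something I might send to the REPL on its own, looking at the result before writing the next one.

```clojure
;; Hypothetical data standing in for state inside a running application.
(def orders [{:id 1 :qty 3 :price 2.5}
             {:id 2 :qty 1 :price 10.0}])

;; Step 1: craft an s-expression, send it to the REPL, inspect the value.
(map #(* (:qty %) (:price %)) orders)

;; Step 2: the value looks right, so grow the expression into a function.
(defn order-total
  "Sums qty * price across all orders."
  [orders]
  (reduce + (map #(* (:qty %) (:price %)) orders)))

;; Step 3: evaluate the new definition against the same data and verify.
(order-total orders)
```

Nothing is restarted between steps; the definitions simply accumulate in the running process.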

If that isn't clear, hopefully the video below demonstrates what I'm talking about.

If you're unfamiliar with RDD, the previous video might leave you wondering: What's so impressive about RDD? To answer that question, I think it's worth making explicit what the video is: an example of a running application that needs to change, a change taking place, and verification that the application runs as desired. The video demonstrates change and verification; what makes RDD so effective to me is what's missing: (a) restarting the application, (b) running something other than the application to verify behavior, and (c) moving out of the source to execute arbitrary code. Eliminating those 3 steps allows me to focus on what's important, writing and running code that will be executed in production.

Feedback

I've found that, while writing software, getting feedback is the single largest time thief. Specifically, there are two types of feedback that I want to get as quickly as possible: (1) Is my application doing what I believe it is? (2) What does this arbitrary code return when executed? I believe the above video demonstrates how RDD can significantly reduce the time needed to answer both of those questions.

In my career I've spent significant time writing applications in C#, Ruby, & Java. While working in C# and Java, if I wanted to make and verify (in the application) any non-trivial change to an application, I would need to stop the application, rebuild/recompile, & restart the application. I found the slowness of this feedback loop to be unacceptable, and wholeheartedly embraced tools such as NUnit and JUnit.

I've never been as enamored with TDD as some of my peers; regardless, I absolutely endorsed it. The Design aspect of TDD was never that enticing to me, but tests did allow me to get feedback at a significantly superior pace. Tests also provide another benefit while working with C# & Java: They're the poorest man's REPL. Need to execute some arbitrary code? Write a test, that you know you're going to immediately delete, and execute away. Of course, tests have other pros and cons. At this moment I'm limiting my discussion around tests to the context of rapid feedback, but I'll address TDD & RDD later in this entry.

Ruby provided a more effective workflow (technically, Rails provided a more effective workflow). Rails applications I worked on were similar to my RDD experience: I was able to make changes to a running application, refresh a webpage and see the result of the new behavior. Ruby also provided a REPL, but I always ran the REPL external to my editor (I knew of no other option). This workflow was the closest, in terms of efficiency, that I've ever felt to what I have with RDD; however, there are some minor differences that do add up to an inferior experience: (a) having to switch out of a source file to execute arbitrary code is an unnecessary nuisance and (b) refreshing a webpage destroys any client side state that you've built up. I have no idea if Ruby now has editor & REPL integration; if it does, then it's likely on par with the experience I have now.

Semantics

It's important to distinguish between two meanings of "REPL" - one is a window that you type forms into for immediate evaluation; the other is the process that sits behind it and which you can interact with from not only REPL windows but also from editor windows, debugger windows, the program's user interface, etc.
It's important to distinguish between REPL-based development and REPL-driven development:

REPL-based development doesn't impose an order on what you do. It can be used with TDD or without TDD. It can be used with top-down, bottom-up, outside-in and inside-out approaches, and mixtures of them.
REPL-driven development seems to be about "noodling in the REPL window" and later moving things across to editor buffers (and so source files) as and when you are happy with things. I think it's fair to say that this is REPL-based development using a series of mini-spikes. I think people are using this with a bottom-up approach, but I suspect it can be used with other approaches too.
-- Simon Katz

I like Simon's description, but I don't believe that we need to break things down to two different TLAs. Quite simply, (sadly) I don't think enough people are developing in this way, and the additional specification causes a bit of confusion among people who aren't familiar with RDD. However, Simon's description is so spot on I felt the need to describe why I'm choosing to ignore his classifications.

RDD & TDD

RDD and TDD are not in direct conflict with each other. As Simon notes above, you can do TDD backed by a REPL. Many popular testing frameworks have editor specific libraries that provide immediate feedback through REPL interaction.

When working on a feature, the short term goal is to have it working in the application as fast as possible. Arbitrary execution, live changes, and only writing what you need are 3 things that can help you complete that short term goal as fast as possible. The video above is the best example I have of how you go from a feature request to software that does what you want in the smallest amount of time. In the video, I only leave the buffer to verify that the application works as intended. If the short term goal was the only goal, RDD without writing tests would likely be the solution. However, we all know that there are many other goals in software. Good design is obviously important. If you think tests give you better design, then you should probably mix both TDD & RDD. Preventing regression is also important, and that can be accomplished by writing tests after you have a working feature that you're satisfied with. Regression tests are great for giving confidence that a feature works as intended and will continue to in the future.

REPL Driven Development doesn't need to replace your current workflow; it can also be used to extend your existing TDD workflow.

Tuesday, June 11, 2013

Coding: Increase Your Reading and Writing Speed

A teammate of mine recently expressed a desire for a shortcut for something we type often. I started looking into our shortcut options and came to a common determination: We can do this, but the number of 2 key shortcuts available to us is finite, so we better use them wisely.

I wrote the following unix to give me a rough idea of what we type frequently.

find . -name "*.clj" | xargs cat | tr -s '[:space:]:#()[]{}\"' '\n' | sort | uniq -c | sort -n

note: If you're not writing clojure you'll want to look for something other than .clj files, and you might also want to tweak what you replace with a new line.

The above unix gave me an ordered list of the most typed 'words' across all of my codebases. At this point I had some science for setting up some shortcuts.
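If you'd rather stay in Clojure, roughly the same count can be sketched in a few lines. This is only an approximation of the pipeline above: word-counts is a hypothetical helper, it works on a string you hand it rather than walking the file system, and the delimiter set is copied from the tr call.

```clojure
(require '[clojure.string :as str])

(defn word-counts
  "Splits source text on whitespace and the delimiters : # ( ) [ ] { } and
  the double quote, then returns [word count] pairs ordered from least to
  most frequent (the same ordering as sort | uniq -c | sort -n)."
  [source]
  (->> (str/split source #"[\s:#()\[\]{}\"]+")
       (remove str/blank?)   ; drop the empty token produced by a leading delimiter
       frequencies
       (sort-by val)))

;; Usage: the most frequent word comes last.
(last (word-counts "(defn foo [x] (println (pr-str x)))"))
```

To cover a whole codebase you would feed it the concatenated contents of your .clj files, for example by calling slurp on each one.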

Writing

You'll want to look into whatever editor/ide you use and see if you can find key shortcuts and snippet expansion. My editor is emacs; I assigned some key-chords and some yasnippets. If you're not using emacs you should have something similar in whatever you are using.

While I wanted to define some shortcuts, I also didn't want to create so many that I was constantly wasting time looking up what I'd created. Based on that desire I created:

2 shortcuts (key-chords) for two of the most duplicated words. The shortcuts are concise by design, but that makes them a bit harder to remember. You can probably get started with more than 2, but I didn't see much harm in starting there.
a dozen snippets for the next most used words. These snippets are descriptive enough to easily remember, thus I felt comfortable defining several of them. e.g. pps expands to (println (pr-str )).

Having shortcuts and snippets will obviously make me more productive, and the unix helped me figure out which words were the most important to optimize for.

Reading

Most editors/ides also give you a summary view for common code patterns. For example, IntelliJ displays lambdas when the actual code is an anonymous class. Emacs gives you font-lock for turning patterns into individual characters. Armed with my list of the most common words in my codebase, I created font-locks for the 11 most duplicated.

Sometimes a picture is worth a thousand words. Below is a function definition without any font locks applied.

The following image is what the same function looks like with custom font locks applied. It might be a bit jarring at first, but in the long run it should add up to many small victories such as quickly identifying patterns in code and fewer line breaks.

Results

I've been working with these settings for a little over a week now. I haven't needed to look up anything I defined, and I get a little burst of satisfaction when I read or write something faster than I'd been able to in the past. I'd definitely recommend doing something similar with your codebase and ide/editor.

Thursday, May 16, 2013

Clojure: Combining Calls To Doseq And Let

If you've ever looked at the docs for clojure's for macro, then you probably know about the :let, :when, and :while modifiers. What you may not know is that those same modifiers are available in doseq.

I was recently working with some code that had the following form.
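The original listing isn't reproduced in this text, so here is a minimal sketch of the shape being described: an outer doseq, a let, and an inner doseq. The ids, sub-ids and nested-doseqs names are hypothetical.

```clojure
;; Hypothetical data; only the nesting structure matters here.
(def ids [:a :b])
(def sub-ids [1 2 3])

(defn nested-doseqs
  "Prints every id/sub-id pair using two doseqs with a let in between."
  []
  (doseq [id ids]
    (let [id-name (name id)]            ; evaluated once per id
      (doseq [sub-id sub-ids]
        (println id-name sub-id)))))

(nested-doseqs)
```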

Upon seeing this code, John Hume asked if I preferred it to a single doseq with multiple bindings. He sent over an example that looked similar to the following example.
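Again as a sketch with hypothetical names rather than the code that was actually sent over, a single doseq with multiple bindings might look like this:

```clojure
;; Hypothetical data, matching the shape of the nested version.
(def ids [:a :b])
(def sub-ids [1 2 3])

(defn single-doseq
  "Prints every id/sub-id pair using one doseq with two bindings."
  []
  (doseq [id ids
          sub-id sub-ids]               ; the rightmost binding varies fastest
    (println (name id) sub-id)))        ; name now runs once per sub-id

(single-doseq)
```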

That was actually the first time that I'd seen multiple bindings in a doseq, and my immediate reaction was that I preferred the explicit simplicity of having multiple doseqs. However, I always have a preference for concise code, and I forced myself to start using multiple bindings instead of multiple doseqs - and, unsurprisingly, I now prefer multiple bindings to multiple doseqs.

You might have noticed that the second version of the code slightly changes what's actually being done. In the original version the 'name' function is called once per 'id', and in the second version the 'name' function is called once per 'sub-id'. Calling name significantly more often isn't likely to have much impact on your program; however, if you were calling a more expensive function this change could have a negative impact. Luckily, (as I previously mentioned) doseq also provides support for :let.

The second example can be evolved to the following code - which also demonstrates that the let is only evaluated once per iteration.
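A sketch of what that can look like, with a hypothetical counter added so the number of :let evaluations is observable:

```clojure
;; Hypothetical counter used only to observe how often the :let body runs.
(def let-evaluations (atom 0))

(defn doseq-with-let
  "One doseq, two bindings, and a :let placed between them."
  []
  (doseq [id [:a :b]
          :let [id-name (do (swap! let-evaluations inc)
                            (name id))]
          sub-id [1 2 3]]
    (println id-name sub-id)))

(doseq-with-let)
@let-evaluations   ; 2 after one call: once per id, not once per sub-id
```

Because the :let sits before the sub-id binding, it is re-evaluated only when id changes.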

That's really the final version of the original code, but you can alter it slightly for experimentation purposes if you'd like. Let's assume we have another function we're calling in an additional let and it's expensive; it would be nice if that only occurred when an iteration was going to happen. It turns out that's exactly what happens.
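One way to try this at the REPL is sketched below. The expensive function and its call counter are hypothetical stand-ins; the inner collection is deliberately empty for :a so that no iteration happens for that id.

```clojure
;; Hypothetical "expensive" function that records each call.
(def expensive-calls (atom 0))

(defn expensive [x]
  (swap! expensive-calls inc)
  (str x "!"))

(defn doseq-with-late-let
  "The second :let follows the sub-id binding, so it only runs per iteration."
  []
  (doseq [id [:a :b]
          :let [id-name (name id)]
          sub-id (if (= id :a) [] [1 2])   ; nothing to iterate for :a
          :let [label (expensive sub-id)]]
    (println id-name label)))

(doseq-with-late-let)
@expensive-calls   ; 2 after one call: both calls come from :b
```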

Whether you prefer multiple bindings or multiple doseqs, it's probably a good idea to get comfortable reading both.

Wednesday, May 15, 2013

Emacs Lisp: Font Lock for Clojure's Partial

I love using partial, but I dislike the length of the function name. There's a simple solution: define another function with a shorter name that simply calls (or is) partial. This is exactly what I did in the jry library.
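The library code isn't reproduced in this text. As a minimal sketch of the general idea, using the hypothetical alias p rather than the name the library chose:

```clojure
;; Hypothetical alias: p is simply another name bound to clojure.core/partial.
(def p partial)

(def add-ten (p + 10))

(add-ten 5)   ; 15
```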

I liked the use of % due to partial feeling similar to creating a function using #(), and % having a special meaning inside #(). I thought they tied well together. Unfortunately, there's an obvious problem, things would be very broken if you tried to use the '%' function in an anonymous function defined with #(). Somewhere along the way this issue caused me to stop using jry/%.

Using partial is great: it's part of the standard lib, and I don't need to explain it to anyone who joins my team or any future maintainers of the code I write. Still, I want something shorter, and I've always had a background thread looking for another shorter-than-partial solution. While recently contributing to emacs-live I found the solution I was looking for: clojure-mode font lock.

The following code can now be found in my emacs configuration.

This solution feels like the best of both worlds. My code still uses the function from the standard library, my colleagues still see a function they already know, and 'partial' only takes up one character space in my buffer. The image below is what you'll see if you put the above emacs-lisp in your config.

Tuesday, May 14, 2013

Clojure: Testing The Creation Of A Partial Function

I recently refactored some code that takes longs from two different sources to compute one value. The code originally stored the longs and called a function when all of the data arrived. The refactored version partials the data while it's incomplete and executes the partial'd function when all of the data is available. Below is a contrived example of what I'm talking about.

Let's pretend we need a function that will allow us to check whether or not another drink would make us legally drunk in New York City.

The code below stores the current bac and uses the value when legally-drunk? is called.
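The listing itself isn't reproduced in this text. The sketch below is a reconstruction under assumptions: the function and state names come from the test output later in this entry, while the function bodies, the :bac key and the 0.08 limit are my guesses at the shape.

```clojure
;; Assumed shape: state holds the most recent bac reading under :bac.
(def state (atom {}))

(defn pure-legally-drunk?
  "True when the current bac plus the next drink reaches the assumed 0.08 limit."
  [bac next-drink-bac]
  (>= (+ bac next-drink-bac) 0.08))

(defn update-bac
  "Stores the latest bac reading."
  [bac]
  (swap! state assoc :bac bac))

(defn legally-drunk?
  "Checks the stored bac against the effect of one more drink."
  [next-drink-bac]
  (pure-legally-drunk? (:bac @state) next-drink-bac))
```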

The following (passing) tests demonstrate that everything works as expected.

This code works without issue, but can also be refactored to store a partial'd function instead of the bac value. Why you would want to do such a thing is outside of the scope of this post, so we'll just assume this is a good refactoring. The code below no longer stores the bac value, and instead stores the pure-legally-drunk? function partial'd with the bac value.
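A sketch of the refactored shape, under the same assumptions as before (the :legally-drunk?* key and the function names appear in the test output later in this entry; the bodies and the 0.08 limit are assumed):

```clojure
(def state (atom {}))

(defn pure-legally-drunk?
  "True when the current bac plus the next drink reaches the assumed 0.08 limit."
  [bac next-drink-bac]
  (>= (+ bac next-drink-bac) 0.08))

(defn update-bac
  "Stores a function that already knows the bac, instead of the bac itself."
  [bac]
  (swap! state assoc :legally-drunk?* (partial pure-legally-drunk? bac)))

(defn legally-drunk?
  "Calls the stored partial'd function with the remaining argument."
  [next-drink-bac]
  ((:legally-drunk?* @state) next-drink-bac))
```

The state now holds an opaque function, which is why a test that inspects the stored value can no longer compare against a plain number.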

Two of the three tests don't change; however, the test that was verifying the state is now broken.

note: The test output has been trimmed and reformatted to avoid horizontal scrolling.

In the output you can see that the test is failing as you'd expect, due to the change in what we're storing. What's broken is obvious, but there's not an obvious solution. Assuming you still want this state based test, how do you verify that you've partial'd the right function with the right value?

The solution is simple, but a bit tricky. As long as you don't find the redef too magical, the following solution allows you to easily verify the function that's being partial'd as well as the arguments.
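The trick, visible in the unit-level test output later in this entry, is to redef partial to vector so that the stored "function" becomes plain data you can compare. The sketch below uses a bare with-redefs and = rather than the expectations syntax, and the function bodies are assumed.

```clojure
(def state (atom {}))

(defn pure-legally-drunk?
  [bac next-drink-bac]
  (>= (+ bac next-drink-bac) 0.08))   ; assumed 0.08 limit

(defn update-bac
  [bac]
  (swap! state assoc :legally-drunk?* (partial pure-legally-drunk? bac)))

;; With partial temporarily bound to vector, update-bac stores
;; [pure-legally-drunk? 0.04] instead of an opaque function.
(def stored
  (with-redefs [state (atom {})
                partial vector]
    (update-bac 0.04)))

(= {:legally-drunk?* [pure-legally-drunk? 0.04]} stored)
```

This relies on update-bac reaching partial through its var, which is the default behavior for code compiled without direct linking.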

Those tests all pass, and should provide security that the legally-drunk? and update-bac functions are sufficiently tested. The pure-legally-drunk? function still needs to be tested, but that should be easy since it's a pure function.

Would you want this kind of test? I think that becomes a matter of context and personal preference. Given the various paths through the code the following tests should provide complete coverage.

The above tests make no assumptions about the implementation - they actually pass whether you :use the 'original namespace or the 'refactored namespace. Conversely, the following tests verify each function in isolation and a few of them are very much tied to the implementation.

Both sets of tests would give me confidence that the code works as expected, so choosing which tests to use would become a matter of maintenance cost. I don't think there's anything special about these examples; I think they offer the traditional trade-offs between higher and lower level tests. A specific trade-off that stands out to me is identifying defect localization versus having to update the test when you update the code.

As I mentioned previously, the high-level-expectations work for both the 'original and the 'refactored namespaces. Being able to change the implementation without having to change the test is obviously an advantage of the high level tests. However, when things go wrong, the lower level tests provide better feedback for targeting the issue.

The following code is exactly the same as the code in refactored.clj, except it has a 1 character typo. (It's not necessary to spot the typo; the test output below will show you what it is.)

The high level tests give us the following feedback.

failure in (high_level_expectations.clj:14) : expectations.high-level-expectations
(expect
 true
 (with-redefs
  [state (atom {})]
  (update-bac 0.01)
  (legally-drunk? 0.07)))

           expected: true 
                was: false

There's not much in that failure report to point us in the right direction. The unit-level-expectations provide significantly more information, and the details that should make it immediately obvious where the typo is.

failure in (unit_level_expectations.clj:8) : expectations.unit-level-expectations
(expect
 {:legally-drunk?* [pure-legally-drunk? 0.04]}
 (with-redefs [state (atom {}) partial vector] (update-bac 0.04)))

           expected: {:legally-drunk?* [# 0.04]} 
                was: {:legally-drunk?** [# 0.04]}
 
           :legally-drunk?** with val [# 0.04] 
                             is in actual, but not in expected
           :legally-drunk?* with val [# 0.04] 
                            is in expected, but not in actual

The above output points us directly to the extra asterisk in update-bac that caused the failure.

Still, I couldn't honestly tell you which of the above tests that I prefer. This specific example provides a situation where I think you could convincingly argue for either set of tests. However, as the code evolved I would likely choose one path or the other based on:

how much 'setup' is required for always using high-level tests?
how hard is it to guarantee integration using primarily unit-level tests?

In our examples the high level tests require redef'ing one bit of state. If that grew to a few pieces of state and/or a large increase in the complexity of the state, then I may be forced to move towards more unit-level tests. A rule of thumb I use: If a significant amount of the code within a test is setting up the test context, there's probably a smaller function and a set of associated tests waiting to be extracted.

By definition, the unit-level tests don't test the integration of the various functions. When I'm using unit-level tests, I'll often test the various code paths at the unit level and then have a happy-path high-level test that verifies integration of the various functions. My desire to have more high-level tests increases as the integration complexity increases, and at some point it makes sense to simply convert all of the tests to high-level tests.

If you constantly re-evaluate which tests will be more appropriate and switch when necessary, you'll definitely come out ahead in the long run.

Thursday, May 02, 2013

Emacs Lisp: Toggle Between a Clojure String and Keyword

When I was doing a fair bit of Ruby I often used TextMate's shortcut (Ctrl+:) to convert a Ruby String to a Symbol or a Ruby Symbol to a String. It's something I've periodically missed while doing Clojure, and yesterday I found myself in the middle of a refactoring that was going to force the conversion of 5+ Clojure Keywords to Strings.

The following emacs lisp is my solution for toggling between Clojure Strings and Keywords. The standard disclaimers apply - it works on my machine, and I've never claimed to know emacs lisp well.

A quick video of the behavior:

Tuesday, April 30, 2013

Year Five

The average lifespan for a software engineering job is 4 years. Okay, I've never actually seen proof (or contradiction), but that's the general feeling in the groups I associate with. Perhaps that's selection bias - my employer has generally changed on year 3 or 4. Perhaps this is the exception and not the rule, in that case feel free to simply read this as an experience report. However, I do think it's somewhat common for developers to leave around year 3 or 4. This entry contains speculation on why they leave, and offers one idea on what employers can do to break that cycle.

My 4 year employment cycle generally looks like this

Year One: "I'm in over my head. My semi-bluff was in fact a bluff. They're going to fire me any day."
Year Two: "It's nice to feel like a productive team member"
Year Three: "This is fun, and I'm not bad at it. It's satisfying to pass on knowledge to teammates."
Year Four: "This feels repetitive, that grass over there sure looks greener"

I expect that I, like many programmers, probably undervalue my contribution in the early days and overvalue my contribution in the latter days.

In Year Three and Four at DRW I spent some time thinking about how I felt, and observing the behavior of some colleagues that were also on year three and four. A few things stood out to me.

A company you don't work at always seems to have infinite possibilities; however, after a few years with an employer, it's extremely clear what your options are. More importantly, it's very clear what limitations will likely always be there.
A company you don't work at contains no code you're responsible for. Conversely, any company you've been with for 4 years probably has plenty of code you're not proud of. If you're responsible for that code, it's a constant reminder of your previous limitations. If you're not responsible for it, your co-workers aren't likely to let you forget about it anytime soon.
There's always someone willing to pay you more than you're worth. After several years with a company it's likely that they're going to pay you what you're worth, but not what some other company thinks you're worth. I'm surprised that more companies don't pay (the employees they want to keep) what their "flawless market value" would be. In other words, what would you pay them if they interviewed, you determined what they knew, you determined what value they would bring, and you were completely ignorant of their flaws? That's what your competition is likely doing. That's what you're fighting against if you want to keep them around.
A new job often offers a new challenge. Once you feel like you've given that challenge your best shot, what remains? If you did a great job, it's likely that you'll have plenty of other options. However, if you've done a good job, you may be stuck in a spot where there aren't as many open doors and challenges to choose from - not nearly as many as a position at another company will appear to offer.

I was recently in Punta Cana for a wedding, and I was on the beach - working on my laptop. My wife asked: don't you want some time off? My response was short and immediate: no. Later that evening my wife and I discussed my work situation. I observed that I'm in Year Five at DRW and I'm happy, happier than ever, strange - given my previous experiences. She asked if I thought that I was working too much, and if I thought that I would burn out. I remarked: I'd rather have a job that I love, that I don't like to be away from, than a job where I feel like I need a week or two off.

I hear you, nice work if you can get it. I don't have a general recipe for getting there, but I know how I got there.

Back in 2009 I interviewed at DRW. At the time I was working for ThoughtWorks, and my client was Forward. I considered the founder of Forward to be a friend and someone I would gladly work for. I decided it was time to leave ThoughtWorks (after 3.5 years), and I was sure that Forward would be my future home. I remarked to my DRW recruiter "H" (who also happened to be a friend from my ThoughtWorks days) that one of the best things about Forward was knowing that I liked and trusted the man who ran Forward. H said nothing, but made a brilliant move.

In my interview I was grilled, killed even, and then things turned. I met with a guy who asked me a few questions and then told me about the company: the vision, the people, and where I could fit in. He was smart, easy to talk to, and someone I related to. We discussed things casually; it didn't feel like a company pitch in any way at all, it felt like small-talk - something I was very grateful for after the beating I'd taken earlier in the day. After everything concluded I hit the bar with my friends, including H. At that point they revealed to me that the guy I'd met was the partner at the firm that was (among other things) responsible for the firm's technology. I'd also met the CTO, and various other people responsible for technology in the firm. H had shown me that DRW, just like Forward, had what I like to call Awesome All the Way Up.**

Awesome All the Way Up has served me very well at DRW. To this day I remain in fairly common contact with the CTO and several of DRW's partners. About 6 months ago I asked 3 favors. First of all, I asked for enough money to pay someone's salary for 6 months. I identified a project that I wanted to undertake, and I needed help to complete it. Then things got unconventional: I asked if I could create a contract-to-hire situation. Even more unconventional, I pursued a friend and previous colleague who lived in Austin, Texas. DRW rarely uses contractors, and has no other remote employees that I'm aware of. An appropriate number of questions were asked, but in the end my request was granted.

The experiment is on-going, but I'm very happy with our progress so far. That's all well-and-good, but the support of DRW is the important aspect of the story. I'm confident that their support of my unconventional requests was a major factor in ensuring my happiness in Year Five. We recently hired John Hume, thus declaring success at some level already. However, if things had gone poorly, both parties could have gone their separate ways with little lost and lessons learned. More importantly to me, DRW would have continued to give me confidence that they were willing to take chances to provide me with opportunities and ensure my continued happiness at the firm.

There's a similar discussion around DRW allowing me to use Clojure as my primary development language. I'll spare you the long version. tl;dr: They gave me a reasonable amount of space to try something new, and supported me appropriately as we found more and more success.

Not all of my experiments are green-lighted, and I've also had unsuccessful outcomes. DRW has done a good job of not setting me up to fail; my ideas that have a low probability of succeeding are fleshed out and appropriately shot down. All experiments have risk measures put in place, limited downside, and are reassessed constantly. It's great to have support when things are going well, and it's essential to have support when things don't go as planned.

For me, that's been the secret for keeping me around more than 4 years: An appropriate amount of trust and a willingness to experiment.

A foreign thought also recently came to mind. For the first time in my life I can say that I see myself happy and successful at my current employer in 10 years. This is a question I've asked many people since it occurred to me. To date, Ade Oshineye (http://www.oshineye.com/) is the only person who's responded affirmatively. The results aren't surprising to me, but I do wonder why more employees and employers aren't looking for ways to extend relationships.

Perhaps the secret for keeping me around isn't more broadly applicable; however, simply asking what will keep an individual around is probably the more important message in this entry. It's good to know what will make someone happy now, but it seems like it's equally important to know what will make them happy in the long term. I suspect the answers will be at least a little, if not very different.

The way things currently stand, I'm looking forward to writing about Year Six.

** DRW became my home in the end; however, Forward continues to do well. I suspect Awesome All the Way Up would have ensured happy and gainful employment at either destination. I remain in regular contact with my friends at Forward.

Tuesday, April 02, 2013

Emacs Lisp: Find Java Sources

Confession: I really hope someone can tell me I'm doing this wrong. I can't believe there isn't an easier way.

I work with Clojure, in Emacs, almost every day. Navigating the source is usually fairly easy. If I want to navigate to a function definition, all I need to press is M-., and if I want to navigate back, M-, does the trick. This works for Clojure that I've written, as well as Clojure that lives in the libraries that I reference. That's fine the vast majority of the time, but occasionally I need to navigate to the Java source of some library I'm using. This is where I can't believe that no one else has solved this problem.* If I'm in a clj file and my cursor is on a Java class, I don't know of any way to easily navigate to the class definition.

Context: I use fig for dependency management (at work). I have my projects set to put sources in ./the-project/lib/sources; therefore, the following solution assumes the sources are in that directory. If you use Lein (for deps), I'm sure the sources are on your local drive somewhere. All you need to do is change the lisp below to point to your source dir.

Additionally, I use Lein, so I have a project.clj at the root of my project; however, there's nothing special or required about "project.clj". You could just as easily put a this.is.a.project.root file in the root of your project, and search for that file.

The following code will search your source jars for a Java class, and open the first match that it finds (or no match, if no match is found).

Disclaimer: I'm still an emacs lisp beginner.

The previous code is pretty straightforward. Line 3 uses expand region to mark whatever Java class my cursor is currently on or near. You could type the word instead if you like, this page should help you understand how.
Line 4 and 6 find and verify where the project root lives.
Line 8 (and 9) greps for a string (the Java class) in a directory (my source dir).
Line 10 switches to the grep results.
Line 11 sleeps, waiting for the grep results to return.
Line 12 searches forward, looking for the first match.
Line 13 opens the jar of the first match.
Line 14 assigns the current point to a var named 'current-point'.
Line 15 searches forward to the end of the jar name.
Line 16 grabs the name of the jar and switches to that buffer.
Line 17 searches the jar's dired buffer for the name of the class.
Line 18 opens the first class name match.
(Line 19 lets you know if your project root could not be determined)

That's it, you can now easily find java source (with the occasional conflicting name annoyance). It's not pretty, but it gets the job done.

* I've been told that a few Java modes are good, if I can easily use those to navigate from my Clojure to Java, please leave a link to a manual I can dig into. I assume there are etags solutions, but it's not clear what the best way to go is. I'm sure there's an easy solution for navigating Java from Clojure, I'm just having a hard time finding it.

Tuesday, March 19, 2013

Clojure: Expectations Interaction Tests For Java Objects

I recently ran into some code that forced me to integrate with a Java library. While using the library I found myself wanting to do a bit of interaction testing, which I've historically done with Mockito. As a result, I added the ability to do interaction based tests on mock Java objects, directly in expectations.

Hopefully the code is what you'd expect.

The previous example creates a mock Runnable in an expect-let, expects the run method to be run, and then calls the run method of the mock. This test is worthless in a real world context, but it's the simplest way to demonstrate the syntax for creating a mock & specifying the interaction.

The mock function is defined in erajure, a minimal wrapper around mockito. All of the "times" arguments are the same as what's available for function interaction tests; examples can be found here.

Tuesday, February 26, 2013

Synchronizing Snapshots and Incrementals With Single Threading

Code available on: https://github.com/jaycfields/snapshot-incremental-synchronize

Many of the applications that I write these days have a lot of data - so much that there's no reasonable way to continually send all of it. Instead, most of the applications I work with will have the ability to receive a snapshot of the current state, and the ability to receive deltas (incrementals) that must be applied to the previous snapshot. To further complicate things, incomplete data is unacceptable and ordering matters. This type of environment breeds many solutions for synchronizing snapshots and incrementals. This entry is about using single threading (via jetlang) for synchronization and guaranteed accuracy.

Let's take a very simple example: you have two processes, a client and a server. The server has a list and the client needs to display that list - completely and in order. The list on the client also needs to be updated whenever the list on the server is updated.

There are several issues that you could encounter in a multithreaded environment.

If you request a snapshot and then start listening to incrementals, you may miss data that isn't in the snapshot, but was broadcast before you started listening to incrementals.
If you start listening to incrementals and request a snapshot at the same time, you may apply an incremental to the snapshot, even though the snapshot already reflects the incremental.
If you start listening to the incrementals first, you'll need some way to throw away the incrementals that are already reflected in the snapshot.

It's time to get into some code.

Here's some simple server code.

The above code contains a server-list, which is a list that represents the ordered random numbers being generated on the server side. Our task is to mirror this list in our client. The appending scheduled task and appending fiber are stored to allow for easy starting and stopping of appending. The server-start and server-stop functions are provided for convenience, should you choose to run this example locally.

The subscriber atom and the subscribe function are a simple way for a client to subscribe to snapshots and incrementals. The publish-to-client function derefs a fn and immediately calls it with a snapshot or incremental. In a prod application, publish and subscribe logic would probably involve a socket or messaging system - our solution is purposefully naive, to focus on the point of the post: synchronization.

The get-snapshot function publishes the current state of the server-list to a client. The append-to-list function removes elements so it's easy to see the server-list changing without the data growing to an unmanageable size; in prod this would (likely) not exist. However, the rest of the code in append-to-list is fairly representative of a common practice - generate a delta, apply it to the local list and publish it out to clients.

Looking at this code, it's easy to see that one fiber is appending to the list and publishing to the client, while another fiber would return the value of get-snapshot. This code can work, but the way it's currently written data accuracy cannot be guaranteed.

Let's look at some client code.

The client-start function subscribes to server updates, and then requests a snapshot. The handle update function resets a client-list on snapshot and conjs an incremental to the existing list. (note: the client list is kept at 10 elements for simplicity, just like the server - I would not expect this type of code to be in prod).
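A minimal sketch of the update handling described above, for readers without the repository at hand. The message shape (a `[kind payload]` vector with `:snapshot` and `:incremental` kinds) and the names are assumptions made for illustration, not necessarily what the linked code uses.

```clojure
;; Hypothetical message shape: [:snapshot [1 2 3]] or [:incremental 4].
(def client-list (atom []))

(defn handle-update
  "Replaces client-list on a snapshot; appends on an incremental,
  keeping only the 10 most recent elements."
  [[kind payload]]
  (case kind
    :snapshot    (reset! client-list (vec payload))
    :incremental (swap! client-list
                        (fn [xs] (vec (take-last 10 (conj xs payload)))))))

(handle-update [:snapshot [1 2 3]])
(handle-update [:incremental 4])
;; @client-list is now [1 2 3 4]
```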

Below is a full snapshot of the current code.

The client and server code is the same as above, but this example also contains some function calls in a comment. At this point you can paste this code into your favorite editor, start the client and the server and inspect both lists. The updates are infrequent enough that you can even compare the two lists, and it's highly likely that they are equal.

For a lot of problems this code may be sufficient; however, as we noted above, there is definitely an opportunity for you to see invalid state. With this specific code the append fiber could update the atom with an incremental X, get-snapshot (on the main fiber) could deref a snapshot with X included and publish it, and then the append fiber could also publish the incremental X. Luckily there's a simple solution: publish the snapshot, update the server-list, and publish the incrementals all on the same fiber.

The code below shows how easy it is to create a jetlang fiber and execute an anonymous function.

As you can see, very little changed with the code. We've defined another fiber, synchro-fiber, which we will use to single thread our updates to server-list and our publishes to the client. The synchro-fiber will execute the runnables (in our example, anonymous functions) that are put on its queue, in order. The bodies of get-snapshot and append-to-list were slightly modified to call the execute function with their previous body as an anonymous function. Other technical differences are also true - the code isn't immediately run, it's no longer blocking, and the return value has been altered. While all of these observations are true, they are irrelevant with respect to what we were trying to accomplish.
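The same idea can be sketched without jetlang. The example below swaps in a JDK single-thread executor as a stand-in for the synchro-fiber, and an atom named `published` as a stand-in for the client connection; both are assumptions for illustration, and the message shape is hypothetical. The point is only that updates and publishes share one queue, so a snapshot can never disagree with the incrementals around it.

```clojure
(import '(java.util.concurrent Executors ExecutorService TimeUnit))

(def server-list (atom []))
(def published (atom []))            ; stand-in for the client connection

(defn publish-to-client [msg] (swap! published conj msg))

;; One thread drains the queue in submission order, like the synchro-fiber.
(def ^ExecutorService synchro (Executors/newSingleThreadExecutor))

(defn get-snapshot []
  (.execute synchro (fn [] (publish-to-client [:snapshot @server-list]))))

(defn append-to-list [n]
  (.execute synchro (fn []
                      (swap! server-list conj n)
                      (publish-to-client [:incremental n]))))

(append-to-list 1)
(append-to-list 2)
(get-snapshot)
(append-to-list 3)
(.shutdown synchro)
(.awaitTermination synchro 5 TimeUnit/SECONDS)
;; @published is now
;; [[:incremental 1] [:incremental 2] [:snapshot [1 2]] [:incremental 3]]
```

A client that ignores incrementals until the first snapshot, and then applies every later message in order, ends up with exactly the server's list.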

Using jetlang fibers we've accomplished our goal - we can guarantee that snapshots and incrementals will be easy to synchronize (without sequence ids), accurate, and in order. Of course, you'll need to consume both of these messages on a single fiber as well, but that should be equally easy to accomplish.

Tuesday, January 15, 2013

Clojure: Expectations Verify Interaction Args

The expectations framework provides the ability to create interaction (or behavior) based tests. I've previously written about adding interaction based testing to expectations; however, the examples from that blog entry focused exclusively on testing interactions where each argument is matched using equality. In this entry I'll give examples of how each argument can also be verified using a class, regex, exception, or a custom function.

When writing state based tests using expectations the type of test you're writing is inferred from the expected value. If the expected value is a regex, expectations will test the actual value to see if it matches the regex. If you passed in a class, expectations will test the actual value to see if it's an instance of that class. If you passed in an exception... you get the idea. All of what I said above is also true for arguments of an interaction.
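To make the inference concrete, here is a rough sketch of dispatching on the type of the expected value, using only clojure.core. It is not the expectations implementation; `arg-matches?`, `args-match?`, and `true-or-nil?` are hypothetical names, and the exception case is left out.

```clojure
(defn arg-matches?
  "Decides how to compare based on what kind of thing `expected` is."
  [expected actual]
  (cond
    ;; regex: the actual value must be a string containing a match
    (instance? java.util.regex.Pattern expected)
    (and (string? actual) (some? (re-find expected actual)))

    ;; class: the actual value must be an instance of it
    (class? expected) (instance? expected actual)

    ;; custom function: treated as a predicate
    (fn? expected) (boolean (expected actual))

    ;; anything else: plain equality
    :else (= expected actual)))

(defn args-match?
  "True when both arg lists have the same length and every pair matches."
  [expected-args actual-args]
  (and (= (count expected-args) (count actual-args))
       (every? true? (map arg-matches? expected-args actual-args))))

(defn true-or-nil? [x] (or (true? x) (nil? x)))

(args-match? [#"/tmp/" String :append true-or-nil?]
             ["/tmp/hello-world" "some data" :append true])
;; => true
```

Note that keywords such as `:append` fall through to equality, because `fn?` is false for keywords.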

Let's start with a simple interaction based test:

In the example above, we're calling the spit function with exactly the arguments that we've specified in our test. This test will pass; however, we've had to specify the exact file location and the exact data. If for some reason you can't specify exactly what the argument will be, it's nice to have a way to specify as much as you possibly can.

In the example below, we're still specifying the exact data, but we're only verifying that the file is somewhere in /tmp/.

As I previously mentioned, we can also get more general and only verify the class of an argument. For example, if we knew our data was going to be a String, but we didn't want to specify exactly what that string was, the following test would do the trick.

While expectations provides you with a lot of default options, there are times when you'll want to write your own argument "matcher". As a contrived example, let's pretend that we want to test that the last argument is true or nil.

One of the best features of expectations is its error reporting, and the same error reporting logic is applied to arguments when an interaction based test fails. Given the example above, you'll get the following error message.

failure in (success_examples.clj:204) : success.success-examples
           expected: (spit #"/tmp/" String :append true-or-nil?) 
                got: 0 times 

           -- got: (spit "/tmp/somewhere-else" "nil")
           "nil", "/tmp/somewhere-else" are in actual, but not in expected
           true_or_nil_QMARK, #"/tmp/", :append, String are in expected, but not in actual
           expected is larger than actual 

           -- got: (spit "/tmp/hello-world" "some data" :append "s")
           - arg4: not true or nil

As you can see both calls are reported, and each argument has a detailed report (if it did not match).

Finally, expectations provides an additional function that can be used to verify that certain key/value pairs are in an argument. The following example doesn't really make sense, since you'd never want to pass a map as the last argument to spit, but it's easy to follow in the context of this blog entry.

In the above example, (contains-kvs) is used to verify that the final argument to spit contains the key/value pairs :a :b :c :d.
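The check itself is simple to picture. The sketch below is a plain clojure.core approximation of "the argument contains these key/value pairs"; `contains-kvs?` is a hypothetical helper and is not the library's contains-kvs.

```clojure
(defn contains-kvs?
  "True when `actual` is a map holding every key/value pair in `expected-kvs`.
  Extra keys in `actual` are ignored."
  [expected-kvs actual]
  (and (map? actual)
       (= expected-kvs (select-keys actual (keys expected-kvs)))))

(contains-kvs? {:a :b, :c :d} {:a :b, :c :d, :e :f})
;; => true
```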

I hope that interaction arg matching follows the principle of least surprise, since it behaves the same as expectations state based tests. I also hope that the ability to use an arbitrary function for verification will provide any necessary flexibility. If you're using expectations, give it a try and let me know.

Tuesday, January 08, 2013

Clojure: Expectations Interactions - Interactions Are Code, Interactions Are Data

If you read my blog you've probably heard "code is data, data is code" and at one time you've looked up homoiconicity. You may have deeply understood the idea the first time you heard it; I definitely did not. However, a recent addition to expectations opened my eyes to how truly powerful this programming language property can be.

I'll start by admitting what I heard when I originally encountered homoiconicity. Stuart Halloway had begun promoting Clojure, and homoiconicity was one of the advantages he noted. I hit the wikipedia page, digested the words "code is data, data is code", and thought to myself: well, yeah, obviously. I'd spent plenty of time working with DSLs in Ruby, and I had plenty of experience evaluating code in various contexts. I thought something along the lines of: So you capture the code as data and evaluate it wherever it makes sense, I don't see the big deal. In short, I didn't get it.

Fast forward a few years and several hours of full time Clojure development and you'll find me adding interaction based testing to expectations. What I had in mind for testing interactions was simple, I want to write exactly the same thing for the test as what I write for the production code. Additionally, I want the format of the test to follow the same format that is used for state based testing: (expect expected actual)

Once I had a clear vision for my requirements, the format of the tests became easy to visualize. Assume I have a function that prints to standard out, and I want to test that this print occurs.

The above test looks great, but (println 5) will be evaluated, return nil, and use nil as the expected value. I needed some way for the programmer to tell the testing framework that this was an interaction test, and expectations needed to verify that the function was called with the specified parameters. After trying a few different formats, I settled on the following solution.

By wrapping the interaction I wanted to test with (interaction ...), I created an easy way to identify and capture the function and arguments that needed to be verified.

Once I'd decided on the syntax, I went about the task of adding support to expectations. If you dug into the implementation of expectations, you'd find that expect is a macro that delegates the handling of the "expected" and "actual" arguments to the doexpect macro. The first thing the doexpect macro does is check if expected is a list and (if so) if the first argument is the symbol "interaction" (source here). If the first argument is not a list that begins with 'interaction, then the data is passed to do-value-expect and expanded more or less as is. However, if the first argument is a list that begins with 'interaction, then the data is passed to do-interaction-expect, and do-interaction-expect then destructures the data, grabbing only the pieces of the list that it cares about (source here). When I wrote this code, I found it very interesting.

When I envisioned the interaction syntax, I assumed that (interaction ...) would be a call to a macro, and I would need to manipulate the data passed to interaction. However, once I got into the actual implementation, I found myself using the symbol "interaction", but never actually defining a macro or even a function. That's when homoiconicity really started to become clear to me. I'd written code that I was sure would need an implementation, yet it was used exclusively as data.

If you kept digging into this example you would find that anything found within (interaction ...) is never used as written, but is instead expanded in a way that allows expectations to rebind the specified function and use the expected arguments at verification time. As a result, you write the same code in the same way but within your test it's used exclusively as data and in your production code it's used exclusively as code. I'm a big fan of convention, and there's no better convention than 'use the exact same thing'.
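A stripped-down illustration of the idea, assuming nothing from expectations itself: the hypothetical macro below inspects its unevaluated argument, and treats a list that starts with the symbol `interaction` as data to destructure. Note that `interaction` is never defined anywhere as a function or macro.

```clojure
(defmacro describe-expected
  "Hypothetical macro: looks at the expected form before it is evaluated."
  [expected]
  (if (and (seq? expected) (= 'interaction (first expected)))
    ;; (interaction (f arg ...)) is pulled apart as data, never called
    (let [[_ [f & args]] expected]
      `{:kind :interaction, :fn '~f, :args [~@args]})
    ;; anything else is expanded as is, and evaluated as ordinary code
    `{:kind :value, :value ~expected}))

(describe-expected (interaction (println 5)))
;; => {:kind :interaction, :fn println, :args [5]}   (nothing is printed)

(describe-expected (+ 1 2))
;; => {:kind :value, :value 3}
```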

I later added the ability to add interaction tests for calls to Java objects as well, which led to the following behavior for expectations.

If your expected value is not an interaction, it will be expanded as is.
If your expected value is an interaction with a Clojure function, it will be used as data exclusively and expanded to rebind the function, capture all calls to the function and verify that a call occurred with the arguments you specified.
If your expected value is an interaction with a Java method, it will be used as data exclusively and expanded to mockito setup and verification code.

Thus, an expected value is sometimes code, and sometimes data.

Wednesday, November 14, 2012

Clojure: Converting scenarios With Interleaved expect Calls To Bare expectations

Since I've deprecated scenarios, I went through all of my projects and removed any usages of expectations.scenarios. For the most part the conversion was simple; however, I did run into one instance where the scenario contained interleaved expectations.

The following code is an example of a scenario with interleaved expectations.

In the previously linked blog entry I recommend using a clojure assert to replace the interleaved expectations. That solution works, but I found an additional approach that I wanted to share.

When I encountered code similar in structure to the code above, I immediately envisioned writing 3 expectations similar to what you find below.

note: for my contrived example the first two tests could have been written without the let; however, the tests from my codebase could not - and I believe the blog entry is easier to follow if the tests are written in the way above.

While these tests verify the same expectations, the way that they are written doesn't convey to a test maintainer that they relate to each other more than they are related to the other tests within the file. While pondering this complaint, I grouped the tests in the following way more as a joke than anything else.

I would never actually use given simply to group code; however, grouping the code together did cause me to notice that there was a usage of given that would not only keep the code grouped, but it would also allow me to test what I needed with less code.

The following example is very similar in structure to the finished product within my codebase.

The above example verifies everything that the original scenario verified, does not use a scenario, and conveys to a maintainer that related logic is being tested within all three tests - in short: this felt like the right solution.

Tuesday, November 06, 2012

Clojure: Deprecating expectations.scenarios

I previously mentioned:

The functionality in expectations.scenarios was borne out of compromise. I found certain scenarios I wanted to test, but I wasn't sure how to easily test them using what was already available in (bare) expectations. The solution was to add expectations.scenarios, and experiment with various features that make testing as easy as possible.

Truthfully, I've never liked scenarios - I've always viewed them as a necessary evil. First of all, I hate that you can't mix them with bare expectations - this leads to having 2 files or 2 namespaces in 1 file (or you put everything in a scenario, meh). You either can't see all of your tests at the same time (2 files), or you run the risk of your tests not working correctly with other tools (expectations-mode doesn't like having both namespaces in 1 file). Secondly, I think they lead to sloppy tests.

The second complaint causes me to get on my soap-box about test writing, but never motivated me to do anything. However, as expectations-mode has become more integral to my workflow, the first issue caused me to make a change.

As of 1.4.17 you should be able to write anything that you would usually write in a scenario in a bare expect instead.

I've already published several blog entries that should help if you're interested in migrating your scenarios to bare expectations.

One feature that is noticeably missing from bare expectations is the stubbing macro. I decided to leave the stubbing macro out as I believe it's just as intention revealing to use with-redefs & constantly, and I always prefer to use core functions when possible.

If you were previously using stubbing, your test can be converted in the following way.

(stubbing [a-fn true]
  (do-work))

;;; can now be written as
(with-redefs [a-fn (constantly true)]
  (do-work))

A nice side effect of removing stubbing is the reduction of indentation if you are using both stubbing and with-redefs. This seems like the right trade-off for me (less indenting, relying on core functions that everyone should know); however, I'm not against adding stubbing again in the future if it becomes a painfully missing feature.

There is one type of scenario that I haven't yet addressed, interleaved expectations. I found zero of these types of scenarios in my codebases; however, I'm addressing these types of scenarios here for completeness.

(scenario
  (do-work)
  (expect a b)
  (do-more-work)
  (expect c d))

Any scenario that has interleaved expectations can be converted in the following way:

(expect c
  (do
    (do-work)
    (assert (= a b))
    (do-more-work)
    d))

expectations 1.4.17 still has support for scenarios, so you can upgrade and migrate at your own pace. I'll likely leave scenarios in until the point that I change some code that breaks them, then I'll remove them. Of course, if you prefer scenarios, you're welcome to never upgrade, or fork expectations.

If you run into issues while converting your scenarios, please open an issue on github: https://github.com/jaycfields/expectations/issues?state=open

Monday, November 05, 2012

Clojure: Using given & expect To Replace scenarios

The functionality in expectations.scenarios was borne out of compromise. I found certain scenarios I wanted to test, but I wasn't sure how to easily test them using what was already available in (bare) expectations. The solution was to add expectations.scenarios, and experiment with various features that make testing as easy as possible.

Two years later, the features that make sense have migrated back to expectations:

With those features, you should be able to convert any existing scenario to a bare expectation. What isn't covered with those features is what you should do if your scenario ends with multiple expects. This blog entry demonstrates how you can use given with a bare expectation to achieve the same test coverage.

Below is an example of a scenario that ends with multiple expects.

Using given, these scenarios are actually very easy to convert. The given + bare expectation example below tests exactly the same logic.

The test coverage is the same in the second example, but it is important to note that the let will now be executed 3 times instead of 1. This isn't an issue if your tests run quickly; if they don't, you may want to revisit the test to determine if it can be written in a different way.

An interesting side-effect occurred while I was converting my scenarios - I found that some of my scenarios could be broken into multiple expectations that were then easier to read and maintain.

For example, the above expectations could be written as the example below.

note: you could simplify even further and remove the given, but that's likely only due to how contrived the test is. Still, the possibility exists that some scenarios will be easily convertible to bare expectations.

Using the technique described here, I've created bare expectations for all of the scenarios in the codebase I'm currently working on - and deleted all references to expectations.scenarios.

Thursday, November 01, 2012

Clojure: Use expect-let To Share A Value Between expected And actual

Most of the time you can easily divorce the values needed in an expected form and an actual form of an expectation. In those cases, nothing needs to be shared and your test can use a simple bare expect. However, there are times when you need the same value in both the expected and actual forms - and a bare expect doesn't easily provide you with a way to accomplish that.

In version 1.4.16 or higher of expectations, you can now use the expect-let macro to let one or more values and reference them in both the expected and actual forms.

Below is a simple example that makes use of expect-let to compare two maps that both have a DateTime.

If possible you should prefer expect, but expect-let gives you another option for the rare cases where you absolutely need to share a value.

Clojure: Freezing Time Added To expectations

If you're using expectations and Joda Time, you now have the ability to freeze time in bare expectations (version 1.4.16 and above). The following code demonstrates how you can use the freeze-time macro to set the time, verify anything you need, and allow time to be reset for you.

Under the covers freeze-time is setting the current millis using the DateTime you specify, running your code and resetting the current millis in a finally. As a result, after your code finishes executing, even if finishing involves throwing an exception, the millis of Joda Time will be set back to working as you'd expect.

The freeze-time macro can be used in both the expected and actual forms, and can be nested if you need to set the time multiple times within a single expectation.
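The set, run, reset-in-a-finally shape is easy to sketch without Joda Time. The example below uses a hypothetical clock atom in place of Joda's current millis, and `with-frozen-millis` is an illustrative name, not the freeze-time macro. One assumption differs from the description above: this sketch restores the previous value rather than resetting to system time, which is what makes nesting behave.

```clojure
;; Hypothetical clock: nil means "use the system clock".
(def fixed-millis (atom nil))

(defn now-millis []
  (or @fixed-millis (System/currentTimeMillis)))

(defmacro with-frozen-millis
  "Fixes the clock at `millis` while `body` runs, then restores it,
  even if `body` throws."
  [millis & body]
  `(let [previous# @fixed-millis]
     (reset! fixed-millis ~millis)
     (try
       ~@body
       (finally (reset! fixed-millis previous#)))))

(with-frozen-millis 42 (now-millis))
;; => 42
```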

Clojure: Interaction Based Testing Added To expectations

The vast majority of testing I do these days is state-based; however, there are times when I need to test an interaction (e.g. writing to a file or printing to standard out). The ability to test interactions has been in expectations.scenarios for quite awhile, but there isn't any reason that you need a scenario to test an interaction - so, as of version 1.4.16, you also have the ability to test interactions with bare expectations.

The following test shows how you can specify an expected interaction. This test passes.

Writing the test should be straightforward - expect the interaction and then call the code that causes the interaction to happen.
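For a feel of what verifying an interaction involves, here is a small clojure.core sketch that rebinds a function, records every call, and hands back the recorded arguments. It is not how expectations is implemented; `captured-calls`, `one`, and `do-work` are hypothetical names.

```clojure
(defn captured-calls
  "Temporarily replaces the function held by `f-var` with a recorder,
  runs `thunk`, and returns a vector of the arg vectors it was called with."
  [f-var thunk]
  (let [calls (atom [])]
    (with-redefs-fn {f-var (fn [& args] (swap! calls conj (vec args)) nil)}
      thunk)
    @calls))

(defn one [& _] :real-result)        ; hypothetical function under test

(defn do-work []                     ; hypothetical production code
  (one "hello" {:a 1})
  (one "again"))

(def calls (captured-calls #'one do-work))
;; calls is [["hello" {:a 1}] ["again"]]
```

Verification is then a matter of checking whether the expected arg vector appears in `calls`.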

As I was adding this behavior I enhanced the error reporting. Below you can find a failing test and the output that is produced.

As you can see, all three calls to the 'one' function are reported. If the number of args used to call 'one' are of the same size as the expected args, each arg is compared in detail; otherwise the two lists are compared in detail (but the elements are not).

As you can see in this failure the first argument, "hello", matches.

 ;          got: (one "hello" {2 3, :a 1})
 ;                   arg1: matches
 ;          expected arg2: {:a :b, :c {:ff :gg, :dd :ee}}
 ;            actual arg2: {2 3, :a 1}
 ;          2 with val 3 is in actual, but not in expected
 ;          :c {:dd with val :ee is in expected, but not in actual
 ;          :c {:ff with val :gg is in expected, but not in actual
 ;          :a expected: :b
 ;                  was: 1

Anytime an argument matches, expectations will simply print "matches". You can also specify :anything as an argument, to ignore that argument and always 'match'. The following test shows an example of matching the second argument, while the first argument is no longer matching.

That's it. Hopefully these interaction tests follow the principle of least surprise, and are easy for everyone to use.

Wednesday, October 31, 2012

Clojure: redef-state Added To expectations

When testing functions that reference some state (atom, ref, or agent), it's nice to be able to quickly replace the value of the state in the context of the test. When your function only interacts with one piece of state, a simple call to with-redefs will do the trick. However, there are times when the function that you're calling updates many different pieces of state, and you'd like to be able to redef all of them with one call. The expectations testing framework (v 1.4.16 and above) provides you the ability to redef all atoms, refs, and agents in a namespace with one call to redef-state.

(this same feature existed in expectations.scenarios as 'localize-state')

Let's take a look at the following contrived namespace

In the above namespace we have two atoms that are both updated when you process an update. Testing that the atoms are updated is fairly simple, which the tests below demonstrate.

Unfortunately, these tests will not both pass, as they both update the same atom. We could clean up at the end of each test, but it's usually cleaner to simply redef the atoms in the context of the test. The tests below use with-redefs to ensure that the state is only manipulated in the context of the tests.

At this point the tests all pass. This solution works fine, but expectations gives you the ability to trim a bit of code and simply specify the namespace instead. The following tests specify the namespace and let expectations take care of the rest.

That's it. Now all atoms, refs, and agents that are defined in the 'blog' namespace will be redefined within the context of the (redef-state) call. It's also important to note that redef-state can take as many namespaces as you'd like to specify in the first arg vector.
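The mechanism can be approximated with core functions. The sketch below finds every atom interned in a namespace and rebinds each one to a fresh copy for the duration of a function call. It handles atoms only, copies the current value (an assumption; the library may behave differently), and `atom-vars`, `with-fresh-atoms`, `total`, `history`, and `process-update` are hypothetical names.

```clojure
(defn atom-vars
  "All vars interned in the namespace named by `ns-sym` that hold an atom."
  [ns-sym]
  (filter #(instance? clojure.lang.Atom (deref %))
          (vals (ns-interns ns-sym))))

(defn with-fresh-atoms
  "Runs `thunk` with every atom in the namespace rebound to a copy."
  [ns-sym thunk]
  (with-redefs-fn (into {} (for [v (atom-vars ns-sym)] [v (atom @@v)]))
    thunk))

;; Hypothetical state and update function, defined in the current namespace.
(def total (atom 0))
(def history (atom []))

(defn process-update [n]
  (swap! total + n)
  (swap! history conj n))

(def this-ns (ns-name *ns*))

(def inside
  (with-fresh-atoms this-ns
    (fn [] (process-update 5) [@total @history])))
;; inside is [5 [5]]; afterwards @total is still 0 and @history is still []
```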

Wednesday, October 03, 2012

clojure: lein tar

A co-worker recently asked how I package and deploy my clojure code. There's nothing special about the code, but I'm making it available here for anyone who wants to cut and paste. Deploy is the easy part - scp a tar to the prod box. Building the tar is very easy as well. I've run this on a few different linux distros without issue, but YMMV. Without further ado.

I'm sure there are easier ways, and I know I could do it programmatically - but this works and is easy to maintain. That's good enough for me.