Jay Fields' Thoughts: 2013

Tuesday, June 11, 2013

Coding: Increase Your Reading and Writing Speed

A teammate of mine recently expressed a desire for a shortcut for something we type often. I started looking into our shortcut options and came to a common determination: We can do this, but the number of 2 key shortcuts available to us is finite, so we better use them wisely.

I wrote the following unix to give me a rough idea of what we type frequently.

find . -name "*.clj" | xargs cat | tr -s '[:space:]:#()[]{}\"' '\n' | sort | uniq -c | sort -n

note: If you're not writing clojure you'll want to look for something other than .clj files, and you might also want to tweak what you replace with a new line.

The above unix gave me an ordered list of the most typed 'words' across all of my codebases. At this point I had some science for setting up some shortcuts.

Writing

You'll want to look into whatever editor/ide you use and see if you can find key shortcuts and snippet expansion. My editor is emacs; I assigned some key-chords and some yasnippets. If you're not using emacs you should have something similar in whatever you are using.

While I wanted to define some shortcuts, I also didn't want to create so many that I was constantly wasting time looking up what I'd created. Based on that desire I created:

2 shortcuts (key-chords) for two of the most duplicated words. The shortcuts are concise by design, but that makes them a bit harder to remember. You can probably get started with more than 2, but I didn't see much harm in starting there.
a dozen snippets for the next most used words. These snippets are descriptive enough to easily remember, thus I felt comfortable defining several of them. e.g. pps expands to (println (pr-str )).

Having shortcuts and snippets will obviously make me more productive, and the unix helped me figure out which words were the most important to optimize for.

Reading

Most editors/ides also give you a summary view for common code patterns. For example, IntelliJ displays lambdas when the actual code is actually an anonymous class. Emacs gives you font-lock for turning patterns into individual characters. Armed with my list of the most common words in my codebase, I created font-locks for the 11 most duplicated.

Sometimes a picture is worth a thousand words. Below is a function definition without any font locks applied.

The following image is what the same function looks like with custom font locks applied. It might be a bit jarring at first, but in the long run it should add up to many small victories such as quickly identifying patterns in code and less line breaks.

Results

I've been working with these settings for a little over a week now. I haven't needed to look up anything I defined, and I get a little burst of satisfaction when I read or write something faster than I'd been able to in the past. I'd definitely recommend doing something similar with your codebase and ide/editor.

Thursday, May 16, 2013

Clojure: Combining Calls To Doseq And Let

I've you've ever looked at the docs for clojure's for macro, then you probably know about the :let, :when, and :while modifiers. What you may not know is that those same modifiers are available in doseq.

I was recently working with some code that had the following form.

Upon seeing this code, John Hume asked if I preferred it to a single doseq with multiple bindings. He sent over an example that looked similar to the following example.

That was actually the first time that I'd seen multiple bindings in a doseq, and my immediate reaction was that I preferred the explicit simplicity of having multiple doseqs. However, I always have a preference for concise code, and I forced myself to starting using multiple bindings instead of multiple doseqs - and, unsurprisingly, I now prefer multiple bindings to multiple doseqs.

You might have noticed that the second version of the code slightly changes what's actually being done. In the original version the 'name' function is called once per 'id', and in the second version the 'name' function is called once per 'sub-id'. Calling name significantly more often isn't likely to have much impact on your program; however, if you were calling a more expensive function this change could have a negative impact. Luckily, (as I previously mentioned) doseq also provides support for :let.

The second example can be evolved to the following code - which also demonstrates that the let is only evaluated once per iteration.

That's really the final version of the original code, but you can alter it slightly for experimentation purposes if you'd like. Let's assume we have another function we're calling in an additional let and it's expensive, it would be nice if that only occurred when an iteration was going to happen. It turns out, that's exactly what happens.

Whether you prefer multiple bindings or multiple doseqs, it's probably a good idea to get comfortable reading both.

Wednesday, May 15, 2013

Emacs Lisp: Font Lock for Clojure's Partial

I love using partial, but I dislike the length of the function name. There's a simple solution, define another function with a shorter name that simply calls (or is) partial. This is exactly what I did in the jry library.

I liked the use of % due to partial feeling similar to creating a function using #(), and % having a special meaning inside #(). I thought they tied well together. Unfortunately, there's an obvious problem, things would be very broken if you tried to use the '%' function in an anonymous function defined with #(). Somewhere along the way this issue caused me to stop using jry/%.

Using partial is great: it's part of the standard lib, and I don't need to explain it to anyone who joins my team or any future maintainers of the code I write. Still, I want something shorter, and I've always had a background thread looking for another shorter-than-partial solution. While recently contributing to emacs-live I found the solution I was looking for: clojure-mode font lock.

The following code can now be found in my emacs configuration.

This solution feels like the best of both worlds. My code still uses the function from the standard library, my colleagues still see a function they already know, and 'partial' only takes up one character space in my buffer. The image below is what you'll see if you put the above emacs-lisp in your config.

Tuesday, May 14, 2013

Clojure: Testing The Creation Of A Partial Function

I recently refactored some code that takes longs from two different sources to compute one value. The code originally stored the longs and called a function when all of the data arrived. The refactored version partials the data while it's incomplete and executes the partial'd function when all of the data is available. Below is a contrived example of what I'm taking about.

Let's pretend we need a function that will allow us to check whether or not another drink would make us legally drunk in New York City.

The code below stores the current bac and uses the value when legally-drunk? is called.

The following (passing) tests demonstrate that everything works as expected.

This code works without issue, but can also be refactored to store a partial'd function instead of the bac value. Why you would want to do such a thing is outside of the scope of this post, so we'll just assume this is a good refactoring. The code below no longer stores the bac value, and instead stores the pure-legally-drunk? function partial'd with the bac value.

Two of the three of the tests don't change; however, the test that was verifying the state is now broken.

note: The test output has been trimmed and reformatted to avoid horizontal scrolling.

In the output you can see that the test is failing as you'd expect, due to the change in what we're storing. What's broken is obvious, but there's not an obvious solution. Assuming you still want this state based test, how do you verify that you've partial'd the right function with the right value?

The solution is simple, but a bit tricky. As long as you don't find the redef too magical, the following solution allows you to easily verify the function that's being partial'd as well as the arguments.

Those tests all pass, and should provide security that the legally-drunk? and update-bac functions are sufficiently tested. The pure-legally-drunk? function still needs to be tested, but that should be easy since it's a pure function.

Would you want this kind of test? I think that becomes a matter of context and personal preference. Given the various paths through the code the following tests should provide complete coverage.

The above tests make no assumptions about the implementation - they actually pass whether you :use the 'original namespace or the 'refactored namespace. Conversely, the following tests verify each function in isolation and a few of them are very much tied to the implementation.

Both sets of tests would give me confidence that the code works as expected, so choosing which tests to use would become a matter of maintenance cost. I don't think there's anything special about these examples; I think they offer the traditional trade-offs between higher and lower level tests. A specific trade-off that stands out to me is identifying defect localization versus having to update the test when you update the code.

As I mentioned previously, the high-level-expectations work for both the 'original and the 'refactored namespaces. Being able to change the implementation without having to change the test is obviously an advantage of the high level tests. However, when things go wrong, the lower level tests provide better feedback for targeting the issue.

The following code is exactly the same as the code in refactored.clj, except it has a 1 character typo. (it's not necessary to spot the typo, the test output below will show you want it is)

The high level tests give us the following feedback.

failure in (high_level_expectations.clj:14) : expectations.high-level-expectations
(expect
 true
 (with-redefs
  [state (atom {})]
  (update-bac 0.01)
  (legally-drunk? 0.07)))

           expected: true 
                was: false

There's not much in that failure report to point us in the right direction. The unit-level-expectations provide significantly more information, and the details that should make it immediately obvious where the typo is.

failure in (unit_level_expectations.clj:8) : expectations.unit-level-expectations
(expect
 {:legally-drunk?* [pure-legally-drunk? 0.04]}
 (with-redefs [state (atom {}) partial vector] (update-bac 0.04)))

           expected: {:legally-drunk?* [# 0.04]} 
                was: {:legally-drunk?** [# 0.04]}
 
           :legally-drunk?** with val [# 0.04] 
                             is in actual, but not in expected
           :legally-drunk?* with val [# 0.04] 
                            is in expected, but not in actual

The above output points us directly to the extra asterisk in update-bac that caused the failure.

Still, I couldn't honestly tell you which of the above tests that I prefer. This specific example provides a situation where I think you could convincingly argue for either set of tests. However, as the code evolved I would likely choose one path or the other based on:

how much 'setup' is required for always using high-level tests?
how hard is it to guarantee integration using primarily unit-level tests?

In our examples the high level tests require redef'ing one bit of state. If that grew to a few pieces of state and/or a large increase in the complexity of the state, then I may be forced to move towards more unit-level tests. A rule of thumb I use: If a significant amount of the code within a test is setting up the test context, there's probably a smaller function and a set of associated tests waiting to be extracted.

By definition, the unit-level tests don't test the integration of the various functions. When I'm using unit-level tests, I'll often test the various code paths at the unit level and then have a happy-path high-level test that verifies integration of the various functions. My desire to have more high-level tests increases as the integration complexity increases, and at some point it makes sense to simply convert all of the tests to high-level tests.

If you constantly re-evaluate which tests will be more appropriate and switch when necessary, you'll definitely come out ahead in the long run.

Tuesday, May 07, 2013

Recovering Lost Post Data

I recently typed out a long, thoughtful response in a textarea. I clicked submit, like I've done millions of times, and I got the dreaded "session expired" error message. This happens very, very rarely, but it's devastating when it does. Creating long & thoughtful responses isn't something that comes naturally for me. I crossed my fingers and clicked back. No luck, web 2.0 dynamically created text boxes ensured Chrome had no chance to preserve my editing state.

My first reaction was: I guess I'm not responding after all. Then it occurred to me, DevTools must have my data somewhere, right? Lucky for me, the answer was yes.

There might be easier ways, this is what worked for me:

open DevTools
go to the "Network" tab.
look for the row with the method POST.
- If you don't see a POST row, try refreshing the page. With any luck you'll get a repost confirmation dialog, giving you some hope that your data is still around. (You'll want to allow the data repost)
click on the POST row, and scroll down till you see "Form Data". If you've gotten this far, hopefully you'll find your data in clear text and able to be copied.

The examples from this post are from following the instructions above and logging in to twitter.com. If you've ever lost post data in the past, you may want to give these directions a dry-run now.

Thursday, May 02, 2013

Emacs Lisp: Toggle Between a Clojure String and Keyword

When I was doing a fair bit of Ruby I often used the TextMate's shortcut (Ctrl+:) to convert a Ruby String to a Symbol or a Ruby Symbol to a String. It's something I've periodically missed while doing Clojure, and yesterday I found myself in the middle of a refactoring that was going to force the conversion of 5+ Clojure Keywords to Strings.

The following emacs lisp is my solution for toggling between Clojure Strings and Keywords. The standard disclaimers apply - it works on my machine, and I've never claimed to know emacs lisp well.

A quick video of the behavior:

Tuesday, April 30, 2013

Year Five

The average lifespan for a software engineering job is 4 years. Okay, I've never actually seen proof (or contradiction), but that's the general feeling in the groups I associate with. Perhaps that's selection bias - my employer has generally changed on year 3 or 4. Perhaps this is the exception and not the rule, in that case feel free to simply read this as an experience report. However, I do think it's somewhat common for developers to leave around year 3 or 4. This entry contains speculation on why they leave, and offers one idea on what employers can do to break that cycle.

My 4 year employment cycle generally looks like this

Year One: "I'm in over my head. My semi-bluff was in-fact a bluff. They're going to fire me any day."
Year Two: "It's nice to feel like a productive team member"
Year Three: "This is fun, and I'm not bad at it. It's satisfying to pass on knowledge to teammates."
Year Four: "This feels repetitive, that grass over there sure looks greener"

I expect that I, like many programmers, probably undervalue my contribution in the early days and overvalue my contribution in the latter days.

In Year Three and Four at DRW I spent some time thinking about how I felt, and observing the behavior of some colleagues that were also on year three and four. A few things stood out to me.

A company you don't work at always seems to have infinite possibilities; however, after a few years with an employer, it's extremely clear what your options are. More importantly, it's very clear what limitations will likely always be there.
A company you don't work at contains no code you're responsible for. Conversely, any company you've been with for 4 years probably has plenty of code you're not proud of. If you're responsible for that code, it's a constant reminder of your previous limitations. If you're not responsible for it, your co-workers aren't likely to let you forget about it anytime soon.
There's always someone willing to pay you more than you're worth. After several years with a company it's likely that they're going to pay you what you're worth, but not what some other company thinks you're worth. I'm surprised that more companies don't pay (the employees they want to keep) what their "flawless market value" would be. In other words, what would you pay them if they interviewed, you determined what they knew, you determined what value they would bring, and you were completely ignorant of their flaws? That's what your competition is likely doing. That's what you're fighting against if you want to keep them around.
A new job often offers a new challenge. Once you feel like you've given that challenge your best shot, what remains? If you did a great job, it's likely that you'll have plenty of other options. However, if you've done a good job, you may be stuck in a spot where there aren't as many open doors and challenges to choose from - not nearly as many as a position at another company will appear to offer.

I was recently in Punta Cana for wedding, and I was on the beach - working on my laptop. My wife asked: don't you want some time off? My response was short and immediate: no. Later that evening my wife and I discussed my work situation. I observed that I'm in Year Five at DRW and I'm happy, happier than ever, strange - given my previous experiences. She asked if I thought that I was working too much, and if I thought that I would burn out. I remarked: I'd rather have a job that I love, that I don't like to be away from, than a job where I feel like I need a week or two off.

I hear you, nice work if you can get it. I don't have a general recipe for getting there, but I know how I got there.

Back in 2009 I interviewed at DRW. At the time I was working for ThoughtWorks, and my client was Forward. I considered the founder of Forward to be a friend and someone I would gladly work for. I decided it was time to leave ThoughtWorks (after 3.5 years), and I was sure that Forward would be my future home. I remarked to my DRW recruiter "H" (who also happened to be a friend from my ThoughtWorks days) that one of the best things about Forward was knowing that I liked and trusted the man who ran Forward. H said nothing, but made a brilliant move.

In my interview I was grilled, killed even, and then things turned. I met with a guy who asked me a few questions and then told me about the company: the vision, the people, and where I could fit in. He was smart, easy to talk to, and someone I related to. We discussed things casually, it didn't feel like a company pitch in any way at all, it felt like small-talk - something I was very grateful for after the beating I'd taken previously in of the day. After everything concluded I hit the bar with my friends, including H. At that point they revealed to me that the guy I'd met was the partner at the firm that was (among other things) responsible for the firm's technology. I'd also met the CTO, and various other people responsible for technology in the firm. H had shown me that DRW, just like Forward, had what I like to call Awesome All the Way Up.**

Awesome All the Way Up has served me very well at DRW. To this day I remain in fairly common contact with the CTO and several of DRW's partners. About 6 months ago I asked 3 favors. First of all, I asked for enough money to pay someone's salary for 6 months. I identified a project that I wanted to undertake, and I needed help to complete it. Then things got unconventional, I asked if I could create a contract-to-hire situation. Even more unconventional, I pursued a friend and previous colleague who lived in Austin, Texas. DRW rarely uses contractors, and has no other remote employees that I'm aware of. An appropriate amount of questions were asked, but in the end my request was granted.

The experiment is on-going, but I'm very happy with our progress so far. That's all well-and-good, but the support of DRW is the important aspect of the story. I'm confident that their support of my unconventional requests was a major factor in ensuring my happiness in Year Five. We recently hired John Hume, thus declaring success at some level already. However, if things had gone poorly, both parties could have gone their separate ways with little lost and lessons learned. More importantly to me, DRW would have continued to give me confidence that they were willing to take chances to provide me with opportunities and ensure my continued happiness at the firm.

There's a similar discussion around DRW allowing me to use Clojure as my primary development language. I'll spare you the long version. tl; dr: They gave me a reasonable amount of space to try something new, and supported me appropriately as we found more and more success.

Not all of my experiments are green-lighted, and I've also had unsuccessful outcomes. DRW has done a good job of not setting me up to fail; my ideas that have a low probability of succeeding are fleshed out and appropriately shot down. All experiments have risk measures put in place, limited downside, and are reassessed constantly. It's great to have support when things are going well, and it's essential to have support when things don't go as planned.

For me, that's been the secret for keeping me around more than 4 years: An appropriate amount of trust and a willingness to experiment.

A foreign thought also recently came to mind. For the first time in my life I can say that I see myself happy and successful at my current employer in 10 years. This is a question I've asked many people since it occurred to me. To date, +AdeOshineye (http://www.oshineye.com/) is the only person who's responded affirmatively. The results aren't surprising to me, but I do wonder why more employees and employers aren't looking for ways to extend relationships.

Perhaps the secret for keeping me around isn't more broadly applicable; however, simply asking what will keep an individual around is probably the more important message in this entry. It's good to know what will make someone happy now, but it seems like it's equally important to know what will make them happy in the long term. I suspect the answers will be at least a little, if not very different.

The way things currently stand, I'm looking forward to writing about Year Six.

** DRW became my home in the end; however, Forward continues to do well. I suspect Awesome All the Way Up would have ensured happy and gainful employment at either destination. I remain in regular contact with my friends at Forward.

Monday, April 08, 2013

Clojure: Expectations, Customize your Test Running Context

I've previously written about expectations' before run hook and the built in support for detecting state changes. These features are nice for reassigning vars you don't care about or detecting accidental state changes; however, there was no option for temporarily changing the context in which your tests are run. As of expectations 1.4.36, that's no longer true.

Version 1.4.36 allows you to alter the context in which your tests run by creating a function that takes the "run the tests" function as an arg, and do you as wish.

Too abstract, no problem, the code:

That's it, if you add the previous function to your tests, you can define whatever you like in the in-context function, and your tests will be run in your custom context. The previous snippet is pulled from the expectations tests, which verify that success.success-examples-src/a-fn-to-be-rebound is correctly rebound. In the expectations tests an expectations_options.clj namespace is used to define in-context (expectations_options.clj explained here).

Why would you want to define your own context? In the past I've used in-context for two reasons: rebinding things I don't care about (e.g. calls to logging) to (constantly true), and rebinding things I never want called (e.g. create a new thread) to #(throw (RuntimeException. "not allowed in tests"))

I've been using in-context for about a month now, and it's worked well for me. Give it a shot and let me know how it's working for you.

Wednesday, April 03, 2013

Being a Lead Consultant

These are not my ideas, they were given to me. I wanted to store them in a safe place, which is why they can be found here. Thank you for your wisdom Scott and Sean.

Voice of authority to the client on technical matters, meaning that I'm looking for someone who can establish their own credibility with senior client staff, including non-technical managers. This means knowing what the client values and being able to communicate the important issues, not hung up on trivial issues.
Leading developers by example. Other developers, particularly junior developers, will pick up and mimic behaviors of their lead developer in ways that may not even be obvious to either one of them. I look for the senior devs to understand this and model the behavior that want to see from their team.
Understanding that it's not just about coding cards. This means making sure the relationship between the development team and the BAs and testers is fruitful and productive. Obviously, the PM and IM also have a role to play here but the senior developers are even more influential with other developers.
Herding the technical team. This means being the guy to make sure the unfun cards get done, everyone responds to broken builds, and that there is a sense of signing up to and committing to work. A lot of this can come from the leadership by example but there is also a vocal component that I'm talking about here. Not being a cheerleader but willing and able to facilitate decision making in groups and address issues one-on-one as appropriate.
Confidence-by-proxy. That is, your teams just feels safer/better/stronger when you're there, vs when you're not.
Escalation conduit. Your team naturally pulls you into their trouble, or invites your contributions to their challenges/achievements
Professional reflection. Your team is eager to hear your thoughts on their individual capabilities, reach.
An interesting indicator is often the 'are they talking about what I think when I'm not there' test. That is, do they find themselves bringing your ideas/opinions/objectives into conversations, even when you're not present. Kind of a fun 'tell' of a powerful influencer/leader.

Tuesday, April 02, 2013

Emacs Lisp: Find Java Sources

Confession: I really hope someone can tell me I'm doing this wrong. I can't believe there isn't an easier way.

I work with Clojure, in Emacs, almost every day. Navigating the source is usually fairly easy. If I want to navigate to a function definition, all I need to press is M-., and if I want to navigate back, M-, does the trick. This works for Clojure that I've written, as well as Clojure that lives in the libraries that I reference. That's fine the vast majority of the time, but occasionally I need to navigate to the Java source of some library I'm using. This is where I can't believe that no one else has solved this problem.* If I'm in a clj file, my cursor is on a Java class, I don't know of any way to easily navigate to the class definition.

Context: I use fig for dependency management (at work). I have my projects set to put sources in ./the-project/lib/sources; therefore, the following solution assumes the sources are in that directory. If you use Lein (for deps), I'm sure the sources are on your local drive somewhere. All you need to do is change the lisp below to point to your source dir.

Additionally, I use Lein, so I have a project.clj at the root of my project; however, there's nothing special or required about "project.clj". You could just as easily put a this.is.a.project.root file in the root of your project, and search for that file.

The following code will search the your source jars for a Java class, and open the first match that it finds (or no match, if no match is found)

disclaimer, I'm still an emacs lisp beginner.

The previous code is pretty straightforward. Line 3 uses expand region to mark whatever Java class my cursor is currently on or near. You could type the word instead if you like, this page should help you understand how.
Line 4 and 6 find and verify where the project root lives.
Line 8 (and 9) greps for a string (the Java class) in a directory (my source dir).
Line 10 switches to the grep results.
Line 11 sleeps, waiting for the grep results to return.
Line 12 searches forward, looking for the first match.
Line 13 opens the jar of the first match.
Line 14 assigns the current point to a var named 'current-point'.
Line 15 searches forward to the end of the jar name.
Line 16 grabs the name of the jar and switches to that buffer.
Line 17 searches the jar's dired buffer for the name of the class.
Line 18 opens the first class name match.
(Line 19 lets you know if your project root could not be determined)

That's it, you can now easily find java source (with the occasional conflicting name annoyance). It's not pretty, but it gets the job done.

* I've been told that a few Java modes are good, if I can easily use those to navigate from my Clojure to Java, please leave a link to a manual I can dig into. I assume there are etags solutions, but it's not clear what the best way to go is. I'm sure there's an easy solution for navigating Java from Clojure, I'm just having a hard time finding it.

Wednesday, March 20, 2013

Clojure: Expectations Ignore A Variable Number Of Args In An Interaction Test

Over the past year I've written the same test a few times.

This test accomplishes what I'm looking for when I write it - verification that my-fn isn't called. However, it doesn't prevent me from future regressions where my-fn is called with 0, 2, or 2+ arguments. After being bitten by this issue a few times I decided to add an argument matching function that will accept any value for the argument at it's index, and any value for all of the arguments at a greater index. This argument matching function is anything&

The following example is similar to what's above, except any call to my-fn will result in a failure

My original intent was to protect against the scenario I described above*; however, what I ended up with is actually flexible enough to allow me to test other situations as well. For example, I've often found myself testing that a log message was written at a certain level, but not being interested in testing the actual message. The logging I'm working with allows a variable number of arguments, so anything& is perfect for verifying that no matter what args are passed in, the tests passes as long as the level is set correctly.

The anything& matching function gives me an extra bit of flexibility that can come in handy at times.

*which is similar to verifyZeroInteractions in mockito

Tuesday, March 19, 2013

Clojure: Expectations Interaction Tests For Java Objects

I recently ran into some code that forced me to integrate with a Java library. While using the library I found myself wanting to do a bit of interaction testing, which I've historically done with Mockito. As a result, I added the ability to do interaction based tests on mock Java objects, directly in expectations.

Hopefully the code is what you'd expect.

The previous example creates a mock Runnable in an expect-let, expects the run method to be run, and then calls the run method of the mock. This test is worthless in a real world context, but it's the simplest way to demonstrate the syntax for creating a mock & specifying the interaction.

The mock function defined in erajure, a minimal wrapper around mockito. All of the "times" arguments are the same as what's available for function interaction tests, examples can be found here.

Tuesday, February 26, 2013

Synchronizing Snapshots and Incrementals With Single Threading

Code available on: https://github.com/jaycfields/snapshot-incremental-synchronize

Many of the applications that I write these days have a lot of data - so much that there's no reasonable way to continually send all of it. Instead, most of the applications I work with will have the ability to receive a snapshot of the current state, and the ability to receive deltas (incrementals) that must be applied to the previous snapshot. To further complicate things, incomplete data is unacceptable and ordering matters. This type of environment breeds many solutions for synchronizing snapshots and incrementals. This entry is about using single threading (via jetlang) for synchronization and guaranteed accuracy.

Let's take a very simple example, you have two processes a client and server. The server has a list and the client needs to display that list - completely and in order. The list on the client also needs to be updated whenever the list on the server is updated.

There are several issues that you could encounter in a multithreaded environment.

If you request a snapshot and then start listening to incrementals, you may miss data that isn't in the snapshot, but was broadcast before you started listening to incrementals
If you start listening to incrementals and request a snapshot at the same time, you may apply an incremental to the snapshot, even though the snapshot already reflects the incremental.
If you start listening to the incrementals first, you'll need some way to throw away the incrementals that are already reflected in the snapshot.

It's time to get into some code.

Here's some simple server code.

The above code contains a server-list, which is a list that represents the ordered random numbers being generated on the server side. Our task is to mirror this list in our client. The appending scheduled task and appending fiber are stored to allow for easy starting and stopping of appending. The server-start and server-stop functions are provided for convenience, should you choose to run this example locally.

The subscriber atom and the subscribe function are a simple way for a client to subscribe to snapshots and incrementals. The publish-to-client function derefs a fn and immediately calls it with a snapshot or incremental. In a prod application, publish and subscribe logic would probably involve a socket or messaging system - our solution is purposefully naive, to focus on the point of the post: synchronization.

The get-snapshot function publishes the current state of the the server-list to a client. The append-to-list function is removing elements so it's easy to see the server-list changing - without the data growing to an unmanageable size, in prod this would (likely) not exist; however, the rest of the code in append-to-list is fairly representative of a common practice - generate a delta, apply it to the local list and publish it out to clients.

Looking at this code, it's easy to see that one fiber is appending to the list and publishing to the client, while another fiber would return the value of get-snapshot. This code can work, but the way it's currently written data accuracy cannot be guaranteed.

Let's look at some client code.

The client-start function subscribes to server updates, and then requests a snapshot. The handle update function resets a client-list on snapshot and conjs an incremental to the existing list. (note: the client list is kept at 10 elements for simplicity, just like the server - I would not expect this type of code to be in prod).

Below is a full snapshot of the current code.

The client and server code is the same as above, but this example also contains some function calls in a comment. At this point you can paste this code into your favorite editor, start the client and the server and inspect both lists. The update frequency is so large that you can even compare the two lists, and it's highly likely that they are equal.

For a lot of problems this code may be sufficient; however, as we noted above, there is definitely an opportunity for you to see invalid state. With this specific code the append fiber could update the atom with an incremental X, on the main fiber get-snapshot could deref a snapshot with X included (and publish it) and then the append fiber could also publish the incremental X. Luckily there's a simple solution, publish the snapshot, update the server-list, and publish the incrementals all on the same fiber.

The code below shows how easy it is to create a jetlang fiber and execute an anonymous function.

As you can see, very little changed with the code. We've defined another fiber, synchro-fiber, which we will use to single thread our updates to server-list and our publishes to the client. The synchro-fiber will execute the runnables (in our example, anonymous functions) that are put on it's queue, in order. The body of get-snapshot and append-to-list were slightly modified to call the execute function with their previous body as an anonymous function. Other technical differences are also true - the code isn't immediately run, it's no longer blocking, and the return value has been altered. While all of these observations are true, they are irrelevant with respect to what we were trying to accomplish.

Using jetlang fibers we've accomplished our goal - we can guarantee that snapshots and incrementals will be easy to synchronize (without sequence ids), accurate, and in order. Of course, you'll need to consume both of these messages on a single fiber as well, but that should be equally easy to accomplish.

Tuesday, January 15, 2013

Clojure: Expectations Verify Interaction Args

The expectations framework provides the ability to create interaction (or behavior) based tests. I've previously written about adding interaction based testing to expectations; however, the examples from that blog entry focused exclusively on testing interactions where each argument is matched using equality. In this entry I'll give examples of how each argument can be also be verified using a class, regex, exception, or a custom function.

When writing state based tests using expectations the type of test you're writing is inferred from the expected value. If the expected value is a regex, expectations will test the actual value to see if it matches the regex. If you passed in a class, expectations will test the actual value to see if it's an instance of that class. If you passed in an exception... you get the idea. All of what I said above, is also true for arguments of an interaction.

Let's start with a simple interaction based test:

In the example above, we're calling the spit function with exactly the arguments that we've specified in our test. This test will pass; however, we've had to specify the exact file location and the exact data. If for some reason you can't specify exactly what the argument will be, it's nice to have a way to specify as much as you possibly can.

In the example below, we're still specifying the exact data, but we're only verifying that the file is somewhere in /tmp/.

As I previously mentioned, we can also get more general and only verify the class of an argument. For example, if we knew our data was going to be a String, but we didn't want to specify exactly what that string was, the following test would do the trick.

While expectations provides you with a lot of default options, there are times when you'll want to write your own argument "matcher". As a contrived example, let's pretend that we want to test that the last argument is true or nil.

One of the best features of expectations is it's error reporting, and the same error reporting logic is applied to arguments when an interaction based test fails. Given the example above, you'll get the following error message.

failure in (success_examples.clj:204) : success.success-examples
           expected: (spit #"/tmp/" String :append true-or-nil?) 
                got: 0 times 

           -- got: (spit "/tmp/somewhere-else" "nil")
           "nil", "/tmp/somewhere-else" are in actual, but not in expected
           true_or_nil_QMARK, #"/tmp/", :append, String are in expected, but not in actual
           expected is larger than actual 

           -- got: (spit "/tmp/hello-world" "some data" :append "s")
           - arg4: not true or nil

As you can see both calls are reported, and each argument has a detailed report (if it did not match).

Finally, expectations provides and additional function that can be used to verify that certain key/value pairs are in an argument. The following example doesn't really make sense, since you'd never want to pass a map as the last argument to spit, but it's easy to follow in the context of this blog entry.

In the above example, (contains-kvs) is used to verify that the final argument to spit contains the key/value pairs :a :b :c :d.

I hope that interaction arg matching follows the principle of least surprise, since it behaves the same as expectations state based tests. I also hope that the ability to use an arbitrary function for verification will provide any necessary flexibility. If you're using expectations, give it a try and let me know.

Tuesday, January 08, 2013

Clojure: Expectations Interactions - Interactions Are Code, Interactions Are Data

If you read my blog you've probably heard "code is data, data is code" and at one time and you've looked up homoiconicity. You may have deeply understood the idea the first time you heard it; I definitely did not. However, a recent addition to expectations opened my eyes to how truly powerful this programming language property can be.

I'll start by admitting what I heard when I originally encountered homoiconicity. Stuart Halloway had begun promoting Clojure, and homoiconicity was one of the advantages he noted. I hit the wikipedia page, digested the words "code is data, data is code", and thought to myself: well, yeah, obviously. I'd spent plenty of time working with DSLs in Ruby, and I had plenty of experience evaluating code in various contexts. I thought something along the lines of: So you capture the code as data and evaluate it wherever it makes sense, I don't see the big deal. In short, I didn't get it.

Fast forward a few years and several hours of full time Clojure development and you'll find me adding interaction based testing to expectations. What I had in mind for testing interactions was simple, I want to write exactly the same thing for the test as what I write for the production code. Additionally, I want the format of the test to follow the same format that is used for state based testing: (expect expected actual)

Once I had a clear vision for my requirements, the format of the tests became easy to visualize. Assume I have a function that prints to standard out, and I want to test that this print occurs.

The above test looks great, but (println 5) will be evaluated, return nil, and use nil as the expected value. I needed some way for the programmer to tell the testing framework that this was an interaction test, and expectations needed to verify that the function was called with the specified parameters. After trying a few different formats, I settled on the following solution.

By wrapping the interaction I wanted to test with (interaction ...), I created an easy way to identify and capture the function and arguments that needed to be verified.

Once I'd decided on the syntax, I went about the task of adding support to expectations. If you dug into the implementation of expectations, you'd find that expect is a macro that delegates the handling of the "expected" and "actual" arguments to the doexpect macro. The first thing the doexpect macro does is check if expected is a list and (if so) if the first argument is the symbol "interaction" (source here). If the first argument is not a list that begins with 'interaction, then the data is passed to do-value-expect and expanded more or less as is. However, if the first argument is a list that begins with 'interaction, then the data is passed to do-interaction-expect, and do-interaction-expect then destructures the data, grabbing only the pieces of the list that it cares about (source here). When I wrote this code, I found it very interesting.

When I envisioned the interaction syntax, I assumed that (interaction ...) would be a call to a macro, and I would need to need to manipulate the data passed to interaction. However, once I got into the actual implementation, I found myself using the symbol "interaction", but never actually defining a macro or even a function. That's when homoiconicity really started to become clear to me. I'd written code that I was sure would need an implementation, yet it was used exclusively as data.

If you kept digging into this example you would find that anything found within (interaction ...) is never used as written, but is instead expanded in a way that allows expectations to rebind the specified function and use the expected arguments at verification time. As a result, you write the same code in the same way but within your test it's used exclusively as data and in your production code it's used exclusively as code. I'm a big fan of convention, and there's no better convention than 'use the exact same thing'.

I later added the ability to add interaction tests for calls to Java objects as well, which led to the following behavior for expectations.

If your expected value is not an interaction, it will be expanded as is.
If your expected value is an interaction with a Clojure function, it will be used as data exclusively and expanded to rebind the function, capture all calls to the function and verify that a call occurred with the arguments you specified.
If your expected value is an interaction with a Java method, it wil be used as data exclusively and expanded to mockito setup and verification code.

Thus, an expected value is sometimes code, and sometimes data.

Thursday, January 03, 2013

Clojure: Expectations Warn On State Change During Test Runs

While writing tests it can be easy to accidentally change any global state that exists in your application. I've previously written about Redefining State Within a Test; however, redef-state and with-redefs only help you if you redefine all of the affected state. The situation is even more problematic due to the fact that accidental state alteration often doesn't cause issues until a completely unrelated test suddenly fails. After being bitten by this issue a few times, I added (to expectations) the ability to warn when global state is modified by a test.

As of version 1.4.24 if you add (expectations/warn-on-iref-updates) anywhere, then expectations will provide you with a warning whenever any global state is altered.

While you can add that snippet anywhere, I prefer to add it to the expectations Before Run Hook. There's an example expectations_options.clj in the expectations codebase that shows all of the code you need to enable this feature. Simply add this file or add the function to your existing file and you should see a warning on any global state alteration. If you're not sure where to put this file, refer to the Before Run Hook blog post.

Here's an example warning (generated by running the expectations tests).

WARNING: success.success-examples:280 modified #'success.success-examples-src/an-atom from "atom" to "another atom"

The warning should let you know which test is doing an unexpected modification, and the to and from values should give you an idea of where in the source the alteration is occurring.

Wednesday, January 02, 2013

Clojure: Expectations Before Run Hook

The expectations library now supports calling an arbitrary number of custom functions before the test suite is run.

There are several reasons that you might want to call a custom function before the test suite is executed-

load sample data to a database
delete a temp directory
replace logging with standard out or /dev/null

As of version 1.4.24, expectations will execute any function that includes the following metadata {:expectations-options :before-run}. The following example should serve as a decent reference on how to add your own 'before-run' functions.

These functions can be defined anywhere; however, expectations looks in a default location for a configuration file. If expectations finds this configuration file it removes (what should be) the namespace and requires it with :reload. As a result, this configuration namespace will be removed and redefined with each expectations suite run.

The name of this default configuration file is expectations_options.clj (thus the namespace is expectations-options). Expectations looks for this file in the root test directory. If you have :test-paths ["test/clojure"] in your project.clj, then you'll want to create test/clojure/expectations_options.clj.

If you place your expectations_options.clj file in the correct location and you add the {:expectations-options :before-run} metadata to a function in the expectations-options namespace, your function should be run automatically the next time your test suite runs. You may want to start with a (println) just to verify that things are going as you expect.