Tuesday, July 19, 2011

The High-Level Test Whisperer

Most teams have high-level tests under some name: Functional Tests, Integration Tests, End-to-End Tests, Smoke Tests, User Tests, or something similar. These tests are designed to exercise as much of the application as possible.

I'm a fan of high-level tests; however, back in 2009 I decided on what I considered to be a sweet spot for high-level testing: a dozen or fewer. The thing about high-level tests is that they are complicated and highly fragile. It's not uncommon for an unrelated change to break an entire suite of high-level tests. Truthfully, anything related to high-level testing always comes with an implicit "here be dragons".

I've been responsible for my fair share of authoring high-level tests. Despite my best efforts, I've never found a way to write high-level tests that aren't filled with subtle and complicated tweaks. That level of complication spells heartbreak for teammates who are less familiar with the high-level tests. The issues often arise from concurrency, stubbing external resources, configuration properties, internal state exposure and manipulation, third-party components, and everything else that is required to test your application under production-like circumstances.
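
To make that concrete, here's a minimal sketch - all names hypothetical - of the scaffolding a single stubbed external resource can demand: a fake HTTP service, a background thread, and a port that must not collide with anything else the suite starts.

    # Hypothetical fake for an external pricing service, so high-level tests
    # can run without the real network dependency. (WEBrick ships with the
    # Ruby standard library.)
    require 'webrick'

    class FakePricingService
      def initialize(port)
        @server = WEBrick::HTTPServer.new(Port: port,
                                          AccessLog: [],
                                          Logger: WEBrick::Log.new(File::NULL))
        # Canned response; real tests would vary this per scenario.
        @server.mount_proc('/price') do |_request, response|
          response.body = '{"symbol":"ABC","price":10.0}'
        end
      end

      def start
        # Concurrency sneaks in: the fake runs on its own thread, and the
        # suite must wait for it to be up before the application connects.
        @thread = Thread.new { @server.start }
      end

      def stop
        @server.shutdown
        @thread.join
      end
    end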

To make things worse, these tests are your last line of defense. Most issues that these tests would catch are caught first by a lower-level test that is better suited to pinpointing where the issue originates. The last straw: the vast majority of the time, the tests break due to the test infrastructure (not an actual flaw in the application), and it takes a significant amount of time to figure out how to fix the infrastructure.

I've thrown away my fair share of high-level tests. Entire suites. Unmaintainable tests that are constantly broken by false alarms simply don't carry their weight. On the other hand, I've found plenty of success using high-level tests I've written. For a while I thought my success with high-level tests came from a combination of my dedication to making them as easy as possible to work with and my refusal to allow more than a dozen of them.

I joined a new team back in February. My new team has a bunch of high-level tests - about 50 of them. Consider me concerned. Not long after I started adding new features, I needed to dig into the high-level test infrastructure. It's very complicated. Consider me skeptical. Over the next few months I kept reevaluating whether or not they were worth the amount of effort I was putting into them. I polled a few teammates to gauge their happiness level. After 5 months of working with them, I began my attack.

Each time my functional tests broke, I spent no more than 5 minutes on my own looking for an obvious issue. If I couldn't find the problem, I interrupted Mike, the guy who wrote the majority of the tests and the infrastructure. More often than not, Mike was able to tweak the tests quickly and we both moved on. I anticipated that Mike would be able to fix all high-level-test-related issues relatively quickly; however, I expected he would grow tired of the effort and we would seek a smaller and more manageable high-level test suite.

A few more weeks passed with Mike happily fielding all my high-level test issues. This result started to feel familiar: I had played the exact same role on previous projects. I realized that the reason I had been successful with the high-level tests I had written was likely simple: I had written them. The complexity of high-level test infrastructure almost ensures that a single individual will become the expert, and for that expert, changing the infrastructure is as simple as moving a few variables around in their world. On my new team, Mike was that expert. I began calling Mike the High-Level Test Whisperer.

At previous points in my career I might have been revolted by the idea of having an area of the code that required an expert. However, having played that role several times, I'm pretty comfortable with the associated risks. Instead of fighting the current state of affairs, I decided to embrace the situation. Not only do I grab Mike when any issues arise with the high-level tests, I also ask him to write me failing tests when new ones are appropriate for features I'm working on. It takes him a few minutes to whip up a few scenarios and check them in (commented out). Then I have failing tests that I can work with while implementing a few new features. We get the benefits of high-level tests without the pain.
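
For a sense of what that looks like in practice, here's a sketch - names and scenario invented for illustration - of the kind of commented-out failing test that gets checked in:

    # Hypothetical scenario checked in commented out; the leading '#'
    # marks come off once work on the feature begins.
    require 'test/unit'

    class OrderEntryTest < Test::Unit::TestCase
      # def test_rejects_orders_above_position_limit
      #   start_full_stack
      #   submit_order(symbol: 'ABC', quantity: 1_000_000)
      #   assert_equal :rejected, last_order_status
      # end

      # Test::Unit complains about a test case with no tests, so keep a
      # placeholder until the real scenario is uncommented.
      def test_placeholder
        assert true
      end
    end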

Obviously this setup only works if both parties are okay with the division of responsibility. Luckily, Mike and I are both happy with our roles given the existing high-level test suite. In addition, we've added 2 more processes (a 50% increase) since I joined. For both of those processes I've created the high-level tests and the associated infrastructure - and I also handle any maintenance tasks that come with them.

This is a specific case of a more general pattern I've been observing recently: if the cost of keeping 2 people educated on a piece of technology is higher than the benefit, don't do it - bus risk be damned.

Individuals Over People

I've been pondering a few different ideas lately that all center around a common theme: to be maximally effective, you need to identify people's strengths and allow them to focus on those strengths.

I hear you: thanks, Captain Obvious.

If you are reading this, it's likely that you're familiar with the phrase "Individuals and interactions over processes and tools" from the Agile Manifesto. I'm sure we agree in principle, but I'm not sure we're talking about the same thing. In fact, it's more common to hear "people over process" when discussing Agile, which I believe is more appropriate for describing the value that Agile brings.

Agile emphasizes people (as a group) over processes and tools. However, there's little room for "individuals" on the Agile teams I've been a part of. I can provide several anecdotes:
  • When pair-programming with someone who prefers a Dvorak layout, a compromise must be made.
  • When pair-programming with someone who prefers a different IDE, a compromise must be made.
  • Collective code ownership implies anyone can work on anything, which often leads to inefficient story selection. (e.g. the business accidentally gives a card to someone who isn't ideally skilled for the task. Or, a developer decides to work on a card they aren't ideally suited for despite other better-suited and equally important outstanding cards.)
  • Collective code ownership requires a lowest-common-denominator technology selection. (e.g. if 3 out of 5 people know Ruby and 5 out of 5 know Clojure, selecting Ruby for any application, even when it's the appropriate choice, is likely to be met with resistance.)
  • Collective code ownership requires a lowest-common-denominator coding-style selection. Let's be honest: it's easy to code in a language such as Ruby without a deep understanding of metaprogramming and evaluation. Both are powerful; however, you can only take advantage of that power if you are sure everyone on the team is comfortable with both techniques (see the sketch just after this list).
I could ramble on a bit more, but hopefully you get my point.
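
To make that last bullet concrete, here's a small Ruby sketch (a hypothetical Order class) comparing hand-written attribute readers with metaprogrammed ones. The second form is terser and more powerful, but it only pays off if everyone on the team reads it as fluently as a plain def.

    class Order
      def initialize(attributes)
        @attributes = attributes
      end

      # Hand-written readers: obvious to every teammate.
      def symbol
        @attributes[:symbol]
      end

      def quantity
        @attributes[:quantity]
      end

      # Metaprogrammed readers: the same idea in two lines, generated at
      # load time with define_method.
      [:price, :account].each do |name|
        define_method(name) { @attributes[name] }
      end
    end

    order = Order.new(symbol: 'ABC', quantity: 100, price: 10.0, account: 'X1')
    puts order.symbol # => ABC
    puts order.price  # => 10.0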

I'm still a believer in Agile. It's the best way I know to take an average-performing team and put them on a path to becoming a well-performing team. However, I think the Agile practices also put a ceiling on how effective a team can be. Perhaps my favorite anecdote: Ola Bini believes he is 10x faster when using Emacs as compared to IntelliJ - when writing Java! 10x is huge, so what does he use when he's pairing? IntelliJ. If there's knowledge transfer occurring, then perhaps the 10x reduction in delivery speed is a good decision; however, if he's pairing with someone of a similar skill level who isn't statistically likely to maintain the code in the future, it's a terrible choice. Ola is programming at 1/10 of his possible efficiency for no reason other than it's the Agile way.

Clearly, it's a gray area - the knowledge-transfer level will vary drastically based on who he's pairing with and what they are pairing on. That's the important point: people are more important than processes, but individuals are the most important. If you want to achieve maximum productivity, you'll need to constantly reevaluate the most effective path based on the individuals that make up your team.

If you already agree with the ideas above, then you're probably familiar with the idea that you need to learn all the rules to know when to break them. That's an old idea as well. Unfortunately, I don't see much written on this topic in the software development world. The last 3 years of my life have been lesson after lesson in how smart people can break the Agile rules and get much larger gains as a result. I've decided to tag posts that cover this subject as "Individuals over People". There are even a few historical entries available at http://blog.jayfields.com/search/label/individuals%20over%20people.

Hopefully these ideas will spark a few discussions and inspire other post-Agile developers to post their experiences as well.

Tuesday, July 12, 2011

Undervalued Start and Restart Related Questions

How long does it take to start or restart your application?

Start-up time is a concern that's often overlooked by programmers who write unit tests. It will (likely) always be faster to run a few unit tests than to start an application; however, having unit tests shouldn't take the place of actually firing up the application and spot-checking with a bit of clicking around. Both efforts are good; however, I believe combining them is a case where the whole is greater than the sum of the parts.

My current team made start-up time a priority. We can currently launch our entire stack (6 processes) and start using the software within 10 seconds. Ten seconds is fast, but I have been annoyed with it at times. I'll probably try to cut it down to 5 seconds at some point in the near future, depending on the level of effort needed to achieve a sub-5-second start-up.

That effort is really the largest blocker for most teams. The problem is that it's often not clear what's causing start-up to take so long. Performance-tuning start-up isn't exactly sexy work. However, if you start your app often, the investment can quickly pay dividends. My team found the largest wins by caching remote data on our local boxes and deferring the creation of complex models while running on development machines. Those two simple tweaks turned a 1.5-minute start-up into 10 seconds.
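
Here's a rough sketch of those two tweaks. Method and class names are invented for illustration; the slow remote fetch is simulated with a sleep.

    require 'json'

    # Hypothetical stand-in for the slow remote fetch that dominated our
    # start-up time.
    def fetch_reference_data_from_remote
      sleep 5 # simulate the slow network call
      { 'instruments' => %w[ABC XYZ] }
    end

    # Tweak 1: cache remote data on the local box so subsequent start-ups
    # skip the remote fetch entirely.
    def load_reference_data(cache_path = 'reference_data_cache.json')
      return JSON.parse(File.read(cache_path)) if File.exist?(cache_path)
      data = fetch_reference_data_from_remote
      File.write(cache_path, JSON.generate(data))
      data
    end

    # Tweak 2: on development machines, defer building expensive models
    # until they're first needed instead of paying the cost at start-up.
    class ModelRepository
      def initialize(development_mode)
        @models = build_all_models unless development_mode
      end

      def models
        @models ||= build_all_models
      end

      private

      def build_all_models
        { 'ABC' => Object.new } # imagine something expensive here
      end
    end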

If your long start-up isn't bothering you because you don't do it very often, I'll have to re-emphasize that you are probably missing out on some valuable feedback.

Not time-related, but start-related: does your application encounter data-loss if it's restarted?

In the past I've worked on teams where frequent daily rollouts were common. I've encountered two types of these teams. Some teams do several same-day rollouts to get new features into production as fast as possible. Other teams end up doing multiple intraday rollouts to fix newly found bugs in production. Regardless of the driving force, I've found that those teams can stop and start their servers quickly and without any information loss.

My current team has software stable enough that we almost never roll out intraday due to a bug. We also have uptime demands, which means a new feature is almost never valuable enough to justify stopping the software intraday. I can only remember doing 2 intraday restarts across 30 processes since February.

There's nothing wrong with our situation; however, we don't optimize for intraday restarts. Because we haven't prioritized intraday-restart-related tasks, we've never addressed a bit of data-loss that occurs on a restart. We had traditionally believed that the data wasn't very important (nice-to-have, if you will). However, the other day I wanted to roll out a new feature in the morning - before our "day" began. One of our customers stopped me from rolling out the software because he didn't want to lose the (previously believed nice-to-have) overnight data.
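
Closing that kind of gap can be as simple as snapshotting the data on shutdown and restoring it on start-up. A minimal sketch, with hypothetical names for the overnight data:

    require 'json'

    SNAPSHOT_PATH = 'overnight_data_snapshot.json'

    # Hypothetical stand-in for the overnight data the customer cared about.
    def current_overnight_data
      { 'ABC' => { 'overnight_volume' => 1200 } }
    end

    def write_snapshot
      File.write(SNAPSHOT_PATH, JSON.generate(current_overnight_data))
    end

    def read_snapshot
      File.exist?(SNAPSHOT_PATH) ? JSON.parse(File.read(SNAPSHOT_PATH)) : {}
    end

    # Restore whatever the previous run saved...
    overnight_data = read_snapshot
    puts "restored: #{overnight_data.inspect}"

    # ...and save on the way out, so a restart loses nothing.
    at_exit { write_snapshot }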

That was the moment that drove home the fact that, even in our circumstances, we needed to be able to roll out new software as seamlessly as possible. Even if mid-day rollouts are rare, any problems that a mid-day rollout creates will make it less likely that you can do one when that rare moment occurs.

Tests and daily rollouts are nice, but if your team is looking to move from good to great, I would recommend a non-zero amount of actual application usage from the user's point of view, and fixing any issues that are roadblocks to multiple intraday rollouts.