Tuesday, September 30, 2008

Testing Dynamic Web Applications

At RailsConf Europe, in the Q&A portion of my talk on Functional Testing, someone asked what I recommend for testing Javascript.

Ugh. Testing Javascript. Is it possible to recommend something when everything you've seen is terrible? Usually I'm cool with picking the tool that sucks the least, but when it comes to Javascript testing the only words that come to mind are: epic fail.

In the past I've failed in two different ways.
  • Selenium: Too slow and too brittle for any decent-sized test suite. There's more on that if you wish.
  • Some javascript unit testing framework. I can't even remember the name. The syntax was ugly, the tests weren't easy to write, and the runner didn't integrate with any automated tools we used -- lame.
My response in Berlin was "The pivotal guys are smart, so if I were going to try something new it would be Screw Unit." Of course, I just googled and found the RubyForge project and two with the same name on GitHub. I'm sure one of those is the current trunk.

A few days later it occurred to me what the correct answer was -- Don't use Javascript and you won't have to test it.

No, I'm not suggesting that we should all go back to mostly static websites. Static content is fine for some things, but GMail is an obvious example of a site that is better done dynamically.

However, Javascript isn't your only choice for highly dynamic websites.

These days, if I were writing a website that required any dynamic interaction I would absolutely use Flex or Silverlight. I've done Flex, and it was nice to work with, but I must admit that I'm lured to Silverlight because it's going to (or does already?) support Ruby.

I'm not sure what the Silverlight testing story is, but I found Flex (and ActionScript) to be quite testable. The single biggest win (as I've said before) is that I no longer need to do in-browser testing. Removing the browser from the equation is huge. No more IE bugs causing your tests to fail, no long start up times as the browser is run, etc.

Testing with FlexUnit (with its drawbacks) is an order of magnitude better than any experience I've had testing Javascript.

I'm comfortable saying that I would still use Javascript for trivial features that provided so little business value that they did not warrant testing. However, any features that provide noticeable business value must be tested, and I would move to a RIA solution instead.

In my experience, the benefits of switching to a RIA solution are dramatic, one of the largest being: you no longer need to worry about testing Javascript.

Monday, September 22, 2008

When To Retire Your Brand

Building a brand takes a lot of effort, but I think the payoff justifies the investment. Having a strong brand definitely helped me find a fun and very well paying job. So now that I have a dream job (@ DRW Trading), what should I do with my brand?

I have to confess, I didn't start writing because I wanted to share information. I started writing because I wanted to build a big brand, find a great job, and enjoy life. Somewhere along the way I began to enjoy writing and the positive results that came from knowledge sharing. Someone once said to me "I write better tests because of your blog". Obviously I was happy to hear that kind of feedback. At the same time it wasn't the reason I got started down this path.

I love to program. I also love to be in shape (which I'm not) and learn other languages (which I haven't been doing lately).

Now that I have my dream job, I've been spending more time doing the other things I love. Unfortunately, I found that between learning an interesting domain, going to the gym, and learning Italian, there's not much time left for blogging. I also find that since I'm doing all the things I love, I don't really like to be away presenting at conferences.

I thought it might be time to declare success on the brand building project, and move on to new pursuits... But, I was wrong.

The largest reason I can't quit writing and presenting is that I enjoy giving back to the community. Seeing a blog entry get 8,000 hits in one day causes an amazing feeling. Giving a presentation and getting feedback that says "Probably the best presentation at the conference" definitely makes you feel good about what you are doing. Seeing an idea become committed as the way to do something will definitely make you smile. I truly enjoy spreading ideas (or at least attempting to spread ideas) that help the community evolve.

Blogging and presenting also help me personally improve. The easiest way to get feedback on something is to put it out there. I considered several of my testing ideas to be "the right way" for far too long. Putting them down as blog entries resulted in further evolution of the ideas as well as a greater understanding of how context determines the correct approach. Simply writing about my ideas improves them. One thing we aren't short of is people to tell you you're wrong.

Your brand is also valuable to your employer. Employing people with name recognition improves your organization's ability to recruit talented new hires. This also directly benefits you, since you'll be given the opportunity to work with more talented teammates. At the moment, DRW is looking to hire the absolute best people in the industry. I wish I had an even stronger brand, so I could help attract the top talent.

Ultimately I came to the conclusion that building a brand is a career-long activity. You can stop at any time, but getting that free time back comes at a cost to your profession.

Monday, September 15, 2008

Is Distributed Development Viable

I've never seen distributed development succeed. However, before we get into what I've seen, I need to be specific about what I'm describing.
Distributed Development: A group of individuals who work across time, space, and organizational boundaries with links strengthened by webs of communication technology
The link above for Distributed Development redirects to Virtual Team. Wikipedia also has an entry for Distributed Development; however, I found the description of Virtual Team to be much closer to what our industry generally labels Distributed Development.

Just to be clear I'm not talking about outsourcing (subcontracting a process, such as product design or manufacturing, to a third-party company) or what you might call off-shoring.

This entry will be written with that in mind.

Take 1
Several years ago I worked with one of the nicest and most talented developers I expect I'll ever encounter: Badri. We worked together for about a month before he was forced back to India to deal with US visa issues. We knew this was going to happen and decided that it was so important to keep Badri on the project that we were going to give distributed development a shot. Badri knew the code, the team, and he was great on so many levels that the choice couldn't have been more obvious.

After just a few weeks, we gave up on distributed development. Ever since that experience, I've been highly skeptical of the feasibility of distributed development. I had actual experience working with one of the greatest developers I'd ever encountered, and we still failed.

Failure can be good. There are always lessons to be learned when you fail. Here were a few that I picked up from that project.
  • Time difference is probably the single largest contributing factor to failure. Badri was in India, and we were in Delaware, USA. There was almost no time overlap to our workdays.
  • Having the local or on-site developers feed requirements back to the off-site team equates to an order of magnitude efficiency loss.
  • Having the on-site developers decide what pieces are going off-site quickly translates to the off-site team being pissed off about getting the boring work.
  • Businesses expect the same level of quality and productivity from the off-site members as from the on-site members.
Given those circumstances, we definitely failed. Epic failed, in fact. And, to make things worse, I was the one that had to tell Badri that we were going to give up on distributed development. Imagine having to tell someone you have nothing but admiration for that "it's just not working out". I still feel guilty about it to this day.

The time difference was a huge killer. Every time we spoke to Badri either he was half asleep or we were. I think Badri was working 12-hour days also, just so he could try to talk to us at the beginning and end of his day. It also meant no wine with dinner or having a bit of a buzz for the 11pm stand-up. Either of those options is unacceptable. Worse, if we didn't communicate what we needed in detail, the off-site team would be off in the wrong direction for up to 12 hours straight. We threw away several days' worth of code because we hadn't given them enough detail to go in the right direction.

Truthfully, we (the on-site developers) shouldn't have been deciding what they were going to work on in the first place. It was a tough project and we were all learning. None of us felt like we had the time to do both the development required of us, plus the analysis required to give the off-site team good direction. Instead, the off-site team got minimal direction and delivered code that was of minimal use to us. It was beautiful code, but it wasn't what we needed. Again, epic fail.

Even though we clearly were not functioning well, the business expected that the off-site team deliver at the same pace as the on-site team. The business had met and loved Badri. No one could understand why our pace had suddenly dropped off. Not good. More on this later.

Take 2
A few years later I joined a project that was just beginning to give distributed development a shot. I hadn't been there 3 days and I was already in the CEO's office explaining why I thought it was a bad plan. Of course, I was gun-shy, so that was to be expected. Luckily, Fred George was also there and he was more optimistic. In the end we went to an on-site team, but Fred showed me that distributed development is possible.... maybe.... eventually.

We experienced several obstacles, and we overcame some. Again, there were plenty of lessons to be learned.

Time difference was less of a factor. This time I was working in London and the off-site team was in India. Our workdays overlapped fairly well. This was obviously a huge move in the right direction. We also collaborated on who would work on what, which helped ensure that everyone was happy with what they were working on. And, the analysis came from a BA that collaborated with both portions of the team. The BA even traveled to India occasionally.

However, things were still quite broken.

Some members of the team had never met other members of the team. There's something about working, collaborating with someone in person that is almost impossible to replicate over a phone or IM conversation. It wasn't that we didn't want to get everyone together, but there were visa issues that couldn't be overcome. It was a big mistake with no good resolution. Some long time team members couldn't travel, but if we replaced them with people who could travel we would lose domain knowledge and context.

Communication was a constant problem. The telephones on both ends presented problems. I have plenty of Indian friends that I have no problem understanding, but with the off-site team speaking into a bad connection and it being sent out of our bad connection, I caught every other sentence, if that. To make matters worse, people weren't talking nearly as often as they needed to be. We tried to address this by creating a chat room and mandating daily checkpoint phone calls. Both ideas were abandoned due to lack of buy-in. In the end we never did solve the communication issue -- in my opinion. I think the reality is that most programmers are introverted, and being off-site just gives you one more excuse for not talking to the business, even when you really should.

Connectivity was also an issue. If you're in the US, you don't really even think about the reliability of your internet connection. However, in many other places in the world connectivity is much less certain. There were several occasions where the off-site team was simply unable to check in their code. We never did figure out if the problem was on their side, our side, in our VPN, or some other bizarre location. If your off-site team can't check in, IM you, or get to the wiki, you obviously lose productivity.

There was one common problem: the business expected the same level of productivity from both teams. I'm not sure there's a way around this issue. Have you ever heard that programming is about people? Of course you have, and you obviously believe it. But, does the business know that? Even if they've heard it, do they believe it?

I doubt your business does. Frankly, I doubt most of your colleagues in the industry do. I still know far too many programmers who think that their only responsibility is to write code that creates features. If that were true, if we were nothing more than line workers delivering feature after feature, it would be plausible that productivity shouldn't suffer no matter where the factory exists.

But, it's not true. Being a software developer is about communication, collaboration, analysis and coding, at least. Being a phone call away instead of a face to face conversation away impacts communication and collaboration. Thus, productivity is impacted. There's no getting away from that.

How can you convince your business of that? I haven't solved that problem yet, unfortunately.

What could be
I do think Fred had the right ideas. He described scenarios that had previously benefited from distributed development, and what made those situations succeed.

The first suggestion is to get everyone together. You want the team to gel as one entity. Do version one entirely on-site if possible. However, don't do it with all local resources. Bring people from the desired off-site location to work on-site for the first release. The team members will build trust and friendships that last up to 6 months.

Once you are on version two you can move the desired team members back off-site. However, the travel isn't over for anyone. The off-site team should always have at least one member from the on-site team with them; likewise, the on-site team should always have at least one member of the off-site team present. These aren't week-long trips either. Each member of the team should visit the other location for a month, once every five to six months. That level of in-person communication should lead to high levels of trust and understanding.

The travel situation is even more drastic for the analysts and stakeholders. They should split their time between the two teams, if possible. Neither team should feel like the A team or the more important team. Any implication that one team is above the other team will lead to negative productivity impacts.

That might sound drastic, but it's the price of doing off-site development. The cost doesn't stop there.

Both team locations are going to need to invest heavily in infrastructure. The best video conferencing software and highly reliable bandwidth will also need to be purchased. The idea is to foster communication in every way possible. Without communication and collaboration, the project is doomed.

Open Source
Before anyone points out that it's hard to argue with the success of Open Source, I'd like to be clear -- I'm not. Open Source is obviously successful and developed most often in a distributed manner. However, there are a few differences that, I believe, make it a different situation entirely.

First of all, most people aren't paid to work on Open Source. When someone isn't paying you, you can often do whatever you feel like, whenever you feel like it. If someone is working in the same area of the code as you need to, you can just put off your changes until they are done.

However, the reality is that most people aren't usually working in common areas when they work on open source. Most open source projects are maintained by a few people who work within specific portions of the codebase. If changes need to happen in "your" portion of the codebase, you often queue them up to work on after you finish your current task.

Since there's little conflict between what you work on and what other team members work on there's significantly less communication and collaboration required, and what is necessary can happen at a much slower pace. If you need to make a change, it doesn't often need to happen right away. It can be put off until the team member on the other side of the world wakes up.

The codebase also evolves at a much slower pace. Six to ten people working in the same codebase 8 hours a day move much faster than the average Open Source project that sees 3 developers working a few hours a day.

Distributed development does work for Open Source, but that's not what I'm talking about.

Conclusion
I have heard of companies successfully doing remote pair programming and distributed development. One recipe I've heard is that everyone is off-site in different locations. I can see how that would work since it requires everyone to adopt a new work routine and make the best of it. While I believe it's possible to be successful, I think it's still bleeding edge at this point. You probably don't want to "try this at home" quite yet.

I think Distributed Development is probably the way of the future. As bandwidth and experience become more available the industry will continue to evolve in that direction. However, I think it's still probably about 5-10 years from being mainstream.

Friday, September 12, 2008

Refactoring: Ruby Edition available on Safari

Refactoring: Ruby Edition is now available on Safari as a Rough Cut.

Monday, September 08, 2008

Domain Specific Languages don't follow the Principle of Least Surprise

Ola Bini gets it right, as usual, in Evil Hook Methods?, but I think you can actually take the idea a bit further.

DataMapper allows you to gain its methods simply by an include statement.

class Category
  include DataMapper::Resource
  # ...
end

Ola points out that the include method should not add class methods. At least, that's not what it was designed to do. Include should (by way of append_features) add the constants, methods, and module variables of this module to the class or module that called include.
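
To make that concrete, here is a minimal sketch of what a plain include does and does not do. The Greeter module and Person class are hypothetical names chosen for illustration; they are not part of DataMapper.

# Hypothetical module, for illustration only (not part of DataMapper).
module Greeter
  DEFAULT_GREETING = "hello"

  # Becomes an instance method of any class that includes Greeter.
  def greet
    "#{DEFAULT_GREETING} from an instance"
  end

  # Module-level method: include does not copy this onto the including class.
  def self.describe
    "module-level method"
  end
end

class Person
  include Greeter
end

Person.new.greet               # => "hello from an instance"
Person::DEFAULT_GREETING       # => "hello"
Person.respond_to?(:describe)  # => false
Person.respond_to?(:greet)     # => false

The instance method and the constant arrive through include, but nothing shows up as a class method on Person. Getting class methods requires something extra, which is exactly where the debate starts.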

The problem for me is: Should DataMapper add its methods to your class when you use Ruby methods as they were originally intended, or when you use DataMapper's Domain Specific Language (DSL)?

If DataMapper is a framework and should be used traditionally then you should add its methods in the following way.

class Category
  include DataMapper::Resource
  extend DataMapper::Resource::ClassMethods
  # ...
end

However, you can't blame DataMapper for following the pattern that's been around since long before DataMapper. At this point I would consider the trick to definitely be an idiom even if it is an anti-pattern. The reality is that include has been stolen by those that prefer the simplest possible Domain Specific Language for adding their behavior.
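
For anyone who hasn't seen the trick, it is usually written with the Module.included hook. The sketch below uses a hypothetical Trackable module and Order class; it is an illustration of the general idiom, not DataMapper's actual source.

# Hypothetical module showing the "include also extends" idiom.
module Trackable
  # Ruby calls this hook whenever the module is included somewhere.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def tracked?
      true
    end
  end

  def track
    "tracked instance"
  end
end

class Order
  include Trackable  # one line, but both kinds of methods arrive
end

Order.tracked?   # => true
Order.new.track  # => "tracked instance"

The single include line reads nicely, but a reader who only knows what include is documented to do has no reason to expect Order.tracked? to exist.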

Martin Fowler describes how a framework can have a traditional API as well as a thin veneer that allows you to use the framework in a more fluent way.

Unfortunately, in the Ruby world we've designed our veneer in a way that doesn't allow for traditional usage.

The other day I noticed something that I thought was equally interesting in the Java world. I was working on a test that used JMock and IntelliJ formatted my code as shown below.

 1 class PublisherTest extends TestCase {
 2     Mockery mockery = new Mockery();
 3
 4     public void testNamesAreAnnoying() {
 5         final Subscriber subscriber = mockery.mock(Subscriber.class);
 6
 7         mockery.checking(new Expectations() {
 8             {
 9                 one (subscriber).receive(message);
10             }
11         });
12
13         // ...
14     }
15 }

Unimpressed by lines 8 and 10, I changed the code to look like the following snippet.

class PublisherTest extends TestCase {
    Mockery mockery = new Mockery();

    public void testNamesAreAnnoying() {
        final Subscriber subscriber = mockery.mock(Subscriber.class);

        mockery.checking(new Expectations() {{
            one (subscriber).receive(message);
        }});

        // ...
    }
}

Mike Ward said I shouldn't do that because the IntelliJ formatting properly shows an initializer for an anonymous class. Which is absolutely correct, but I don't want an anonymous class with an initializer, I want to use JMock's DSL for defining expectations. And, while the second version might not highlight how those expectations are set, that's not what I care about.

When I write the code I want to create expectations in the easiest way possible, and when I read the code I want the fact that they are expectations to be obvious. I don't think removing lines 8 and 10 reduces readability; in fact, it may improve it. Truthfully, I don't care what tricks JMock uses to define its DSL (okay, within reason), I only care that the result is the most readable option possible.

Back to DataMapper, I believe there's a superior option that allows them to have both a clean DSL and a traditional API. The following code would allow you to add methods as Ola desires (traditionally) and it would allow you to get everything with one method invocation for those that prefer DSL syntax.

class Object
  def data_mapper_resource
    include DataMapper::Resource
    extend DataMapper::Resource::ClassMethods
  end
end

class Category
  include DataMapper::Resource
  extend DataMapper::Resource::ClassMethods
  # ...
end

class Entry
  data_mapper_resource
  # ...
end

The obvious drawback is if everyone starts adding methods to Object we may start to see method collision madness. Of course, if the method names are given decent names it shouldn't be an issue. It's not likely that someone else is going to want to define a method named data_mapper_resource.

Don't worry. For those of you who prefer complexity "just in case", I have a solution for you also.

module DataMapper; end
module DataMapper::Resource
  def self.instance_behaviors
    DataMapper::Resource::InstanceMethods
  end

  def self.class_behaviors
    DataMapper::Resource::ClassMethods
  end
end
module DataMapper::Resource::InstanceMethods
  def instance_method
    "instance method"
  end
end
module DataMapper::Resource::ClassMethods
  def class_method
    "class method"
  end
end

class Object
  def become(mod)
    include mod.instance_behaviors
    extend mod.class_behaviors
  end
end

class Category
  include DataMapper::Resource::InstanceMethods
  extend DataMapper::Resource::ClassMethods
  # ...
end

class Entry
  become DataMapper::Resource
end

Entry.class_method # => "class method"
Entry.new.instance_method # => "instance method"
Category.class_method # => "class method"
Category.new.instance_method # => "instance method"

Thursday, September 04, 2008

Ruby: Recording Method Calls and Playback With Inject

Sometimes you want to call methods on an object, but you want to delay the actual execution of those methods till a later time.

For example, in expectations you create a mock at parse time, but you actually want the mock to be available at execution time.

class SystemProcess
  def start
    puts "started"
    StartedProcess.new
  end
end

Expectations do
  expect SystemProcess.new.to.receive(:start) do |process|
    process.start
  end
end

In the above code you define what you expect when the file is parsed, but you actually want the process expectation to be set when the do block is executed. This can be (and is) achieved by using a recorder to record all the method calls on the process object. At execution time the method calls are played back and the initialized process object is yielded to the block.

The code for a recorder is actually quite trivial in Ruby.

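class Recorder
  attr_reader :subject

  def initialize(subject)
    @subject = subject
  end

  # Play the recorded calls back in order, sending each call to the
  # result of the previous one (starting with the subject).
  def replay
    method_stack.inject(subject) { |result, element| result.send element.first, *element.last }
  end

  def method_stack
    @method_stack ||= []
  end

  # Record any unknown method call and return self so calls can be chained.
  def method_missing(sym, *args)
    method_stack << [sym, args]
    self
  end
end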

Here's an example usage of a recorder.

class SystemProcess
  def start(in_seconds)
    puts "starting in #{in_seconds}"
    sleep in_seconds
    StartedProcess.new
  end
end

class StartedProcess
  def pause(in_seconds)
    puts "pausing in #{in_seconds}"
    sleep in_seconds
    PausedProcess.new
  end
end

class PausedProcess
  def stop(in_seconds)
    puts "stopping in #{in_seconds}"
    sleep in_seconds
    self
  end
end

Recorder.new(SystemProcess.new).start(1).pause(2).stop(3).replay
# >> starting in 1
# >> pausing in 2
# >> stopping in 3

The only thing worth noting is that by using inject you can use a method chain that returns different objects. Traditional versions of a recorder that I've seen often assume that all the methods should be called on the subject. I prefer the version that allows for object creation within the fluent interface. In practice, that's exactly what was needed for recording and playing back Mocha's expectation setting methods.
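
To illustrate the difference, here is a small sketch contrasting the two playback strategies. The Draft and Published classes and the recorded calls are hypothetical, invented for this example; the calls array uses the same [symbol, args] shape as the method_stack above.

# Hypothetical classes: each call returns a different kind of object.
class Draft
  def publish
    Published.new
  end
end

class Published
  def archive
    "archived"
  end
end

calls = [[:publish, []], [:archive, []]]

# Replaying every call on the original subject breaks as soon as the
# chain moves on to another object: Draft has no archive method.
subject_only =
  begin
    subject = Draft.new
    calls.map { |name, args| subject.send(name, *args) }
  rescue NoMethodError => e
    e.class
  end

# inject threads each return value into the next call, so the chain works.
chained = calls.inject(Draft.new) { |result, (name, args)| result.send(name, *args) }

subject_only # => NoMethodError
chained      # => "archived"

The subject-only version is fine when every recorded method lives on the same object, which is probably why it is the version you see most often.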

Tuesday, September 02, 2008

Passionate, Not Dogmatic

Ted Neward recently wrote a blog entry that began with the following text:
... the debates have begun, with all the carefully-weighed logic, respectful discourse, and reasoned analysis that we've come to expect and enjoy from this industry.

Yeah, right.
Ted's comment is funny because it's true. Ted's comment is also disappointing... because it's true.

In the past 3.5 years I had the opportunity to interact with some of the smartest people in our industry. I consider many of those smart people to be among the best software developers in the world. Unfortunately, some of the smart people I met weren't much more than assholes. The big difference I noticed between the two groups was this: the assholes were dogmatic, while the best developers were passionate.
passionate: expressing, showing, or marked by intense or strong feeling
dogmatic: asserting opinions in a doctrinaire or arrogant manner
--dictionary.com
The difference between passionate and dogmatic is slim, but the result is dramatic.

Martin Fowler is a great example of someone who is passionate without being dogmatic. For example, Martin is a classicist; he prefers state based testing. However, in Mocks Aren't Stubs Martin examines both points of view and makes no assertion on which way is absolutely correct. That's a tough thing to do, but the result is a classic article that both mockists and classicists often refer to. That's just one example, but almost every article by Martin provides at least 2 points of view. The result is extremely valuable.

I used to be dogmatic. I have no problem admitting it. My earlier writing is clearly arrogant and often shortsighted. Part of the problem was lack of experience. When you take an immature industry and give a platform to someone with limited heuristics you are bound to receive solutions with limited applicability.

As I gained more experience I realized that what I considered to be best practices were only best practices within certain contexts. I also realized that presenting something as the "one true way" only benefited those that worked within exactly the same context that I worked. People who follow my advice when it doesn't apply to their context must fail. The advice isn't flawed, but it is incomplete. You need to see the full picture.

Two interesting things happen when you write passionate entries instead of dogmatic entries.
  • Your advice is more widely and appropriately used.
  • Your traffic goes down significantly.
On a recent podcast Joel Spolsky noted that most advice needs contextual information. Unfortunately, contextual information implies that the advice isn't universally applicable. While that's great for a small subset of professionals who are interested in best practices and improvement, the vast majority of people in our industry are still in search of silver bullets. The easiest way to ensure that your advice is missed by the majority of the industry is to spend the first 2 paragraphs of the entry describing the context.

Conversely, our industry loves dogmatic advice. For example, DRY is blindly, dogmatically followed. I'm not a fan of blindly following DRY, so I wrote about the value of duplication within tests. I attempted to give a counter point of view with contextual examples, but at the end of the day the entry got a large amount of traffic solely from the authoritative title.

It's easy to spot the difference between a dogmatic entry and a passionate entry. The dogmatic entry focuses on the best practice alone. However, a passionate entry gives equal weight to context and the best practice. Passionate entries are much more likely to see successful application, even if they don't make the top of reddit.

Preferring passionate to dogmatic entries is ultimately good, but you suffer in the short term. A career in software development is obviously a long term play, but you can't always blame someone for looking for short term gains. Of course, Martin Fowler is an example of the success you can achieve by sticking with passion over dogmatism.

There is one large upside to being passionate instead of dogmatic: You gain significantly more opportunities to learn. I can't count the number of times in the last year that I've said "I prefer the way I'm suggesting because I know it works, but let's do it your way and see if it's superior". (credit: I'm fairly sure I stole that phrase from George Malamidis while we worked together at TrafficBroker) Sometimes I was right, sometimes I was wrong, but I always learned something by trying a new approach.

That upside is what I believe truly separates the best in our industry from the assholes. The passionate leaders are constantly learning the best ways to do things, while the dogmatic leaders have stopped evolving their approach. As I said before, both groups are smart, but the dogmatic developers can only get by on their wisdom for so long. Eventually all dogmatic leaders become irrelevant.