Tuesday, June 28, 2016

Curious Customer

I currently work on a pretty small team, 4 devs (including myself). We have no one dedicated strictly to QA. A few years ago we ran into a few unexpected issues with our software. I hesitate to call them bugs, because they only appeared when you did things that made little sense. We write internal-only software, thus we expect a minimum level of competency from our users. In addition, it's tempting to justify ignoring problematic nonsensical behavior in the name of not having to write and maintain additional software.

But, when I wasn't in denial, I was willing to admit that these were in fact bugs and they were costing us time.

The problems caused by these bugs were small, e.g. a burst of worthless emails, a blip in data flowing to the application. The emails could be quickly deleted, and the application was eventually consistent. Thus I pretended as though these issues were of low importance, and that the pain was low for both myself and our customers. I imagine that sounds foolish; in retrospect, it was foolish. The cost of developer context switching is often very high, higher if it's done as an interrupt. Introducing noise into your error reporting devalues your error reporting. Users can't as easily differentiate between good data, a blip of bad data due to something they did, and actual bad data, thus they begin to distrust all of the data.

The cost of these bugs created by nonsensical behavior is high, much higher than the cost of writing and maintaining the software that eliminated these bugs.

Once we eliminated these bugs, I spent notably more time happily focused on writing software. For me, delivering features is satisfying; conversely, tracking down issues stemming from nonsensical behavior always feels like a painfully inefficient task. I became very intent on avoiding that inefficiency in the future. The team brainstormed on how to address this behavior, and honestly we came up with very little. We already write unit tests, load tests, and integration tests. Between all of our tests, we catch the majority of our bugs before they hit production. However, this was a different type of bug, created by behavior a developer often wouldn't think of, thus a developer wasn't very likely to write a test that would catch this issue.

I proposed an idea I wasn't very fond of, the Curious Customer (CC): upon delivery of any feature you could ask another developer on the team to use the feature in the staging environment, acting as a user curiously toying with all aspects of the feature.

Over a year later, I'm not sure it's such a bad idea. In that year we've delivered several features, and (prior to release) I've found several bugs while playing the part of CC. I can't remember a single one of them that would have led to a notable problem in production; however, all of them would have led to at least one support call, and possibly a bit less trust in our software.

My initial thought was: asking developers to context switch to QAing some software they didn't write couldn't possibly work, could it? Would they give it the necessary effort, or would they half-ass the task and get back to coding?

For fear of half-assed, and thus wasted, effort, I tried to define the CC's responsibilities very narrowly. CC was an option, not a requirement; if you delivered a feature you could request a CC, but you could also go to production without a CC. A CC was responsible for understanding the domain requirements, not the technical requirements. It's the developer's responsibility to get the software to staging; the CC should be able to open staging and get straight to work. If the CC managed to crash or otherwise corrupt staging, it was the developer's responsibility to get things back to a good state. The CC doesn't have an official form or process for providing feedback; the CC may choose email, chat, or any mechanism they prefer for providing feedback.

That's the idea, more or less. I've been surprised and pleased at the positive impact CC has had. It's not life changing, but it does reduce the number of support calls and the waste associated with tracking down largely benign bugs, at least on our team.

You might ask how this differs from QA. At its core, I'm not sure it does in any notable way. That said, I believe traditional QA differs in a few interesting ways. Traditional QA is often done by someone whose job is exclusively QA. With that in mind, I suppose we could follow the "devops" pattern and call this something like "devqa", but that doesn't exactly roll off the tongue. Traditional QA is also often a required task: every feature and/or build requires QA sign-off. Finally, the better QA engineers I've worked with write automated tests that continually run to prevent regression; a CC may write a script or two for a single given task, but those scripts are not expected to be valuable to any other team member now, or to anyone (including the author) at any point in the future.
