Why Testing in Production isn’t as stupid as it sounds

One of my colleagues this week was consolidating the results from our recent Application Performance Management survey, and one interesting finding was that 40% of customers have at least one release cycle a month. Out of those respondents, one third experience a Severity-1 incident each month as well. That’s a pretty compelling pair of statistics, and they might explain the continued frustration and conflict between development and operations teams. It’s also perhaps the reason why this DevOps underground movement can no longer be ignored (even by Gartner). There is no doubt development organizations have become agile, but does deploying this frequent change make the business more or less agile? For example, if one in three releases creates a Severity-1 incident, then surely agile development becomes a risk to the business. We’re at the point where Operations either has to start managing change better or simply restrict the amount of change that can occur.

So why are Sev 1 incidents so common? Based on my experiences and customer interaction, I’d strongly argue that testing in development isn’t enough. At the very least, it’s certainly not an insurance policy for deploying an application in production. When a Formula 1 team designs a car in a wind tunnel and tests it on a simulator pre-season, they don’t assume that the performance they see in test will mirror the results they see in a race. Yet, that is pretty much what happens today in the application development lifecycle. Development teams build and test their apps in pre-production before handing it off to operations for deployment in production, and they assume everything will work just fine. This is probably the worst assumption IT has made over the last decade, because development and production environments differ significantly. It’s also a lame excuse for any development team to use when a production issue occurs: “Well it ran fine in test so you must have deployed it wrong.” Yes people make mistakes occasionally, but if one in every three releases has an issue, deployment error may not be the sole reason. If development never get to see how their baby runs in production, they’ll never learn how to build robust, scalable, and high-performance applications.