15 May 2015
A bug was recently discovered in an update we made to a customer’s software system, which led to a discussion about (a) why the bug wasn’t caught during testing and (b) why we charge them to fix the bug.
Given our business model, this is a discussion that occurs from time to time, and so I decided to capture our viewpoint in the form of an article I can refer to as needed.
For this particular customer, we maintain a complex software system—grown over a period of nearly 10 years—that drives a network of related websites. If you were to write down every functional behavior supported by the system, the list could easily grow to several thousand. And if you consider functional dependencies—behaviors that are related to other behaviors—the list could grow to hundreds of thousands.
These numbers are back-of-the-envelope estimates, but one thing is clear: for a system like this, in which it would be prohibitively expensive to test every function, “testing” (whether it’s the coding of automated tests or the execution of human-conducted tests) becomes a question of cost versus risk. Each time we develop a new feature or modify an existing one, we have to decide on the scope and amount of testing to conduct, aiming for a reasonable balance between the cost of testing and the consequences of missing a bug.
While NASA or an airplane manufacturer might spend 99% of their overall budget on testing—since bugs can result in the loss of life—we, on the other hand, might only allocate 5% of a particular update’s budget to testing, if the consequence of a related bug would likely be a “page not found” in the archive section of a website.
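As a rough illustration of the cost-versus-risk decision described above, here is a minimal sketch in Python. Everything in it is a made-up assumption for illustration: the plan names, the hours, and the slip-through probabilities are not figures from any real project, and the simple "testing cost plus probability times consequence" model is just one way to frame the trade-off.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """A hypothetical testing plan for one update (all numbers are illustrative)."""
    name: str
    testing_hours: float      # effort spent on testing
    slip_probability: float   # estimated chance that a bug still gets through


def expected_total_cost(plan: Plan, bug_cost_hours: float) -> float:
    """Testing effort plus the expected cost of a bug that slips through."""
    return plan.testing_hours + plan.slip_probability * bug_cost_hours


def cheapest_plan(plans: list[Plan], bug_cost_hours: float) -> Plan:
    """Pick the plan with the lowest expected total cost."""
    return min(plans, key=lambda p: expected_total_cost(p, bug_cost_hours))


# Assumed options, from a quick check to near-exhaustive testing.
PLANS = [
    Plan("light", testing_hours=2, slip_probability=0.20),
    Plan("moderate", testing_hours=8, slip_probability=0.05),
    Plan("exhaustive", testing_hours=40, slip_probability=0.01),
]

if __name__ == "__main__":
    # The consequence of a bug is expressed in hours of damage and repair.
    for bug_cost in (10, 100, 1000):
        best = cheapest_plan(PLANS, bug_cost)
        print(f"bug cost {bug_cost:>4}h -> {best.name}")
```

With these assumed numbers, a low-consequence bug (such as a “page not found” in an archive) favors the light plan, while a very costly bug justifies far more testing. The point is the shape of the decision, not the specific values.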
It doesn’t take long for a software system to become so complex that testing all possible functionality becomes prohibitively expensive or time-consuming. This leads to a situation in which we have to accept that a delivered system may have bugs.
In fixed-price software projects, there is usually a warranty period during which any discovered bugs are corrected at no additional cost. Providing such a warranty, however, represents a risk to the developer, which has to be compensated as a component of the fixed price they charge.
Several years ago, for a number of reasons, we at Makalu stopped working on a fixed-price basis and began charging for our time by the hour. Since we accept that all non-trivial software systems may have bugs, and since we don’t add a premium to our hourly rate to cover a warranty period, it’s natural and justified that we charge for the time needed to correct bugs that are discovered.
There are situations, however, in which we don’t charge customers for fixing things: when a bug or operational problem exposes fundamental mistakes we’ve made in the underlying architecture or engineering of the system. Such mistakes are usually quite apparent to us, even though the customer rarely perceives them. In these cases we correct the problem without charge since, in our view, the customer should reasonably expect that the system they paid for is adequately engineered.
As you can imagine, these kinds of corrections can be the most costly. Fortunately, we don’t make engineering mistakes that often, and we’ve observed that the incidence rate decreases over time with experience.