Ramblings from the Warrior's Den
Wednesday, August 18, 2004
Just how well is your software being tested anyway?
Part 1: It runs, ship it!
Recently at work, several people were discussing a new test bench being set up for some upcoming localization testing on the project I'm working on. Assuming I didn't get totally lost in the back-and-forth over exactly what sort of site hierarchy was needed for this effort, my role in all this is to set things up, but the number of servers needed, and exactly where they were supposed to come from, varied wildly throughout the conversation. At one point, the number of servers required had reached as high as eighteen, before cooler heads (and presumably the available budget) prevailed. Fortunately for me and my office mate, we managed to convince the test lead that this whole setup should probably be placed in a lab, rather than in my office.
This discussion, along with this recent post by Chris of the Software Test Engineering @ Microsoft blog, has led me to give some thought to the subject of exactly how well the software we put out is being tested. I'm probably going to do a series of posts on this subject, assuming I stop being too lazy to write stuff every once in a while.
If you're familiar with the process of software testing, you know that the testing effort that goes into a typical piece of commercial software is extensive. You also know that there is only so much testing that you can do. Chris touched on this subject in his post:
This is one of the key problems of testing big software like this. You can test and test and test, but it's a bit frustrating to know that even with all your work someone, somewhere will have a problem with your stuff. I had a great conversation with two developers at lunch today about this. Once your software stabilizes testing becomes a low-yield game (tangent: at this point one of the developers cheerfully said “not when you're testing my stuff!”) You test a lot of surface area, and don't find many problems. This is one of the hardest times for testing, the temptation is to say “yep, it's done!”
This is a key dark art for shipping successful software. Everyone wants to be done, but someone has to have a good idea of when done actually happens. Ship too soon and you've got a buggy product out there. Ship too late and diminishing returns has kicked in so your quality isn't much better, but you just lost a lot of money by not being out in the marketplace. Every time software ships someone, or hopefully a group of someone's, thought about these issues an made a trade-off.
While this is true, it's not always quite that simple. Sure, the omnipresent specter of the schedule looms over your head, and you don't want to be the one who has to explain to the VP in charge exactly why you just slipped the ship date by six months, but it's not just a matter of popping out of your office one day and yelling "Ship it!" down the hall either. As a software tester, for the most part you're going to have someone (probably your lead, based on a schedule prepared by a Program Manager) telling you what to do at any given time. The process of getting a product to market is very structured. While this ensures that the basic functionality and features of a given product get tested thoroughly, it also means that the testers spend quite a bit of time running through the same tests repeatedly. Depending on the way things are scheduled and the number of test cases (and how many of them are automated), some groups opt to run through a test pass on a weekly basis, especially as they approach a major milestone. A milestone generally requires zero active bugs, followed by a period of escrow in which no major bugs are found, before it can be considered complete.
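For what it's worth, the milestone rule described above (zero active bugs, then an escrow period with no major bugs) can be sketched in a few lines of Python. This is only an illustration and not any team's real tooling: the `DailyStatus` record, the `milestone_complete` function, the five-day default escrow length, and the rule that a major bug restarts the whole process are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class DailyStatus:
    active_bugs: int     # bugs still open at the end of the day
    new_major_bugs: int  # major bugs found during the day


def milestone_complete(history, escrow_days=5):
    """Return True once the team has hit zero active bugs and then
    gone `escrow_days` further days without finding a major bug.

    Assumption: a major bug found during escrow ends the escrow, and
    the team has to get back to zero active bugs before starting over.
    """
    in_escrow = False
    clean_days = 0
    for day in history:
        if not in_escrow:
            # Escrow can only begin on a day that ends with zero active bugs.
            in_escrow = day.active_bugs == 0
            clean_days = 0
        elif day.new_major_bugs > 0:
            # A major bug during escrow resets everything.
            in_escrow = False
            clean_days = 0
        else:
            clean_days += 1
    return in_escrow and clean_days >= escrow_days


history = [
    DailyStatus(active_bugs=3, new_major_bugs=1),
    DailyStatus(active_bugs=0, new_major_bugs=0),  # zero active bugs: escrow starts
    DailyStatus(active_bugs=0, new_major_bugs=0),
    DailyStatus(active_bugs=0, new_major_bugs=0),
]
print(milestone_complete(history, escrow_days=2))
```

The point of the sketch is simply that "done" for a milestone is a condition that has to hold over a stretch of time, not a single good day.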
Of course, along the way, bugs are going to be found. Lots of them. Actually, I should probably back up a second here, and point out that in terms of the software development process, "Bug" is something of a generic term. What most people would think of as a bug is generally referred to as a "code defect". The term "Bug" is used to refer to just about any issue with the product that needs attention, be it a code defect, a string that needs to be fixed, a doc issue, a DCR (Design Change Request) or any number of other issues that may crop up. Just because you've filed a bug against the product doesn't mean that it'll be fixed, either. Dev resources are limited (in fact, a lot of teams have fewer devs than testers, although I've heard it recommended that for ideal coverage a 1:1 ratio should be maintained). If you sent all the bugs directly to the devs to fix, you'd be more likely to end up with your dev team barricaded in their offices and surrounded by the SWAT team than you would be to end up with a quality product. Therefore, you need what's known as a triage to sort things out.
To a lot of testers, the triage is some sort of mysterious process where bugs go in to get chewed up and spit out, often with great force. To be honest, having only recently worked on a team where more than 3 or 4 people attend the triage, I actually think I might prefer that the process remained a bit more mysterious. Getting your bugs tossed back in your face with a big "Postponed" or "Won't Fix" stamp is one thing; having it happen while you're sitting there watching is another. I can now see why it is generally recommended that triage be left to the test leads and the PMs. Of course, even if you are going to have half the bugs you file thrown out or punted (especially as you're getting close to shipping), triage serves a vital purpose. In order to keep things on track, you need to make sure that the major issues get taken care of first, and also that they get to the appropriate people to take care of them. In the end, this is also going to leave a significant number of Won't Fix bugs in the product that didn't meet the criteria for being fixed. It is generally left to the tester's discretion whether or not to push back on these and try to get a fix. For the most part, these are going to be relatively minor issues anyway, and often have little effect on the end user experience.
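To take a little of the mystery out of it, here is a rough Python sketch of what a triage pass boils down to: look at the worst bugs first, decide which ones clear the bar, and route those to someone who can deal with them. Everything specific here is made up for illustration: the 1-to-4 severity scale, the `fix_bar` and `postpone_bar` thresholds, and the owner names in `OWNERS` are my assumptions, and a real triage involves a room full of people arguing rather than two integer comparisons.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    CODE_DEFECT = "code defect"
    STRING = "string"
    DOC = "doc issue"
    DCR = "design change request"


@dataclass
class Bug:
    title: str
    kind: Kind
    severity: int  # hypothetical scale: 1 = most severe, 4 = cosmetic


# Hypothetical routing table: who picks up each kind of bug.
OWNERS = {
    Kind.CODE_DEFECT: "dev",
    Kind.STRING: "loc",
    Kind.DOC: "docs",
    Kind.DCR: "pm",
}


def triage(bugs, fix_bar, postpone_bar):
    """Return (title, resolution, owner) tuples, most severe bug first.

    Bugs at or above the fix bar get assigned to an owner; the next
    tier is postponed; everything else is resolved as Won't Fix.
    """
    decisions = []
    for bug in sorted(bugs, key=lambda b: b.severity):
        if bug.severity <= fix_bar:
            decisions.append((bug.title, "Fix", OWNERS[bug.kind]))
        elif bug.severity <= postpone_bar:
            decisions.append((bug.title, "Postponed", None))
        else:
            decisions.append((bug.title, "Won't Fix", None))
    return decisions


bugs = [
    Bug("Typo in tooltip", Kind.STRING, 4),
    Bug("Crash on save", Kind.CODE_DEFECT, 1),
    Bug("Help topic out of date", Kind.DOC, 3),
]
for decision in triage(bugs, fix_bar=2, postpone_bar=3):
    print(decision)
```

Lowering `fix_bar` as the ship date approaches is the code version of the bar going up: the same bug that would have been fixed early in the cycle gets postponed or stamped Won't Fix later on.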
Next time: Some discussion about the environments in which software is tested.