Myth of the nines
In information technology, the myth of the nines is the idea that standard measurements of high availability can be misleading. Availability is sometimes described in units of nines, as in "five nines", or 99.999%. A computer system with an availability of 99.999% is considered highly available, delivering its service to the user 99.999% of the time it is needed.
The myth explained
The myth of the nines is the implicit assumption that if the computer is operating 0.99999 of the time, then the user's business is operating 0.99999 of the time. In fact, this is often far from the truth. After an outage, the people using the computer have to scramble to catch up, perhaps apologising to customers, calling them back, entering data written on paper during the outage, and performing other unfamiliar chores. In the case of a drive failure, the server downtime might be small, but the time to restore from backup might be considerably longer. A computer outage of a minute might cause a business outage of hours.
A further assumption in this model is that ten outages of one minute each have the same effect on the user as one outage of ten minutes. Again, this is not usually true. If a system is experiencing repeated outages, the user is justified in believing that the system cannot be trusted. In this case, the user may regard the computer as a liability. The user may measure ten one-minute outages over a period of six months as a downtime of six months, while the computer's manufacturer measures it as a downtime of ten minutes. There is no way to calculate the number of outages over a given period from the uptime percentage alone.
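The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a 365-day year, a six-month measurement window, and hypothetical helper names (`allowed_downtime_minutes`, `availability_from_outages`) that do not come from any standard tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget implied by an availability fraction over a period."""
    return (1.0 - availability) * period_minutes


def availability_from_outages(outage_minutes, period_minutes):
    """Availability as the fraction of the period not spent in an outage."""
    return 1.0 - sum(outage_minutes) / period_minutes


six_months = MINUTES_PER_YEAR / 2
ten_short = [1] * 10  # ten one-minute outages
one_long = [10]       # one ten-minute outage

# "Five nines" leaves a budget of a little over five minutes per year.
print(allowed_downtime_minutes(0.99999))

# Both outage histories yield the same availability figure,
# so the figure alone cannot reveal how many outages occurred.
print(availability_from_outages(ten_short, six_months))
print(availability_from_outages(one_long, six_months))
```

The two histories are indistinguishable by percentage, even though users are likely to experience them very differently.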
Also, the availability of a system is not simply that of its components, and failure probabilities are not always independent. For example, a system made up of five-nine components does not have five-nine availability: each component must be up for the system to be up, and so, even if failures are independent, the component uptimes must be multiplied to give the system uptime. A system of ten components (e.g., disks, motherboard, PSU, RAM, mains power, network...), each with 99.999% availability, has only approximately 99.99% overall availability.
In contrast, a system might also be made of redundant parts, such that failure events are independent. This has the effect of increasing uptime, because two or more components need to fail for the system to fail. The best example of this is RAID, where normally two or more drives need to fail for the array to fail. In a level 5 RAID, for instance, two drives need to fail; with disks of 99.999% availability, the chance that two given disks are down at the same moment is 0.00001 × 0.00001, which gives a RAID uptime of about 99.99999999%. A complicating factor in redundant systems is that a second failure causes an outage only if it happens before the first failed component has been replaced, so the result also depends on the repair time.
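Both calculations can be reproduced with a short Python sketch. It assumes independent failures throughout, models redundancy as a simple mirrored pair rather than a full RAID 5 array, and ignores the repair window; the function names are illustrative only.

```python
import math


def series_availability(availabilities):
    """Every component must be up, so the availabilities are multiplied."""
    return math.prod(availabilities)


def parallel_availability(availabilities):
    """The system is down only when all redundant components are down at once."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


# Ten five-nine components that must all work: roughly four nines overall.
ten_in_series = series_availability([0.99999] * 10)

# Two five-nine components, either of which is sufficient: roughly ten nines.
mirrored_pair = parallel_availability([0.99999, 0.99999])

print(f"{ten_in_series:.7%}")
print(f"{mirrored_pair:.10%}")
```

Real arrays with more disks have more pairs that can fail together, so the redundant figure here is an optimistic bound rather than a prediction.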
Lastly, in many cases, "scheduled maintenance" is not included in the availability calculation. So if the computer must be taken down to replace a failing disk, but the downtime is announced a week in advance, it "doesn't count".
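The effect of this exclusion can be illustrated with a small Python sketch. The figures (5 minutes of unplanned and 240 minutes of planned downtime in a year) are hypothetical, as is the function `reported_availability`; the sketch also keeps the full year as the measurement period in both cases, which is one of several possible conventions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # leap years ignored


def reported_availability(unplanned_min, planned_min,
                          period_min=MINUTES_PER_YEAR, count_planned=False):
    """Availability over the period; planned downtime counts only on request."""
    downtime = unplanned_min + (planned_min if count_planned else 0)
    return 1.0 - downtime / period_min


# Hypothetical year: 5 minutes of failures, 4 hours of announced maintenance.
excluding_planned = reported_availability(5, 240)
including_planned = reported_availability(5, 240, count_planned=True)

print(f"{excluding_planned:.5%}")  # the figure a vendor might report
print(f"{including_planned:.5%}")  # the figure the user actually experienced
```

Under these assumed numbers the system meets five nines on paper while falling short of four nines from the user's point of view.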
See also
* Reliability engineering
External links
* [http://www.jiploo.com/blog/99999-or-five-nines-uptime/ 99.999 or 5 Nines Uptime Article]
* [http://basicstate.com/htm/9999.htm uptime values chart for values down to 90 percent on a daily basis]
Wikimedia Foundation. 2010.