- Therac-25
The Therac-25 was a radiation therapy machine produced by Atomic Energy of Canada Limited (AECL) and CGR MeV of France after the Therac-6 and Therac-20 units. It was involved in at least six accidents between 1985 and 1987, in which patients were given massive overdoses of radiation, approximately 100 times the intended dose. [Baase 2008, p.425.] Three of the six patients died. These accidents highlighted the dangers of software control of safety-critical systems, and they have become a standard case study in health informatics.
Problem description
The machine offered two modes of radiation therapy:
* Direct electron-beam therapy, which delivered low doses of high-energy (5 MeV to 25 MeV) electrons over short periods of time;
* Megavolt X-ray therapy, which delivered X-rays produced by colliding high-energy (25 MeV) electrons into a "target".
When operating in direct electron-beam therapy mode, a low-powered electron beam was emitted directly from the machine, then spread to a safe concentration using scanning magnets. When operating in megavolt X-ray mode, the machine was designed to rotate four components into the path of the electron beam: a target, which converted the electron beam into X-rays; a flattening filter, which spread the beam out over a larger area; a set of movable blocks (also called a collimator), which shaped the X-ray beam; and an X-ray ion chamber, which measured the strength of the beam.
The accidents occurred when the high-power electron beam was activated instead of the intended low-power beam, and without the beam spreader plate rotated into place. The machine's software did not detect that this had occurred, and therefore did not prevent the patient from receiving a potentially lethal dose of radiation. The high-powered electron beam struck the patients with approximately 100 times the intended dose of radiation, causing a feeling described by patient Ray Cox as "an intense electric shock". It caused him to scream and run out of the treatment room. [Set Phasers On Stun - Design and Human Error, Steven Casey, pp. 11-16.] Several days later, radiation burns appeared and the patients showed the symptoms of radiation poisoning. In three cases, the injured patients later died from radiation poisoning.
The software flaw is recognized as a race condition.
Root causes
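As an illustration only, the following Python sketch shows how a race condition of this general kind can leave a machine in an inconsistent state. It is not the Therac-25 software: the `Machine` class, its fields, and every function name are hypothetical, and the two concurrent tasks are replaced by a fixed, hand-written interleaving so that the outcome is reproducible. The assumption is that one task configures the beam in two steps, each reading a shared mode setting, while an operator can change that setting in between.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # All fields are hypothetical stand-ins, not the real machine's variables.
    requested_mode: str = "electron"  # mode currently shown on the operator's screen
    beam_power: str = "low"           # power the beam hardware has been set to
    target_in_place: bool = False     # True when the X-ray target is in the beam path


def set_beam_power(m: Machine) -> None:
    # Equipment task, step 1: choose beam power from the requested mode.
    m.beam_power = "high" if m.requested_mode == "xray" else "low"


def position_turntable(m: Machine) -> None:
    # Equipment task, step 2: re-reads the mode, which may have changed since step 1.
    m.target_in_place = m.requested_mode == "xray"


def is_hazardous(m: Machine) -> bool:
    # A high-power beam with no target in its path is the dangerous combination.
    return m.beam_power == "high" and not m.target_in_place


def racy_setup() -> Machine:
    m = Machine()
    m.requested_mode = "xray"      # operator selects X-ray by mistake
    set_beam_power(m)              # equipment task acts on "xray"
    m.requested_mode = "electron"  # operator corrects the entry between the two steps
    position_turntable(m)          # equipment task now acts on "electron"
    return m


def consistent_setup() -> Machine:
    m = Machine()
    m.requested_mode = "xray"
    m.requested_mode = "electron"  # the edit completes before configuration starts
    set_beam_power(m)
    position_turntable(m)
    return m


print(is_hazardous(racy_setup()))        # True
print(is_hazardous(consistent_setup()))  # False
```

In the first sequence the two configuration steps see different values of the shared setting, so the beam is left at high power with no target in place. In the second, both steps see the same value. Real systems usually enforce this with a lock or by configuring from a single snapshot of the operator's input, and back it up with an independent hardware interlock.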
Researchers who investigated the accidents found several contributing causes. These included the following "institutional" causes:
*AECL did not have the software code independently reviewed.
*AECL did not consider the design of the software during its assessment of how the machine might produce the desired results and what failure modes existed. These form parts of the general techniques known as reliability modeling and risk management.
*The system noticed that something was wrong and halted the X-ray beam, but merely displayed the word "MALFUNCTION" followed by a number from 1 to 64. The user manual did not explain or even address the error codes, so the operator pressed the P key to override the warning and proceed anyway.
*AECL personnel, as well as machine operators, initially did not believe complaints. This was likely due to overconfidence. [Baase 2008, p.428.]
*AECL had never tested the Therac-25 with the combination of software and hardware until it was assembled at the hospital.
*In one of the accidents, both the intercom and video monitor were broken. The operator could not tell that the patient was in trouble until the patient desperately pounded on the door. [Baase 2008, p.429.]
The researchers also found several "engineering" issues:
*The failure only occurred when a particular nonstandard sequence of keystrokes was entered on the VT-100 terminal which controlled the PDP-11 computer: an "X" to (erroneously) select 25 MV photon mode followed by "cursor up", "E" to (correctly) select 25 MeV Electron mode, then "Enter". This sequence of keystrokes was improbable, and so the problem did not occur very often and went unnoticed for a long time. [Set Phasers On Stun - Design and Human Error, Steven Casey, pp. 11-16.]
*The design did not have any hardware interlocks to prevent the electron-beam from operating in its high-energy mode without the target in place.
*The engineer had reused software from older models. These models had hardware interlocks that masked their software defects. Those hardware safeties had no way of reporting that they had been triggered, so there was no indication of the existence of faulty software commands.
*The hardware provided no way for the software to verify that sensors were working correctly (see "open-loop controller"). The table-position system was the first implicated in Therac-25's failures; the manufacturer revised it with redundant switches to cross-check their operation.
*The equipment control task did not properly synchronize with the operator interface task, so that race conditions occurred if the operator changed the setup too quickly. This was evidently missed during testing, since it took some practice before operators were able to work quickly enough for the problem to occur.
*The software set a flag variable by incrementing it. Occasionally an arithmetic overflow occurred, causing the software to bypass safety checks.
See also
*Software bug
*Race condition
*Nuclear and radiation accidents
References
*Baase, S (2008). "A Gift of Fire", Pearson Prentice Hall.
External links
* [http://sunnyday.mit.edu/papers/therac.pdf The Therac-25 Accidents (PDF)], by Nancy Leveson (the updated version of the IEEE Computer article mentioned below)
* [http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html An Investigation of the Therac-25 Accidents (IEEE Computer)]
* [http://neptune.netcomp.monash.edu.au/cpe9001/assets/readings/www_uguelph_ca_~tgallagh_~tgallagh.html Short summary of the Therac-25 Accidents]
Wikimedia Foundation. 2010.