failing like never before

25Jan/103

Intel FDIV Bug

A few years or so back, I put up a bunch of my high school and early college papers on this blog (they're under the "literature" category). Its a sad state of affairs when looking back my high school papers, that I realized my writing skills were significantly better back in high school. But anyways, heres a paper I wrote for my engineering ethics course. Its not my best work, and it certainly lacks the finish of my old high school stuff, but its passable.

To the Intel Corp. Board of Directors: A Post Mortem Report of the Pentium Flaw
Abstract

The floating point division flaw in the original Intel Pentium CPU, which resulted in some floating point division operations being calculated improperly, was a result of a few poor engineering decisions and while avoidable, was not condemnable. The subsequent decisions made by Intel executives, to keep the flaw hidden and then to downplay its importance, were however, morally flawed. While Intel executives adhered to a utilitarian ethical framework, they forgot to consider the impact their decisions would have on Intel’s public image. Had Intel executives followed a combination of rights and utilitarian ethics, where the rights of the customer are upheld while the company’s wellbeing is still valued, executives would have reached the correct decision, which was to offer a full “no questions asked” replacement policy at the very first discovery of the flaw.

The Pentium “FDIV Bug”

Given certain types of input data, the floating point division instructions on the original Intel Pentium CPU would generate slightly erroneous results. This result was dubbed by the public as the “FDIV Bug,” as one of the assembly language instructions affected by the bug was the FDIV instruction. Although Intel initially attempted to keep information regarding the flaw hidden, it eventually became public knowledge. The subsequent actions of Intel executives regarding their handling of the flaw were morally questionable and ultimately resulted in great damage being done to Intel’s public image. A different set of ethical frameworks would have allowed Intel executives to have reached the correct decision.
Using the basic Microsoft Windows calculator, a Pentium user could check for the presence of the flaw by performing the following calculation:

(4195835 * 3145727) / 3145727

The expected result of dividing a number by itself is one, so the equation above should yield a result of 4,195,835 but the flawed Pentium Floating Point Unit (FPU) produced a value of 4,195,579; an error of 0.006%. Not all calculations performed by the FDIV instruction on a Pentium CPU were incorrect however. The occurrence and degree of inaccuracy of the floating point division calculations were highly dependent on the input data and specific divide instruction used, and in most cases, the flaw was not apparent at all. According to Intel Corp., the flaw would only be encountered once every 27,000 years under normal use, although other groups have produced significantly different failure rates.
The “FDIV Bug” did not affect Intel CPUs predating the Pentium, as the flaw was a defect in a new algorithm that was intended to provide improved floating point performance over the Intel 486 (the predecessor to the Pentium). The Pentium used a new radix 4 SRT algorithm (named after its creators Sweeney, Robertson, and Tocher) in its floating point division operations, which required the use of a lookup table to improve calculation speed (Intel Corp. Section 4). This lookup table was generated prior to assembly and then loaded into a hardware Programmable Lookup Array (PLA) on the Pentium chip. However, the script which downloaded the lookup table into the PLAs had a bug in it that caused some lookup table entries to be omitted from the PLAs. Consequently, floating point division instructions that required the missing entries from the lookup table would produce erroneous values. This flaw has since been fixed and the “FDIV Bug” is no longer apparent in newer Intel CPUs.

The Pentium flaw should have been easily discoverable in early testing of the CPU, but there was also a mistake in Intel’s proofs for the Pentium FPU. Intel engineers attempted to simplify testing, and assumed that the sign (“+” or “-“) of a number doesn’t enter into division operations except in the last step. Thus, the proof for the Pentium only checked half of the PLA, and assumed (incorrectly) that the other half of the PLA was simply the mirror image of what was checked (Price P. 2). Unfortunately, the untested half of the PLA contained the missing entries. The two easily discoverable flaws, one in the PLA loading script and the other in the PLA proof, conspired to hide each other from Intel engineers so that the Pentium’s flaw was not discovered until after production of the CPU began.

Events Surrounding the Flaw

Intel Corp. discovered the flaw in the Pentium’s floating point unit through testing, in June of 1994 (after production of the chip), but chose to keep the information private instead of disclosing it to their customers (Markoff). Although Intel modified the design of the Pentium, the modified chips did not begin to reach the market until November of 1994, and the sales of flawed chips were not halted. Dr. Thomas R. Nicely of Lynchburg College also independently discovered the “FDIV Bug” in June of 1994 and attempted to bring it to the attention of Intel Corp. in October of that year, whereupon an Intel representative confirmed the existence of the flaw and then ceased to provide Dr. Nicely with any more information (Nicely). Nicely then proceeded to make the Pentium floating point unit’s flaw known to the public via e-mail, causing news of the Pentium flaw to spread quickly. Concerned Pentium owners who learned of the flaw were told by Intel that the flaw was inconsequential and that no replacement policy was being offered.