failing like never before

25Jan/103

Intel FDIV Bug

A few years or so back, I put up a bunch of my high school and early college papers on this blog (they're under the "literature" category). Its a sad state of affairs when looking back my high school papers, that I realized my writing skills were significantly better back in high school. But anyways, heres a paper I wrote for my engineering ethics course. Its not my best work, and it certainly lacks the finish of my old high school stuff, but its passable.

To the Intel Corp. Board of Directors: A Post Mortem Report of the Pentium Flaw
Abstract

The floating point division flaw in the original Intel Pentium CPU, which resulted in some floating point division operations being calculated improperly, was a result of a few poor engineering decisions and while avoidable, was not condemnable. The subsequent decisions made by Intel executives, to keep the flaw hidden and then to downplay its importance, were however, morally flawed. While Intel executives adhered to a utilitarian ethical framework, they forgot to consider the impact their decisions would have on Intel’s public image. Had Intel executives followed a combination of rights and utilitarian ethics, where the rights of the customer are upheld while the company’s wellbeing is still valued, executives would have reached the correct decision, which was to offer a full “no questions asked” replacement policy at the very first discovery of the flaw.

The Pentium “FDIV Bug”

Given certain types of input data, the floating point division instructions on the original Intel Pentium CPU would generate slightly erroneous results. This result was dubbed by the public as the “FDIV Bug,” as one of the assembly language instructions affected by the bug was the FDIV instruction. Although Intel initially attempted to keep information regarding the flaw hidden, it eventually became public knowledge. The subsequent actions of Intel executives regarding their handling of the flaw were morally questionable and ultimately resulted in great damage being done to Intel’s public image. A different set of ethical frameworks would have allowed Intel executives to have reached the correct decision.
Using the basic Microsoft Windows calculator, a Pentium user could check for the presence of the flaw by performing the following calculation:

(4195835 * 3145727) / 3145727

The expected result of dividing a number by itself is one, so the equation above should yield a result of 4,195,835 but the flawed Pentium Floating Point Unit (FPU) produced a value of 4,195,579; an error of 0.006%. Not all calculations performed by the FDIV instruction on a Pentium CPU were incorrect however. The occurrence and degree of inaccuracy of the floating point division calculations were highly dependent on the input data and specific divide instruction used, and in most cases, the flaw was not apparent at all. According to Intel Corp., the flaw would only be encountered once every 27,000 years under normal use, although other groups have produced significantly different failure rates.
The “FDIV Bug” did not affect Intel CPUs predating the Pentium, as the flaw was a defect in a new algorithm that was intended to provide improved floating point performance over the Intel 486 (the predecessor to the Pentium). The Pentium used a new radix 4 SRT algorithm (named after its creators Sweeney, Robertson, and Tocher) in its floating point division operations, which required the use of a lookup table to improve calculation speed (Intel Corp. Section 4). This lookup table was generated prior to assembly and then loaded into a hardware Programmable Lookup Array (PLA) on the Pentium chip. However, the script which downloaded the lookup table into the PLAs had a bug in it that caused some lookup table entries to be omitted from the PLAs. Consequently, floating point division instructions that required the missing entries from the lookup table would produce erroneous values. This flaw has since been fixed and the “FDIV Bug” is no longer apparent in newer Intel CPUs.

The Pentium flaw should have been easily discoverable in early testing of the CPU, but there was also a mistake in Intel’s proofs for the Pentium FPU. Intel engineers attempted to simplify testing, and assumed that the sign (“+” or “-“) of a number doesn’t enter into division operations except in the last step. Thus, the proof for the Pentium only checked half of the PLA, and assumed (incorrectly) that the other half of the PLA was simply the mirror image of what was checked (Price P. 2). Unfortunately, the untested half of the PLA contained the missing entries. The two easily discoverable flaws, one in the PLA loading script and the other in the PLA proof, conspired to hide each other from Intel engineers so that the Pentium’s flaw was not discovered until after production of the CPU began.

Events Surrounding the Flaw

Intel Corp. discovered the flaw in the Pentium’s floating point unit through testing, in June of 1994 (after production of the chip), but chose to keep the information private instead of disclosing it to their customers (Markoff). Although Intel modified the design of the Pentium, the modified chips did not begin to reach the market until November of 1994, and the sales of flawed chips were not halted. Dr. Thomas R. Nicely of Lynchburg College also independently discovered the “FDIV Bug” in June of 1994 and attempted to bring it to the attention of Intel Corp. in October of that year, whereupon an Intel representative confirmed the existence of the flaw and then ceased to provide Dr. Nicely with any more information (Nicely). Nicely then proceeded to make the Pentium floating point unit’s flaw known to the public via e-mail, causing news of the Pentium flaw to spread quickly. Concerned Pentium owners who learned of the flaw were told by Intel that the flaw was inconsequential and that no replacement policy was being offered.

By late November, less than a month after Nicely’s first e-mail was sent out, the New York Times published an article entitled “Flaw Undermines Accuracy of Pentium Chips” and Intel’s mistake became almost common knowledge. At the same time, Intel began offering replacements for free to Pentium owners who could prove that they needed an unflawed chip to able to perform their work properly (Fleddermann P. 25). At this point in time, over two million flawed Pentium chips had been sold, and offering free replacements of all flawed Pentium chips would have cut sharply into Intel’s profit margin. However, most Pentium owners were unsatisfied and even angered by Intel’s replacement policy since very few computer users fulfilled Intel’s vague replacement requirements. Also, many Pentium owners felt that it was unfair for Intel to place customers under scrutiny (by requiring that a customer prove their need for a “perfect” chip), in order to resolve a situation created by Intel themselves.
Matters grew worse for Intel, when IBM claimed that their studies showed the Pentium’s flaw could be encountered as often as once every 24 days under normal usage, and that because of the flaw they would be suspending sales of Pentium based computers (Flynn). Public outcry continued to mount until finally, in December of 1994, Intel issued an apology to their customers and set aside $420 million to cover the costs of a “no questions asked” replacement program of all affected Pentium chips. Surprisingly enough, a relatively small number of Pentium owners actually made use of the replacement program.
Ethical Analysis

The Pentium “FDIV Bug” was a result of a few poor engineering decisions, and not a lapse in ethical judgment. However, Intel made numerous decisions following the discovery of the flaw that were ethically questionable. While making decisions regarding the Pentium flaw, managers at Intel appeared to have been following a utilitarian ethical framework. In a utilitarian ethical framework, the decision that produces the greatest good while minimizing harm to people is favored. Utilitarian ethics essentially attempts to “weigh” the positive and negative consequences of a decision.
When Intel first discovered the Pentium’s FPU flaw, they had the option of either making the flaw known or keeping it a secret. Intel managers apparently felt that the flaw was so hard to detect, that if they never told anyone, nobody would ever discover it. Of course, informing the public of the flaw would have cost Intel money as they would have been forced to provide replacements. Thus, Intel managers decided that by keeping the flaw secret nobody would be harmed because the public wouldn’t know, and Intel would benefit by not having to spend money replacing flawed chips. The thought process of Intel executives appears to be clear and logical, but unfortunately, they forgot certain key details.
When Nicely uncovered the Pentium FDIV flaw and spread the news of it across the Internet, Intel was faced with the choice of either apologizing for the flaw and offering replacements, or denying customers a replacement plan for flawed Pentium CPUs. Clearly, Intel executives felt that offering customers a replacement plan would only do “good” to a small portion of customers and would cost Intel a good deal of money. By refusing to replace flawed chips, Intel executives continued to believe that they were doing the most good for the most people (in this case, Intel) and doing little harm to customers since very few customers would be affected by the flaw. When the cries of the general public became more and more excited, Intel finally decided to offer a limited replacement policy to those people few people who would actually be affected the flaw. Because so many people were expressing their dislike of Intel’s policy, Intel executives must have felt that by offering a limited replacement policy they could improve the company’s public image, while costing the company a small amount of money and doing “good” to a small number of customers.

It was not until the voice of the public rose to a fury, that Intel executives realized that the company’s public image (and stock) were falling fast as a result of their handling of the Pentium flaw, and the only way to rectify the situation was to offer full, free replacements of affected Pentium chips. Although doing so would only do “good” to a small subset of customers actually affected, and would end up costing a great deal of money, Intel executives must have decided that the company’s public image was worth the money.

When following utilitarian frameworks, it is necessary to weigh out every single consequence of an action, making it a very difficult ethical standard to properly adhere to. Realizing all the consequences of an action and then determining the importance of a consequence is a difficult task, and one that people rarely get right every single time. Throughout their handling of the Pentium’s FPU flaw, Intel executives forgot (until the end at least) to consider how their decision would affect the company’s public image, and how valuable the company’s image was. Had Intel considered this from the start, they might have realized that hoping nobody would find the flaw was not worth the risk to their public image, and would have offered a full replacement plan immediately.

Because utilitarian ethics is such a difficult ethical framework, Intel executives should have adopted a set of ethical frameworks based on both utilitarian ethics and rights ethics. Under rights ethics, the action that protects and respects the rights of those affected, is the favored decision. A combination of the two ethical frameworks would have allowed Intel to still consider the good of the company while upholding the moral rights of their customers. Although it was well within Intel’s rights to deny customers a replacement plan, it is the right of a customer to know about any known flaws in a product they paid for. It is also the customer’s right to expect that the product they bought functions properly as advertised. Had Intel executives been adhering to a combination of utilitarian and rights ethics, they would have realized from the start that hiding the flaw was wrong, and that the moral action was to offer a replacement plan.
Clearly, the action that Intel executives should have taken following the discovery of the Pentium flaw was to immediately make the flaw known to the public, apologize for it, and offer a full replacement plan for affected chips. These actions would have preserved the company image, placated irate customers and resolved an issue for affected customers, at the cost of a fairly large sum of money. Intel would have been able to uphold the moral rights of their customers, while minimizing harm and maximizing the good done. As it turned out unfortunately, Intel was forced to offer a full replacement plan in the end, but were unable to completely salvage their company’s reputation.

Fleddermann, Charles B. Engineering Ethics. New Jersey: Prentice Hall, 1999.

Flynn, Laurie. "Intel Facing a General Fear of its Pentium Chip." The New York Times 19 Dec. 1994. Web. 13 Nov. 2009. http://www.khd-research.net/Tech/Computer/PBug/pbug_ny_times_3.txt.

Intel Corp. FDIV Replacement Program: White Paper Statistical Analysis of Floating Point Flaw. Tech. 30 Nov. 1994. Web. 13 Nov. 2009. http://www.intel.com/support/processors/pentium/fdiv/wp/.

Markoff, John. "Flaw Undermines Accuracy of Pentium Chips." The New York Times 24 Nov. 1994. Web. 13 Nov. 2009. http://www.nytimes.com/1994/11/24/business/company-news-flaw-undermines-accuracy-of-pentium-chips.html.

Nicely, Thomas R. "Bug in the Pentium FPU." 30 Oct. 1994. E-mail. http://www.emery.com/bizstuff/nicely.htm

Price, Dick. "Pentium FDIV Flaw - lessons learned." IEEE Micro 15.2 (1995). Web. 13 Nov. 2009.http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=372360&isnumber=8521.

Comments (3) Trackbacks (0)
  1. Thank you for posting this. You must take a lot of pride in your work. I’m also writing a paper on this myself. Now, because you came before me and put your work online I have to restructure sentences to avoid plagiarism and by the way that was a sarcastic thank you. Keep that shit to yourself I don’t need any more obstacles than you had throughout your academic career. Your C minus wasn’t good enough? Here’s a pat on the back. Now raise your awareness and do something more productive with your educated ass.

  2. So far as I can see, the only good Intel was concerned with was their own. This is not utilitarian ethics, but business ethics, otherwise known as plain, old fashioned, greed.

    If Inte wished to salvage the defective chips, they could have replaced them, in return for the customer returning the original defective chip to Intel. Intel could then have sold these defective chips with full disclosure of their flaw to whomsoever might be interested, at whatever price the market would decide.

    You say that “it was well within Intel’s rights to deny customers a replacement plan”, but is that truly the case?
    The customer has paid for a product that is demonstrably inaccurate. It is arguably no different from the case of a customer who exchanges a $100 bill for pennies, and finds that he has received only 9,999 cents. Surely, he has a legal claim against the person or company that has shortchanged him! The fact that the missing penny will hardly affect the customer’s use of his pocket change, even if factually true, will not stand up to legal scrutiny.
    In this case, since Intel cannot cure the defect, their only option was to reverse the transaction (though they might not be legally obligated to provide a new, replacement, chip; this could be relevant if the price had changed in the interim).

    If I were Intel’s legal counsel, I would have warned of a class-action suit brewing.


Reply

( Cancel )

Security Code:

Trackbacks are disabled.