Intel FDIV Bug
A few years or so back, I put up a bunch of my high school and early college papers on this blog (they're under the "literature" category). Its a sad state of affairs when looking back my high school papers, that I realized my writing skills were significantly better back in high school. But anyways, heres a paper I wrote for my engineering ethics course. Its not my best work, and it certainly lacks the finish of my old high school stuff, but its passable.
To the Intel Corp. Board of Directors: A Post Mortem Report of the Pentium Flaw
Abstract
The floating point division flaw in the original Intel Pentium CPU, which resulted in some floating point division operations being calculated improperly, was a result of a few poor engineering decisions and while avoidable, was not condemnable. The subsequent decisions made by Intel executives, to keep the flaw hidden and then to downplay its importance, were however, morally flawed. While Intel executives adhered to a utilitarian ethical framework, they forgot to consider the impact their decisions would have on Intel’s public image. Had Intel executives followed a combination of rights and utilitarian ethics, where the rights of the customer are upheld while the company’s wellbeing is still valued, executives would have reached the correct decision, which was to offer a full “no questions asked” replacement policy at the very first discovery of the flaw.
The Pentium “FDIV Bug”
Given certain types of input data, the floating point division instructions on the original Intel Pentium CPU would generate slightly erroneous results. This result was dubbed by the public as the “FDIV Bug,” as one of the assembly language instructions affected by the bug was the FDIV instruction. Although Intel initially attempted to keep information regarding the flaw hidden, it eventually became public knowledge. The subsequent actions of Intel executives regarding their handling of the flaw were morally questionable and ultimately resulted in great damage being done to Intel’s public image. A different set of ethical frameworks would have allowed Intel executives to have reached the correct decision.
Using the basic Microsoft Windows calculator, a Pentium user could check for the presence of the flaw by performing the following calculation:
(4195835 * 3145727) / 3145727
The expected result of dividing a number by itself is one, so the equation above should yield a result of 4,195,835 but the flawed Pentium Floating Point Unit (FPU) produced a value of 4,195,579; an error of 0.006%. Not all calculations performed by the FDIV instruction on a Pentium CPU were incorrect however. The occurrence and degree of inaccuracy of the floating point division calculations were highly dependent on the input data and specific divide instruction used, and in most cases, the flaw was not apparent at all. According to Intel Corp., the flaw would only be encountered once every 27,000 years under normal use, although other groups have produced significantly different failure rates.
The “FDIV Bug” did not affect Intel CPUs predating the Pentium, as the flaw was a defect in a new algorithm that was intended to provide improved floating point performance over the Intel 486 (the predecessor to the Pentium). The Pentium used a new radix 4 SRT algorithm (named after its creators Sweeney, Robertson, and Tocher) in its floating point division operations, which required the use of a lookup table to improve calculation speed (Intel Corp. Section 4). This lookup table was generated prior to assembly and then loaded into a hardware Programmable Lookup Array (PLA) on the Pentium chip. However, the script which downloaded the lookup table into the PLAs had a bug in it that caused some lookup table entries to be omitted from the PLAs. Consequently, floating point division instructions that required the missing entries from the lookup table would produce erroneous values. This flaw has since been fixed and the “FDIV Bug” is no longer apparent in newer Intel CPUs.
The Pentium flaw should have been easily discoverable in early testing of the CPU, but there was also a mistake in Intel’s proofs for the Pentium FPU. Intel engineers attempted to simplify testing, and assumed that the sign (“+” or “-“) of a number doesn’t enter into division operations except in the last step. Thus, the proof for the Pentium only checked half of the PLA, and assumed (incorrectly) that the other half of the PLA was simply the mirror image of what was checked (Price P. 2). Unfortunately, the untested half of the PLA contained the missing entries. The two easily discoverable flaws, one in the PLA loading script and the other in the PLA proof, conspired to hide each other from Intel engineers so that the Pentium’s flaw was not discovered until after production of the CPU began.
Events Surrounding the Flaw
Intel Corp. discovered the flaw in the Pentium’s floating point unit through testing, in June of 1994 (after production of the chip), but chose to keep the information private instead of disclosing it to their customers (Markoff). Although Intel modified the design of the Pentium, the modified chips did not begin to reach the market until November of 1994, and the sales of flawed chips were not halted. Dr. Thomas R. Nicely of Lynchburg College also independently discovered the “FDIV Bug” in June of 1994 and attempted to bring it to the attention of Intel Corp. in October of that year, whereupon an Intel representative confirmed the existence of the flaw and then ceased to provide Dr. Nicely with any more information (Nicely). Nicely then proceeded to make the Pentium floating point unit’s flaw known to the public via e-mail, causing news of the Pentium flaw to spread quickly. Concerned Pentium owners who learned of the flaw were told by Intel that the flaw was inconsequential and that no replacement policy was being offered.
Knight’s Tour in Lisp
For those of you unfamilar with the Knight's tour, here's a brief overview of the problem. The Knight's Tour invovles placing a knight on a chessboard and then having the knight move to each and every square on the chessboard once and only once. There are a suprisingly huge number of possible paths that a single Knight can take on a standard chessboard so of course, a blind, exhaustive search is generally out of the question.
The program I have written below explores the Open Knight's Tour, where the knight simply has to reach each and every square once and only once (the Closed Tour would add the additional requirement that the knight end on the square he started on). I will admit that my Lisp code isn't exactly the best or the most efficient, but it works fairly deccent. My method of solving the tour, uses backtracking (which is very similar to Depth First Search) and a herustic known as Minimum Remaining Value. This herustic is sometimes known as Warnsdorff's algorithm when used in the Knight's Tour. Warnsdorff's algorithm selects as the knight's next move, the square that would present the fewest possible moves following the move to that square. Essentially, the herustic is trying to eliminate dead end's earlier on. And so here's my code (most of which is comments, [Hey, it was written for a class]):
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; FIND-KNIGHTS-TOUR2 ;;; ;;; using minimum remaining value herustics (aka most constrained variable) ;;; ;;; ;;; ;;; on my laptop, this implementation can find a solution to ;;; ;;; 100x100 board, starting from (1 1) in under three minutes ;;; ;;; and can solve 50x50 in about 16 seconds on SEAS ;;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; you can run the knights tour with: ; (find-knights-tour ([rows in board] [cols in board] [row_pos] [col_pos]) (setq steps 0) (setq visited-list ()) ; generate a list of all possible moves for knight ; given that current position of knight is (row col) ; note that this function will generate "wrong" move values, like ; moving off the board or moving to an old spot. ; ; function get_values(number:x number:y):((num num) (num num)) (defun get_values (row col) (list ; there are at most eight possible move values (list (+ row 2) (+ col 1)) (list (+ row 2) (- col 1)) (list (+ row 1) (+ col 2)) (list (+ row 1) (- col 2)) (list (- row 2) (+ col 1)) (list (- row 2) (- col 1)) (list (- row 1) (+ col 2)) (list (- row 1) (- col 2))) ) ; val is a list of possible move values for a knight variable ; and should be generated by get_values function ; m_row is max number of rows in the board ; m_col is max number of columns in the board ; ; test_valid_values will return a list containing only the valid ; values allowed, removing those values that have already been ; visited and are not on the board ; ; function test_valid_values( ; ((num num) (num num) ):vals ; number:m_row ; number:m_col):((num num) (num num) ) (defun test_valid_values (vals m_row m_col) (let ; grab the first value out of the vals list ((v (car vals))) (cond ((null vals) ()) ; test for out of bound values, or already traversed ; if value is out of bound, remove it (cdr vals) ((or (> (car v) m_row) (<= (car v) 0) (> (cadr v) m_col) (<= (cadr v) 0) (list_member v visited-list)) (test_valid_values (cdr vals) m_row m_col)) ; value is valid, add to our "valid" list (t (cons (car vals) (test_valid_values (cdr vals) m_row m_col))))) ) ; check if "l" is a member of "visited" ; return 't' if so, 'nil' otherwise ; ; function list_member( ; (number number):l ; ((num num) (num num) ):visited) :t/nil (defun list_member (l visited) (cond ((null visited) nil) ((equal (car visited) l) t) (t (list_member l (cdr visited)))) ) ;minimum remaining value herustic, or the most constrained variable ;instead of just returning randomnly arranged list of possible moves from ; current location (as in above without herustics) arrange possible moves ; in the list so that the move that will offer the least number of moves in ; the next iteration comes first, with the move that offers the most of ; moves coming last in the list ; ;function min_remaining_val(number:m_row number:m_col): ; ((number number) (number number) ) (defun min_remaining_val (m_row m_col) (let ; get valid values ((vals (test_valid_values (get_values (car (car (last visited-list))) (cadr (car (last visited-list)))) m_row m_col))) ; then call call mrv_helper to attach assign a value to each move ; mrv_sort to arrange moves (mrv_sort (mrv_helper vals m_row m_col)) )) ; determines number of possible moves allowed if we move to a position in ; the vals list ; ; function mrv_helper( ; ((num num) (num num) ):vals ; number:m_row ; number:m_col) : ; ((num (num num)) (num (num num)) ) (defun mrv_helper (vals m_row m_col) (let ((len ; calculate the number of possible moves from a position in the ; "vals" list (length (test_valid_values (get_values (car (car vals)) (cadr (car vals))) m_row m_col)))) (cond ; if no more moves in "vals", just return it and done ((null (cdr vals)) (list (cons len (list (car vals))))) ; else add move and length value to list and keep recursing (t (cons (cons len (list (car vals))) (mrv_helper (cdr vals) m_row m_col))))) ) ;finds the move with the minimum value in the list "vals" ;i.e. if "vals" is ( (1 (2 4)) (2 (5 2)) (4 (3 2)) ) ; then min_vals would return (1 (2 4)) because its move value is "1" ; ;function min_val( ; ((num (num num)) (num (num num)) ):vals) : ; (num (num num)) (defun min_val (vals) (cond ; base case, return the only move in "vals" ((null (cdr vals)) (car vals)) (t (let ; calculate the min_val in the rest of the list ; store it in "p" ((p (min_val (cdr vals)))) (cond ; if the current minimum value is less then previous ; min val, then use current min val ((<= (car (car vals)) (car p)) (car vals)) ; else use previous min val (t p))))) ) ; remove from list 'l' the element that matches 'r' ; this is because we are not allowed to use "remove" function ; ;function remove_list(list:r list:l):list (defun remove_list (r l) (cond ; if equal, then done ((equal r (car l)) (cdr l)) ; reached the end and couldn't find a match ((null (cdr l)) l) ; else keep recursing (t (cons (car l) (remove_list r (cdr l))))) ) ; sort the moves in "vals" list based on their move length values ; and then strips out their "length values ; ; i.e., if "vals" is ((3 (2 3)) (2 (1 2)) (5 (3 4)) (1 (2 1)) (2 (2 2))) ;then mrv_sort returns: ; ((2 1) (1 2) (2 2) (2 3) (3 4) ) ;function mrv_sort( ; ((num (num num)) (num (num num)) ):vals) : ; ((num num) (num num) ) (defun mrv_sort (vals) (cond ; base case, return the last move in "vals" ((null (cdr vals)) (cdr (car vals))) (t (let ;else keep sorting, using min_val function ;v is the min val in "vals" ((v (min_val vals))) (append (cdr v) (mrv_sort (remove_list v vals)))))) ) ; recurse down the search tree, expand out the children for last ; visited spot ; ; m_row is number of rows in board ; m_col is number of cols in board ; ; function recurse_down(number:m_row number:m_col):t/nil (defun recurse_down (m_row m_col) ; get the valid values, store as val (let ((vals (min_remaining_val m_row m_col))) (cond ; no valid values, backtrack ; note, that we test for sucessful completion before this ((null vals) nil) ; else recurse across (t (recurse_across vals m_row m_col)))) ) ; this function is exactly the same as recurse_across except ; that it calls recurse_down instead of recurse_down ; ;vals is valid moves, having the form: ; ((number number) (num num) etc.) ; m_row is number of rows in board ; m_col is number of cols in board ; ; function recurse_across( ; ((num num) (num num) ):vals ; number:m_row ; number:m_col) : t/nil (defun recurse_across (vals m_row m_col) (cond ; no more valid moves, backtrack ((null vals) nil) (t (cond ; if number of steps taken equals number of squares on board ; then we're done ((equal (+ steps 1) (* m_row m_col)) ; increment number of steps, add new visited square (setq steps (+ steps 1)) (setq visited-list (append visited-list (list (car vals)))) ; return t t) (t ; else continue moving until end ; increment steps and add to visited-list (setq steps (+ steps 1)) (setq visited-list (append visited-list (list (car vals)))) (cond ; recurse down, return t if recurse_down found solution ((recurse_down m_row m_col) t) (t ; if recurse_down did not find solution, backtrack, removing ; moves from visited-list and from "steps" as we backtrack ; and then recurse_across (setq steps (- steps 1)) (setq visited-list (reverse (cdr (reverse visited-list)))) (recurse_across (cdr vals) m_row m_col))))))) ) ;find the knights tour given board size of (m_row m_col) and ;starting position (row col) ;use herustics ; ;function find-knights-tour(number:m_row number:m_col number:row number:col) : ; ((number number) (number number) ) (defun find-knights-tour (m_row m_col row col) ; set the global variables (setq visited-list (list (list row col))) (setq steps 1) ; start searching... (cond ((recurse_down m_row m_col) visited-list) (t nil)) )
Failing Like Never Before
When I created this blog, I gave it the sub-heading "failing like never before," although at the time I had no idea how apt the name would become. You see, I recently come to realize while reflecting on my life, that my intellectual capacity and mental abilities peaked at around the beginning of eighth grade, and that since then I've been on a steady and slow decline. Every day of my life is, in short, more of a failure then the day that preceded it.Therefore, I am indeed failing like never before.
My grades in college have been lackluster and far from impressive, my transcript is peppered with Cs and there is a noticeable dearth of strong achievements on my resume. Once upon a time, a long time ago, I was a bright and attentive student who was always "on top of the ball." But now I have trouble focusing in class and cannot seem to bring myself to care as much about studying as I once did.
So my university has a policy that students in the school of engineering cannot drop a class after the Friday of the fourth week of the quarter. But a few weeks ago, just a day after the drop deadline for classes had passed, I suddenly realized that I was in some pretty deep trouble. As a result of my procrastination and difficulties in focusing on studying, I was severely behind in my circuits class and my midterm was just a few days away. Meanwhile, I still had a lot of work to do on my A.I. homework. Long story short, I couldn't drop my circuits class so I ended up flunking the midterm (although I did finish my A.I. homework and did a pretty good job on it). No matter how badly I've ever done a test in high school or college, I've never failed a test before, at least up until now. I was so depressed, that after the midterm, instead of riding my bike home I ended up riding for over an hour in a straight line away from campus; I think a part of me just wanted to run away from my problems.
There was a time when I could have banged out a seven page paper in a few hours, assuming that I knew enough about the subject matter. But a few weeks ago, I ended up staying up until four in the morning writing a seven page paper on air pollution. FOUR IN THE MORNING!!! And it wasn't even a well written paper at that. The worst part, was that I then proceeded to sleep through my class (which was from 9-12) and ended up turning my paper in late. When I was in high school, I woke up at 6:30 sharp everyday, regardless of what time I had gone to sleep or what day of the week it was. And up until this year of college, I had never turned in a late paper.
More on the failures of my life to come later...
Network Timeouts in C
Reccently, while coding up some P2P sharing software for class, I came across a problem that really got me stuck. (Note, that I forked different processes to handle each upload and upload.) When reading data from another peer, my peer would generally have to block until the other peer responded, since with network programming we can never expect all of our requests to return immediately. The problem was that occassionally, the other peer would decide to die entirely and I was left with a process that would block essentially forever since the signal it was waiting for was never going to come. Now the great thing about blocking reads is that they don't burn CPU time spinning around in a circle waiting for data to arrive, but they do take up space in the process table and in memory. And of course, if my blocked reading proccesses stayed around forever, it would be very simple for a malacious peer to bring my OS to a grinding halt. Essentially, what I needed was a way to make read() timeout. Now anyone vaguely familar with using internet browsers and other such network-reliant programs are probably familar with timeouts, but I had no idea how to implement it in C.
My first thought, was to use setrlimit(), a Unix function that allows the programmer to set the maximum amount of system resources (CPU time, VM size, created file sizes, etc., for more use "man 2 setrlimit"). When setrlimit() is used to set a maximum amount of CPU time, the process will recieve SIGXCPU when the CPU time soft limit is reached, and then the process is killed when the hard limit is reached. At the time, I was a bit groggy so setrlimit() seemed like a great solution, but of course anyone with half a brain (which I apparently didn't have at the time) will realize that setrlimit() is definetely not the solution to this problem. A blocking process doesn't run and therefore doesn't consume CPU time, so setting a maximum CPU time does pretty much nothing to the blocking process; it'll still keep blocking forever.
After a little bit of trawling the internet, I finally came upon the perfect solution: alarms! When alarm(unsigned int seconds) is called, it will raise SIGALRM after so many seconds, realtime seconds mind you and not CPU seconds consumed by the process, even if the process that called alarm() is blocking. I set the alarm right before I began a read() and used signal() to bind a signal handler to SIGALRM, so that when the alarm went off my signal handler could gracefully kill the timed-out download process!
Something is Not Quite Right
Take a look at this C function, and see if you can can catch whats wrong with it. (That is, what about this function could produce an error?)
void wait_for_ready(void) { while ((inb(0x1F7) & 0x80) || !(inb(0x1F7) & 0x40)) continue; }
This question showed up on my midterm and stumped the hell out of me at the time. Now that I know the answer, I feel like a complete idiot for not spotting the problem initially, seeing as its so amazingly obvious. When I took the exam, I spent way too much time on this problem, completely overthinking it.
In this function, we're using a machine level instruction, inb, to read the value of some hardware port on an x86 machines. So far so good, or so I thought when I was taking my exam. But the problem is, is that the data we're reading can be a shared resource, that is, other applications could be writing and reading to it at the same time, so a race condition ensues. Even on a machine with a single processor, this is still a problem, since wait_for_ready() could read the data memory into register, then a context switch could occur, some application could write to that location, and then wait_for_read() regains control but operates with an old data value.
So simple. And now I feel like an idiot.
Request for a Change
Hauling out my change of major request essay...
Prior to UCLA my knowledge of anything vaguely associated to the field of electrical engineering was essentially nil, so it should come as no surprise when I say that I cannot for the life of me begin to remember why I chose electrical engineering as a major. And now, the more electrical courses I take, the more tiresome I find the subject matter to be. So the question I should be answering isn't "why do I want to change majors," but rather, "why did I wait till now to change?"
Over the last two years, I have trudged my way through the required introductory math and physics courses, never particularly excited by the material but always telling myself that it was going to get better when I reached the EE courses and that then, everything would be worth it. So it came as somewhat of a surprise to me, that I found the electromagnetics in EE 1 to be as unexciting as the materials in all the prerequisite courses. But after having spent the better part of two quarters convincing myself that "it was going to get better," I continued to drag myself through another uninteresting class, and during my second year, I proceeded to take EE 2, 16, 116L, 10, and 102, continually hoping that things would get better. Happily, they did.
It was during second quarter of this year while I was doing EE homework one night, that I realized that the only EE classes that I have ever actually enjoyed were the multi-departmental classes (16, and 116L) shared by the EE and CS departments, and that CS 35L (informally called “intro to Linux) was without a doubt my most enjoyable class ever. Shocked by my newfound epiphany, I stood up from my desk and declared to my roommates that I did not like electrical engineering and that instead, computer science was the field for me. My roommates were unsurprised by my sudden realization, saying that they had always noticed that I liked computers more than electrical engineering, and that the energetic mannerisms that I adopted when talking about Linux or computer architecture deeply contrasted the dull and tiresome attitude I exhibited when I approached my EE homework. Indeed, during my past two years at UCLA, I have probably spent as much time tinkering with Linux and reading essays like Neal Stephenson’s In the Beginning was the Command Line, and Fred Brooks’ The Mythical Man-Month: Essays on Software Engineering, as I have dedicated to my EE classes.
I am, however, not choosing to switch into computer science, but rather computer science and engineering since my enthusiasm for all things computer related extends below (so to speak) the software level. Now lately, numerous people have pointed out to me the similarities between electrical engineering with the computer engineering option, and computer science and engineering, and asked why I simply do not stay with EE CE. A question to which I can only respond to by saying that the differences between the two majors is quantitatively quite small (That is to say, the difference in course requirements between the two is almost negligible.), but that the content of the almost negligible difference is quite significant to me; I would much rather take a compiler class rather than another electromagnetics course.
Having now realized what subjects I enjoy the most, I am loath to continue in a major I find no pleasure in. Though perhaps over-dramatic, it would not be inaccurate to say that this change of major could have the potential to dictate my future happiness.
Muxing the Clock
My final lab project for my digital design class is basically a free-for-all, design your own project. The project that my lab partner and I have chosen includes displaying a countdown timer on four seven-segment displays. We wanted to have the displayed value decrementing once every second, and we also wanted to make it so that the user could pause the counter and change the displayed value with push-buttons. The only problem with this scenario is that since the counters are connected to a 1Hz clock, the speed at which a user can set the displayed values is also 1Hz. This is bad, as we quickly found out, because it is unbearably slow, especially when you want to set a value of 100. So I came up with what I thought was a great idea: mux the clock input on the counters and then use the "set" button as the mux selector. If one of the mux inputs was then a 1Hz signal and the other a 4Hz clock, we'd be able to switch back and forth between the clock signals at will. Like most things I come up with, I thought it was a brilliant idea. And like most of my ideas, it died in the implementation phase.
The results of muxing the clock signal resulted in a buggy counter that would often spaz out whenever the "set" button was pressed. The clock would settle back into a regular pattern after one or two cycles, but the damage would have already been done: nobody wants to have a clock timer that explodes in a mad counting rush every time you press a button. I believe the reason for the bugginess is due to the fact that the rising and falling edges of the 1Hz and 4Hz clock do not coincide in time perfectly, and so when the mux switches clocks the counters see a clock signal that fluctuates at a weird rate for a brief interval.
And one more thing that I've learned in digital design lab: Xilinx software is total crap. Project files get regularly corrupted, the program crashes at odd times, compiling takes so long that I could probably wire the design by hand faster, and the auto-wirer was designed by a disgruntled employee on crack. The quiet murmuring that fills the digital design lab room is frequently broken by loud exclaimations of "I hate this software," "I will find the engineer that designed this auto-wirer and kill him," "#$(@*$&(@!***#&!!" and other such phrases (generally not by me). I could maybe understand the total crappiness if Xilinx's ISE was a beta version and avaliable for free, but its not. We're running version 9.2 (somewhere around there) and the school paid buttloads for Xilinx's POS software. Please don't even bother with Xilinx.