failing like never before


Three Weeks in Biking

Distance Traveled: 13.497 miles
Average Speed: 12.9 MPH
Maximum Speed: 31.2 MPH
Elapsed Time: 1 hour, 3 minutes, and 12 seconds

Distance Traveled: 20.253 miles
Average Speed: 11.8 MPH
Maximum Speed: 31.2 MPH
Elapsed Time: 1 hour, 43 minutes, and 10 seconds

Distance Traveled: 27.360 miles
Average Speed: 13.5 MPH
Maximum Speed: 31.2 MPH
Elapsed Time: 2 hours, 2 minutes, and 0 seconds

I really did a number on my bike two weeks ago. I took a shortcut through some bushes, which as it turned out were obscuring some rather large rocks, and somehow managed to derail my chain off both the front and back gears. After I reset the chain, I noticed that the right pedal was spinning extremely freely; closer inspection revealed that the pedal was wobbling around on the axle and that the pedal housing was dropping bearings everywhere. When I got my bike back to my room and started to disassemble the pedal in hopes of repairing it, I noticed something quite odd.

It appeared that somehow, in the process of traveling through the bushes, I had managed to slam my pedal straight down into the ground, pushing the plastic pedal housing in towards the bicycle body and cracking it down the center. Of course, the pedal was entirely unusable, since putting much pressure on it would crack it entirely in half.

So now I have a non-rideable bicycle. Go me.


The Evils of FAT

I think we're all familiar with the infamous FAT32 (File Allocation Table) file system, still in use after all these years despite the numerous superior alternatives available. I myself am guilty of helping to extend FAT's unnaturally long lifetime on this earth: in order to have an external hard drive that is easily mountable and readable by all the major operating systems (Linux, OS X, and Windows), I had to format the disk with a commonly supported file system, and unfortunately, FAT32 was the only option available. It would be nice if Microsoft and Apple started including, by default, drivers for some modern file system in their operating systems so that all machines could easily share external media without suffering from the performance penalties inherent in the file system. But this is extremely unlikely, so I won't spend too much time hoping. So what exactly is it that makes FAT such a terrible file system?

Anyone using FAT32 on an external hard drive used to store media will probably be familiar with FAT32's maximum file size limitation of 4GiB, which is actually larger than the maximum file size allowed in the original FAT file system. FAT also has linear seek time within files: in order to find the ith cluster used to store a file's data, we must first walk through all the clusters preceding it. And of course, FAT has some pretty serious file fragmentation issues, especially once we start creating and deleting files on the drive. On a physical hard drive, heavy fragmentation of files means the drive head has to constantly seek around the platter in order to find the next 4KiB block. This is of course a bad thing, especially since seek time is the bottleneck on a physical hard drive. Ideally, we would like contiguous blocks to be used for a single file.

Another serious performance issue with FAT that is not readily noticeable is the size of the File Allocation Table itself, and how large it grows with the drive. Ideally, we would like to be able to just stick the whole FAT into RAM in order to improve performance, since caching the FAT on disk would mean doing an extra read of the table just to find a file on the drive. Now, let's assume that you're a movie fanatic, and in order to store your increasingly large movie collection, you've bought a new 1TB (sorry, not 1TiB) hard drive. Of course, you want your drive to be able to function with multiple operating systems, so you format it with FAT32. FAT32 uses 4KiB clusters, and each entry in the File Allocation Table takes up 4 bytes. Because we have a 1TB drive, we have about 2^28 entries in the table, which means that the File Allocation Table on your 1TB drive is about 1GiB. Some people might consider this to be a problem.

Of course, a fairly modern desktop computer will have 4GB of RAM, so it could theoretically hold the whole FAT in memory without caching it on disk, but I doubt the operating system would want to dedicate a quarter of its memory to improving IO performance on one external hard drive.

So there you have it: another good reason to stay away from FAT.


Something is Not Quite Right

Take a look at this C function and see if you can catch what's wrong with it. (That is, what about this function could produce an error?)

void wait_for_ready(void)
{
    /* 0x1F7 is the ATA status port: 0x80 is the BSY bit, 0x40 is DRDY. */
    while ((inb(0x1F7) & 0x80) || !(inb(0x1F7) & 0x40))
        ;
}

This question showed up on my midterm and stumped the hell out of me at the time. Now that I know the answer, I feel like a complete idiot for not spotting the problem initially, seeing as it's so amazingly obvious. When I took the exam, I spent way too much time on this problem, completely overthinking it.

In this function, we're using a machine-level instruction, inb, to read the value of a hardware port on an x86 machine. So far so good, or so I thought when I was taking my exam. But the problem is that the data we're reading is a shared resource: other code could be reading and writing it at the same time, so a race condition ensues. Even on a machine with a single processor this is still a problem, since wait_for_ready() could read the status into a register, then a context switch could occur, something else could change that status, and when wait_for_ready() regains control it operates on a stale value.

So simple. And now I feel like an idiot.


GNU Readline

I recently wrote a shell for a project in my CS class. One of the advanced features that my partner and I implemented in the shell was tab completion, and in order to implement this extremely useful shell feature, we used the GNU Readline library. The GNU Readline library is a beast, and not in the good way. It's a great hulking pile of code and documentation, intended to provide a ton of features for reading typed lines. Once you figure out what all the function pointers, bindings, and generators in the Readline library are supposed to do, things become much more straightforward, but that doesn't negate the fact that initially figuring out Readline is a bit of a pain in the butt.

The first thing I did after we were told that the Readline library could make our project easier to design was to pop open a terminal and type "man readline". What I got was a basic summary of Readline, so in order to get the full library manual I had to resort to Google. I did, however, happen to see this at the bottom of the manpage:


It's too big and too slow.

Now, if even the guys working on the Readline library think that it's too big and too slow, we may have a potential problem on our hands.

One of the plus sides of Readline's enormity, however, is that it does offer a whole slew of features, like a kill ring and generators for tab-completion of filenames and usernames. It would be very nice, though, if all these features could be implemented without the need for a manual that probably took me longer to read than it did for me to code up a generator for command line completion.


My Google Interview

It's been a while now, but here it is anyways...

About six or seven weeks ago, I scored a phone interview (technically two) with Google's IT department for a summer internship. Nobody was more surprised than I was that Google actually found my resume impressive enough to warrant a phone interview, especially considering my less-than-stellar GPA and the enormous number of super-intelligent applicants Google receives every day. The two interviews were each forty-five minutes long, and the interviewers (both intelligent IT guys, not technically incompetent manager types) took pretty much all of the allotted time.

These days, almost every CS guy dreams of working for Google, so I've heard a few things about their interviews before, which I would like to mention before I get into my own. A few years ago when I was an intern at Intel, they had a lady come in to tell all the high school interns how to be successful in scoring future jobs. She spent a lot of time teaching us how to walk properly, shake hands, sit in a proper manner, dress, and answer generic interview questions. She told us that Google interviewers like to ask broad, open-ended questions like "how would you sell ice to an Eskimo" and "why are manhole covers round," and promptly set us to answering similar questions. A few months later, I went to a Google Tech Talk at my university, where a Google software engineer was asked by someone in the audience whether Google did in fact like to ask interview questions like "why are manhole covers round." The Google rep responded with the following:

"In my time at Google I have interviewed several software engineers and I have never asked a question like that before. Google is not in the business of making manhole covers. If we did make manhole covers, we might ask those kinds of questions."

I think occasionally a Google interviewer might throw in a brain teaser if they just want to burn some time, but apparently they don't do it too often.

Anyways, going on to my interview... I was interviewing for an IT position, so unlike the software developer positions, where they barrage you with an endless stream of algorithm and programming questions (Why is quicksort O(n log n)? What's the best sorting algorithm to use in this scenario? What data structure would you use for this? etc.), there was almost no programming involved in my interview. And since the recruiter and HR person told me pretty much nothing about what I should expect, I went into the interview pretty much cold.