failing like never before

9 Jun 09

Network Timeouts in C

Recently, while coding up some P2P sharing software for class, I came across a problem that really got me stuck. (Note that I forked a separate process to handle each upload and download.) When reading data from another peer, my peer would generally have to block until the other peer responded, since with network programming we can never expect all of our requests to return immediately. The problem was that occasionally the other peer would decide to die entirely, and I was left with a process that would block essentially forever, since the data it was waiting for was never going to come. Now the great thing about blocking reads is that they don't burn CPU time spinning around in a circle waiting for data to arrive, but they do take up space in the process table and in memory. And of course, if my blocked reading processes stayed around forever, it would be very simple for a malicious peer to bring my OS to a grinding halt. Essentially, what I needed was a way to make read() time out. Now anyone vaguely familiar with internet browsers and other such network-reliant programs is probably familiar with timeouts, but I had no idea how to implement one in C.

My first thought was to use setrlimit(), a Unix function that allows the programmer to set the maximum amount of system resources a process may use (CPU time, VM size, created file sizes, etc.; for more, see "man 2 setrlimit"). When setrlimit() is used to set a maximum amount of CPU time, the process will receive SIGXCPU when the CPU time soft limit is reached, and then the process is killed when the hard limit is reached. At the time, I was a bit groggy so setrlimit() seemed like a great solution, but of course anyone with half a brain (which I apparently didn't have at the time) will realize that setrlimit() is definitely not the solution to this problem. A blocking process doesn't run and therefore doesn't consume CPU time, so setting a maximum CPU time does pretty much nothing to the blocking process; it'll still keep blocking forever.
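To make that concrete, here is a minimal sketch (not from my original homework) of what the setrlimit() idea would look like and why it can't work. The helper names limit_cpu_seconds() and cpu_seconds_while_blocked() are made up for illustration, and sleep() stands in for a read() that is stuck waiting on a dead peer.

```c
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: set the soft CPU-time limit (in seconds).
   SIGXCPU is sent only once the process has *used* that much CPU. */
int limit_cpu_seconds(rlim_t soft_seconds)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_CPU, &rl) == -1)
        return -1;
    rl.rlim_cur = soft_seconds;        /* leave the hard limit unchanged */
    return setrlimit(RLIMIT_CPU, &rl); /* fails if soft exceeds hard */
}

/* Hypothetical helper: CPU seconds this process consumes while it is
   blocked for `seconds` of wall-clock time. sleep() is a stand-in for
   a blocked read(). */
double cpu_seconds_while_blocked(unsigned int seconds)
{
    clock_t before = clock();  /* processor time used so far */
    sleep(seconds);            /* blocked: the process is not scheduled */
    clock_t after = clock();
    return (double)(after - before) / CLOCKS_PER_SEC;
}
```

Calling cpu_seconds_while_blocked(1) should report a value very close to zero even though a full second of real time has passed, which is why a limit set with limit_cpu_seconds() would never be reached by a process that just sits in read().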

After a little bit of trawling the internet, I finally came upon the perfect solution: alarms! When alarm(unsigned int seconds) is called, it arranges for SIGALRM to be delivered to the calling process after that many seconds (real-time seconds, mind you, not CPU seconds consumed by the process), even if the process that called alarm() is blocking. I set the alarm right before I began a read() and used signal() to bind a signal handler to SIGALRM, so that when the alarm went off my signal handler could gracefully kill the timed-out download process!

So here's my code excerpt:

// I used a lot more libraries, but I think these
// are the ones needed for alarm()
#include <signal.h>   // signal(), SIGALRM
#include <unistd.h>   // alarm(), unlink()

void timeout_download_handler(int sig)
{
    if(global_filename)
    {
        error("Peer timed out. "
              "Deleting file '%s' and killing download.\n",
              global_filename);
        unlink(global_filename);
    }
    else
        error("Peer timed out. Killing download.\n");
    exit(-1);
}

...

    // set alarm to kill this read if it takes too long
    signal(SIGALRM, timeout_download_handler);
    alarm(ALARM_TIME);
    // Read the file into the task buffer from the peer,
    // and write it from the task buffer onto disk.
    int ret = read_to_taskbuf(t->peer_fd, t);

...

Having finished telling my not-so-interesting tale of alarm(), I'd like to tell another, more amusing tale.

I was in the engineering building finishing up the aforementioned CS homework late last Friday night; the homework was due at 11:59 PM. Down the hall from the computer lab, there was an extremely raucous and happy party going on. There were some murmured complaints from the occupants of the computer lab, as most of us felt that it was unfair that the rest of the nearby world wasn't joining in our last-minute despair, and at one point someone even burst out angrily with, "Who the F*$# is having this much fun on a Friday night?" Implying somehow that normal people don't have fun parties on Friday nights.

At eleven PM, just as I was about to check my program for the last time, we heard a strange buzzing noise emanating from far away, which, as it turned out a few seconds later, was the fire alarm going off (extremely loudly) in the engineering building. After rushing outside of the building with my hands pressed against my ears, I sat down in the grass and flipped open my laptop. Unfortunately, I had never installed the Cisco VPN client on my Linux client, so I had no way of connecting to the school's official wireless network and therefore couldn't connect to the tracker to check my P2P software or even turn in my homework. So for the next fifteen minutes, I scurried around the dark university campus, whipping out my laptop every time I came to a building that I thought might have a non-university-authorized wireless network. Eventually, I got a weak connection outside of the music building. So until 11:30 PM, I sat with my laptop pressed up against the glass walls of the music building, checking and submitting my homework.
