Sunday, March 25, 2007

Sometimes print is bad

I wrote a very simple packet forwarder in Ruby: given a connection request on a particular port, it opens a connection to a remote host on a particular port and forwards all packets from one to the other. For debugging purposes, I was printing some text to standard output for each packet received or forwarded. I set up a simple performance test, running the forwarder on my local system and generating packets from /dev/zero, something like this:

ruby forwarder.rb 20000 10000 localhost
nc -l -p 10000 -c "cat /dev/zero"
nc localhost 20000 > moo

After running this test for 15 seconds, the output file "moo" reached a size of 48MB on my laptop, indicating a throughput of just over 3 MB/s. Although this was sufficient for my purposes, it was much lower than I expected, given that the transfer should be limited only by how fast my laptop's hard drive can write the output file.

Simply removing all of the "puts" calls increased performance dramatically. With the same 15-second run, the output file "moo" reached a size of 350MB on my laptop, indicating a throughput of over 23 MB/s, roughly seven times faster!
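
For the record, the arithmetic behind those two figures can be checked with a few lines of Ruby. This is only a sketch: the method name throughput_mb_per_s is made up for illustration, and the inputs are the file sizes and the 15-second duration from the runs described here.

```ruby
# Hypothetical helper (not part of forwarder.rb): average throughput in MB/s.
def throughput_mb_per_s(megabytes, seconds)
  megabytes.to_f / seconds
end

with_puts    = throughput_mb_per_s(48, 15)   # run with per-packet puts
without_puts = throughput_mb_per_s(350, 15)  # run with puts removed

puts format("with puts:    %.1f MB/s", with_puts)
puts format("without puts: %.1f MB/s", without_puts)
puts format("speedup:      %.1fx", without_puts / with_puts)
```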

I suppose that when writing an application that handles a large amount of data, per-packet debugging output should be turned off. However, it wasn't obvious to me that it could have such a large impact on performance. Hopefully this will help me avoid the same mistake in the future.
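
One way to keep the debugging output available without paying for it on every packet is to put it behind a flag. The following is a minimal sketch of that idea, not the actual forwarder.rb: the forward method, its debug, log and chunk_size parameters, and the use of StringIO in place of real sockets are all assumptions made for illustration.

```ruby
require "stringio"

# Hypothetical copy loop: reads chunks from src and writes them to dst.
# A line is logged per chunk only when debug is true.
def forward(src, dst, debug: false, log: $stdout, chunk_size: 16 * 1024)
  total = 0
  loop do
    chunk = src.readpartial(chunk_size)  # raises EOFError once src is exhausted
    dst.write(chunk)
    total += chunk.bytesize
    log.puts "forwarded #{chunk.bytesize} bytes" if debug
  end
rescue EOFError
  total  # number of bytes copied
end

# Usage with in-memory IO objects standing in for the two sockets.
source = StringIO.new("\0" * 40_000)
sink   = StringIO.new
copied = forward(source, sink, debug: false)
puts "copied #{copied} bytes with debugging off"
```

With debug left at false, the loop does nothing but read and write, which is the configuration that corresponds to the faster run.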

Thursday, March 15, 2007

Ruby Trix

I've been having a number of problems in Ruby of late with error handling in threads. In particular, any time I encountered a problem in a thread, I was given no indication that something had gone wrong. It turns out that by default, an unhandled exception simply kills the current thread, and you don't even hear about it unless you issue a "join" on the thread that raised the exception. I suppose proper coding practice would be to handle my exceptions appropriately, but for debugging, it is nice to know when something has gone wrong. The simple snippet of code:

Thread.abort_on_exception = true

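takes care of this problem. With this modification, any unhandled exception in a thread is re-raised in the main thread, which (unless it is rescued there) terminates the program along with all running threads and yields readable error information.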
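
The difference is easy to see in a small experiment. This is only a sketch: the helper failure_seen_by_main is a made-up name, the sleep stands in for whatever the main thread would otherwise be doing, and the report_on_exception line is there because Ruby 2.5 and later print a message to standard error by default when a thread dies from an exception, which would otherwise clutter the demonstration.

```ruby
# Hypothetical helper: start a failing thread without joining it and
# return the error message that reaches the main thread, or nil if none does.
def failure_seen_by_main(abort:)
  previous = Thread.abort_on_exception
  Thread.abort_on_exception = abort
  Thread.new do
    Thread.current.report_on_exception = false  # quiet the Ruby 2.5+ stderr report
    raise ArgumentError, "boom"
  end
  sleep 0.2  # the main thread carries on and never calls join
  nil        # reached only if nothing was raised in the main thread
rescue ArgumentError => e
  e.message
ensure
  Thread.abort_on_exception = previous  # restore the global setting
end

puts "default:            #{failure_seen_by_main(abort: false).inspect}"
puts "abort_on_exception: #{failure_seen_by_main(abort: true).inspect}"
```

In the first call the thread dies quietly and the main thread never notices; in the second, the same exception arrives in the main thread, where this sketch rescues it instead of letting the program exit.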