Friday, January 26, 2007

Matz-san - buff that Ruby

Okay

When I started the new job, I was really excited about using Ruby. My boss is a quant and he really knows his stuff. He loves to program in SAS. SAS is a very expensive, subscription-based product. Any normal programmer would run away from SAS after seeing the syntax. Actually, it is more like 'JCL'. [ If you don't know JCL - good for you ]. He does not care if I program in Perl, Python, or Ruby. So I decided to use Ruby.

I digress a little bit. Back to our gem.

I am going to get into trouble for saying this but Ruby is slow. Don't get me wrong - I love the language. However, sometimes, it is so slow that it drives me nuts.

For example, 'Date'. Printing a 'Date' object is very expensive. VERY EXPENSIVE!

I started to profile, pored over the profile output, and found out that 'gcd' is called numerous times and it is very slow.

After struggling, I wrote a "C" function to deal with it. But I am still troubled by the fact that it is using a 'Rational' number to represent a 'Date'. It is beyond me why you would want to use a 'Rational' number for a date.
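For what it's worth, the 'Rational' side of 'Date' is visible from the public API. Here is a small sketch, assuming a current Ruby with the standard 'date' library (the variable names are made up for illustration, and none of this is taken from the Date source). It also shows that keeping a fraction in lowest terms means dividing by the gcd, which would be consistent with 'gcd' showing up all over the profile.

```ruby
require 'date'

# Illustration only; these lines are not from the Date source.
earlier = Date.new(2007, 1, 25)
later   = Date.new(2007, 1, 26)

# Subtracting two dates gives the difference in days as a Rational.
diff = later - earlier
puts diff.class        # Rational

# The astronomical Julian day is a Rational as well (it falls on a half day).
puts later.ajd.class   # Rational

# Keeping a fraction in lowest terms means dividing by the gcd.
reduced = Rational(6, 4)
puts reduced           # 3/2
puts 6.gcd(4)          # 2
```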

However, Date#to_s is still killing me. Digging further, I found out that #to_s calls "strftime".

Guess what strftime is doing - it recursively parses the format and prints out each individual component of the 'Date'. Nothing wrong with that, but you are parsing the format every single time you execute Date#to_s.

What - parsing '%Y-%m-%d' - recursively calling it. You've got to be kidding me!!!!

So one call to Date#to_s results in a scan of "%Y-%m-%d", which translates into three recursive function calls.

Someone please hand me a gun.

So I changed it - if you don't pass in any format, it will use 'sprintf'.

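class Date
  # Keep the original implementation around for explicit formats.
  alias old_strftime strftime

  # With no format, skip format parsing and build YYYY-MM-DD directly.
  def strftime(fmt = nil)
    if fmt.nil?
      sprintf("%.4d-%02d-%02d", year, mon, mday)
    else
      old_strftime(fmt)
    end
  end
end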


So now, Date#to_s is faster...
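To put a number on "faster", something like the following could be used. It is only a sketch: 'fast_iso_date' is a hypothetical helper name (not part of 'Date'), the iteration count is arbitrary, and the timings will depend heavily on the Ruby version and machine, so no results are claimed here.

```ruby
require 'date'
require 'benchmark'

# Hypothetical helper (not part of Date): formats a Date as YYYY-MM-DD
# with a single format call instead of going through strftime.
def fast_iso_date(date)
  format('%.4d-%02d-%02d', date.year, date.mon, date.mday)
end

date = Date.new(2007, 1, 26)
iterations = 100_000 # arbitrary; raise it if the timings are too noisy

# Print user/system/real time for each approach side by side.
Benchmark.bm(10) do |bm|
  bm.report('strftime') { iterations.times { date.strftime('%Y-%m-%d') } }
  bm.report('sprintf')  { iterations.times { fast_iso_date(date) } }
end
```

Both paths should produce the same string for a given date, so the comparison is only about how long each takes.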

If I have time, I am going to hack 'Date' and find out why it is using 'Rational'.

rant mode off.
