Retro-reading RailsSpace: Symbols vs. Strings

I’ve started reading RailsSpace by Michael Hartl and Aurelius Prochazka. This 600-page book is a Ruby on Rails tutorial that walks through building a social network, and it teaches both Ruby and Rails virtually from scratch.

I put off reading this book for a couple of reasons. It’s slightly outdated, and I’m not a beginner. But it’s a classic, and I was sure I’d pick up some new tricks along the way. I was right. I plan to keep reading and chronicle the things I learn.

Like most Ruby on Rails developers, I learned Ruby through Rails. While I’ve devoted more time to studying Ruby itself lately, it’s no surprise that my first lesson is Ruby-based. Symbols and Strings can often seem interchangeable, and convention tells us when it’s better to use one or the other. But now I know why.

Symbols are more efficient than strings, especially as hash keys. They are also much leaner: a Ruby Symbol has just 12 documented methods, most of which handle conversion to a different format (#to_i, #to_s, etc.), while String has more than 10 times as many.
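The method counts are easy to check on whatever Ruby you have installed. This is only a rough sketch: own_method_count is a helper name I made up, and the exact numbers depend on the Ruby version, so none are hard-coded here.

```ruby
# Count the instance methods a class defines itself.
# Passing false to instance_methods leaves out inherited methods.
def own_method_count(klass)
  klass.instance_methods(false).size
end

puts "Symbol: #{own_method_count(Symbol)} methods"
puts "String: #{own_method_count(String)} methods"
```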

Symbol comparison is particularly speedy because each symbol is stored once in a special internal table (nothing to do with a database), and every later use of the same name is looked up from there. Strings, on the other hand, compare character-by-character to determine if there is a match.

Let’s check it out in irb:

irb(main):001:0> "test".object_id
=> -605722338
irb(main):002:0> "test".object_id
=> -605736908
irb(main):003:0> :test.object_id
=> 83778
irb(main):004:0> :test.object_id
=> 83778

Aha! Two anonymous strings (not assigned to anything) are actually different objects, and their contents have to be compared to see if they’re equivalent. Calling :test, however, points to the same object no matter where it appears in your code.

So all Ruby has to do is verify if the two symbols being compared are, in fact, the same object. Comparing symbols is more like comparing integers than comparing strings, and that translates to speed in hash lookups, among other things.
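The difference between "same contents" and "same object" can be sketched with equal?, which tests object identity, and ==, which tests contents. The helper separate_strings is a name of my own; it builds the strings with String.new so they are guaranteed to be distinct objects regardless of how a given Ruby treats string literals.

```ruby
# Two strings with the same contents, built separately so they are distinct objects.
def separate_strings
  [String.new("test"), String.new("test")]
end

a, b = separate_strings
puts a == b        # => true  (contents match)
puts a.equal?(b)   # => false (two different objects)

# A symbol written twice, or built from a string, is always one and the same object.
puts :test.equal?(:test)           # => true
puts "test".to_sym.equal?(:test)   # => true
```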

If you’re really bored, you can read my unscientific test below.

Unscientific Test

I loaded one hash with 10,000 string keys (‘jaime0’ through ‘jaime9999’) and another hash with 10,000 symbol keys (:jaime0 through :jaime9999). Doing 3 million hash lookups took 14 cursor blinks on the hash with string keys. The same lookups took only 9.5 cursor blinks with symbol keys. See, I told you it wasn’t very scientific, but it did show lookups taking roughly a third less time with symbols.

Here’s the irb session:

# loading first hash with string keys #
string_hash = {}
10000.times {|i| string_hash["jaime#{i}"] = "jaime#{i}"}

# loading second hash with symbol keys #
symbol_hash = {}
10000.times {|i| symbol_hash[:"jaime#{i}"] = "jaime#{i}"}

# 3 million lookups on string-keyed hash: 14 cursor blinks #
1000000.times { string_hash['jaime1977']; string_hash['jaime429']; string_hash['jaime5000'] }

# 3 million lookups on symbol-keyed hash: 9.5 cursor blinks #
1000000.times { symbol_hash[:jaime1977]; symbol_hash[:jaime429]; symbol_hash[:jaime5000] }
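
Cursor blinks are hard to reproduce, so here is a rough sketch of the same experiment timed with a clock instead. The helper names build_hash and time_it are my own, and the results will vary with the machine and the Ruby version, so treat the output as a ballpark figure rather than a proper benchmark.

```ruby
# Build a hash with n entries; the block turns each index into a key.
def build_hash(n)
  hash = {}
  n.times { |i| hash[yield(i)] = "jaime#{i}" }
  hash
end

# Return the number of seconds the block takes, measured with a monotonic clock.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

string_hash = build_hash(10_000) { |i| "jaime#{i}" }
symbol_hash = build_hash(10_000) { |i| :"jaime#{i}" }

rounds = 1_000_000 # three lookups per round, as in the session above

string_time = time_it do
  rounds.times { string_hash['jaime1977']; string_hash['jaime429']; string_hash['jaime5000'] }
end

symbol_time = time_it do
  rounds.times { symbol_hash[:jaime1977]; symbol_hash[:jaime429]; symbol_hash[:jaime5000] }
end

puts format("string keys: %.3f s", string_time)
puts format("symbol keys: %.3f s", symbol_time)
```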
